
Inference in linear regression models with many covariates and heteroskedasticity

Author

Listed:
  • Matias Cattaneo

    (Institute for Fiscal Studies and University of Michigan)

  • Michael Jansson

    (Institute for Fiscal Studies and Berkeley)

  • Whitney K. Newey

    (Institute for Fiscal Studies and MIT)

Abstract

The linear regression model is widely used in empirical work in Economics, Statistics, and many other disciplines. Researchers often include many covariates in their linear model specification in an attempt to control for confounders. We give inference methods that allow for many covariates and heteroskedasticity. Our results are obtained using high-dimensional approximations, where the number of included covariates is allowed to grow as fast as the sample size. We find that all of the usual versions of Eicker-White heteroskedasticity-consistent standard error estimators for linear models are inconsistent under these asymptotics. We then propose a new heteroskedasticity-consistent standard error formula that is fully automatic and robust to both (conditional) heteroskedasticity of unknown form and the inclusion of possibly many covariates. We apply our findings to three settings: parametric linear models with many covariates, linear panel models with many fixed effects, and semiparametric semi-linear models with many technical regressors. Simulation evidence consistent with our theoretical results is also provided. The proposed methods are also illustrated with an empirical application.
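
To make the contrast in the abstract concrete, the following is a minimal sketch, not the authors' code. It computes the classical Eicker-White (HC0) standard error for one coefficient after partialling out many controls, alongside a many-covariate correction. The abstract does not state the proposed formula; the correction shown here, which reweights squared residuals by the inverse of the elementwise-squared annihilator matrix, is an assumption about its form and should be checked against the paper. The function names, the simulated design, and the values n = 200 and K = 40 are all hypothetical.

```python
import numpy as np

def partial_out(y, x, W):
    """OLS coefficient on scalar regressor x, controlling for the columns of W."""
    n = len(y)
    # Annihilator of the controls: M = I - W (W'W)^{-1} W'
    M = np.eye(n) - W @ np.linalg.solve(W.T @ W, W.T)
    v = M @ x                   # x with the controls partialled out
    beta = (v @ y) / (v @ v)    # Frisch-Waugh-Lovell estimate
    u = M @ (y - beta * x)      # OLS residuals from the full regression
    return beta, v, u, M

def sandwich_se(v, u, kappa):
    """Sandwich standard error with weight kappa[i, j] on v_i^2 * u_j^2."""
    meat = (v ** 2) @ kappa @ (u ** 2)
    return np.sqrt(meat) / (v @ v)

# Hypothetical simulated design: K controls with K/n = 0.2, heteroskedastic errors.
rng = np.random.default_rng(0)
n, K = 200, 40
W = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
x = W[:, 1] + rng.normal(size=n)
sigma = 0.5 + np.abs(x)         # error scale depends on x
y = 1.0 * x + W @ np.full(K, 0.1) + sigma * rng.normal(size=n)

beta, v, u, M = partial_out(y, x, W)

# HC0: identity weights, i.e. the usual Eicker-White estimator.
se_hc0 = sandwich_se(v, u, np.eye(n))

# Assumed many-covariate correction: weights (M * M)^{-1}, where * is the
# elementwise product. The inverse exists when every M[i, i] exceeds 1/2.
kappa_hck = np.linalg.inv(M * M)
se_hck = sandwich_se(v, u, kappa_hck)

print(f"beta = {beta:.3f}, HC0 se = {se_hc0:.3f}, corrected se = {se_hck:.3f}")
```

With few controls relative to the sample size, M is close to the identity and the two standard errors nearly coincide; the gap opens up as K/n grows, which is the regime the abstract addresses.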

Suggested Citation

  • Matias Cattaneo & Michael Jansson & Whitney K. Newey, 2017. "Inference in linear regression models with many covariates and heteroskedasticity," CeMMAP working papers CWP03/17, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
  • Handle: RePEc:ifs:cemmap:03/17

    Download full text from publisher

    File URL: https://www.ifs.org.uk/uploads/cemmap/wps/cwp031717.pdf
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Cattaneo, Matias D. & Jansson, Michael & Newey, Whitney K., 2018. "Alternative Asymptotics And The Partially Linear Model With Many Regressors," Econometric Theory, Cambridge University Press, vol. 34(2), pages 277-301, April.
    2. Matias D Cattaneo & Michael Jansson & Xinwei Ma, 2019. "Two-Step Estimation and Inference with Possibly Many Included Covariates," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 86(3), pages 1095-1122.
    3. Yang Ning & Sida Peng & Jing Tao, 2020. "Doubly Robust Semiparametric Difference-in-Differences Estimators with High-Dimensional Data," Papers 2009.03151, arXiv.org.
    4. Byunghoon Kang, 2018. "Inference in Nonparametric Series Estimation with Specification Searches for the Number of Series Terms," Working Papers 240829404, Lancaster University Management School, Economics Department.
    5. Dong, Chaohua & Gao, Jiti & Linton, Oliver, 2023. "High dimensional semiparametric moment restriction models," Journal of Econometrics, Elsevier, vol. 232(2), pages 320-345.
    6. Byunghoon Kang, 2019. "Inference in Nonparametric Series Estimation with Specification Searches for the Number of Series Terms," Papers 1909.12162, arXiv.org, revised Feb 2020.
    7. Jochmans, K., 2019. "Heteroskedasticity-Robust Inference in Linear Regression Models," Cambridge Working Papers in Economics 1957, Faculty of Economics, University of Cambridge.
    8. Qiu, Chen & Otsu, Taisuke, 2022. "Information theoretic approach to high dimensional multiplicative models: stochastic discount factor and treatment effect," LSE Research Online Documents on Economics 110494, London School of Economics and Political Science, LSE Library.
    9. Byunghoon Kang, 2017. "Inference in Nonparametric Series Estimation with Data-Dependent Undersmoothing," Working Papers 170712442, Lancaster University Management School, Economics Department.
    10. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    11. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP72/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    12. Kyle Colangelo & Ying-Ying Lee, 2019. "Double debiased machine learning nonparametric inference with continuous treatments," CeMMAP working papers CWP54/19, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    13. Richard H. Spady & Sami Stouli, 2018. "Simultaneous Mean-Variance Regression," Bristol Economics Discussion Papers 18/697, School of Economics, University of Bristol, UK.
    14. Pötscher, Benedikt M. & Preinerstorfer, David, 2021. "Valid Heteroskedasticity Robust Testing," MPRA Paper 107420, University Library of Munich, Germany.
    15. Robert A. Moffitt & Matthew V. Zahn, 2019. "The Marginal Labor Supply Disincentives of Welfare: Evidence from Administrative Barriers to Participation," NBER Working Papers 26028, National Bureau of Economic Research, Inc.
    16. Pötscher, Benedikt M. & Preinerstorfer, David, 2023. "How Reliable Are Bootstrap-Based Heteroskedasticity Robust Tests?," Econometric Theory, Cambridge University Press, vol. 39(4), pages 789-847, August.
    17. Kyle Colangelo & Ying-Ying Lee, 2020. "Double Debiased Machine Learning Nonparametric Inference with Continuous Treatments," Papers 2004.03036, arXiv.org, revised Sep 2023.
    18. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2011. "Inference on Treatment Effects After Selection Amongst High-Dimensional Controls," Papers 1201.0224, arXiv.org, revised May 2012.
    19. Romano, Joseph P. & Wolf, Michael, 2017. "Resurrecting weighted least squares," Journal of Econometrics, Elsevier, vol. 197(1), pages 1-19.
    20. Nan Liu & Yanbo Liu & Yuya Sasaki, 2024. "Estimation and Inference for Causal Functions with Multiway Clustered Data," Papers 2409.06654, arXiv.org.

    More about this item

    Keywords

    high-dimensional models; linear regression; many regressors; heteroskedasticity; standard errors