Printed from https://ideas.repec.org/p/rut/rutres/201610.html

The Factor-Lasso and K-Step Bootstrap Approach for Inference in High-Dimensional Economic Applications

Author

Listed:
  • Christian Hansen

    (Booth School of Business, University of Chicago)

  • Yuan Liao

    (Department of Economics, Rutgers University)

Abstract

We consider inference about coefficients on a small number of variables of interest in a linear panel data model with additive unobserved individual and time specific effects and a large number of additional time-varying confounding variables. We allow the number of these additional confounding variables to be larger than the sample size, and suppose that, in addition to unrestricted time and individual specific effects, these confounding variables are generated by a small number of common factors and high-dimensional weakly-dependent disturbances. We allow both the factors and the disturbances to be related to the outcome variable and to the other variables of interest. To make informative inference feasible, we assume that the contribution of the part of the confounding variables not captured by time specific effects, individual specific effects, or the common factors can be captured by a relatively small number of terms whose identities are unknown. Within this framework, we provide a convenient computational algorithm based on factor extraction followed by lasso regression for inference about parameters of interest and show that the resulting procedure has good asymptotic properties. We also provide a simple k-step bootstrap procedure that may be used to construct inferential statements about parameters of interest and prove its asymptotic validity. The proposed bootstrap may be of substantive independent interest outside the present context, as it may readily be adapted to other contexts involving inference after lasso variable selection, and the proof of its validity requires some new technical arguments. We also provide simulation evidence about the performance of our procedure and illustrate its use in two empirical applications.
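
The "factor extraction followed by lasso regression" idea in the abstract can be illustrated with a small simulation. The following is only a rough sketch under strong simplifying assumptions and should not be read as the authors' algorithm: it uses a single cross-section rather than a panel (so there are no individual or time effects), takes the number of factors as known, fixes the lasso penalty by hand (lam=0.3), and uses the union of the controls selected in the two lasso steps, in the style of post-double-selection. All function names (extract_factors, lasso_cd, residualize, factor_lasso, simulate) and the simulated design are hypothetical, and the k-step bootstrap is not shown.

```python
import numpy as np

def extract_factors(X, k):
    """PCA split of X (n x p) into k estimated factors and a residual part."""
    Xc = X - X.mean(axis=0)
    left, _, _ = np.linalg.svd(Xc, full_matrices=False)
    F_hat = np.sqrt(X.shape[0]) * left[:, :k]   # scaled so F_hat'F_hat / n = I
    loadings = Xc.T @ F_hat / X.shape[0]        # p x k least-squares loadings
    U_hat = Xc - F_hat @ loadings.T             # estimated disturbances
    return F_hat, U_hat

def lasso_cd(Z, v, lam, n_sweeps=100):
    """Coordinate descent for (1/2n)||v - Z b||^2 + lam * ||b||_1."""
    n, p = Z.shape
    b = np.zeros(p)
    col_sq = (Z ** 2).sum(axis=0) / n
    r = v.astype(float).copy()                  # residual for the current b
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            r += Z[:, j] * b[j]                 # remove coordinate j's contribution
            rho = Z[:, j] @ r / n
            b[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]  # soft-threshold
            r -= Z[:, j] * b[j]
    return b

def residualize(v, W):
    """Residual from the least-squares projection of v on the columns of W."""
    coef, *_ = np.linalg.lstsq(W, v, rcond=None)
    return v - W @ coef

def factor_lasso(y, d, X, k, lam):
    """Partial out estimated factors, then lasso-selected disturbances."""
    F_hat, U_hat = extract_factors(X, k)
    W = np.column_stack([np.ones(len(y)), F_hat])
    y_f, d_f = residualize(y, W), residualize(d, W)
    # Assumption: keep the union of controls selected for y and for d.
    keep = (lasso_cd(U_hat, y_f, lam) != 0) | (lasso_cd(U_hat, d_f, lam) != 0)
    if keep.any():
        y_f = residualize(y_f, U_hat[:, keep])
        d_f = residualize(d_f, U_hat[:, keep])
    return float(d_f @ y_f / (d_f @ d_f))       # residual-on-residual slope

def simulate(n=200, p=300, k=2, alpha=1.0, seed=0):
    """Hypothetical design: factors and three disturbances confound d and y."""
    rng = np.random.default_rng(seed)
    F = rng.normal(size=(n, k))
    U = rng.normal(size=(n, p))
    X = F @ rng.normal(size=(p, k)).T + U
    confound = F.sum(axis=1) + U[:, :3].sum(axis=1)
    d = confound + rng.normal(size=n)
    y = alpha * d + confound + rng.normal(size=n)
    return y, d, X

y, d, X = simulate()
alpha_hat = factor_lasso(y, d, X, k=2, lam=0.3)
dc, yc = d - d.mean(), y - y.mean()
alpha_naive = float(dc @ yc / (dc @ dc))        # ignores the confounders entirely
print(f"factor-lasso: {alpha_hat:.3f}, naive OLS: {alpha_naive:.3f}")
```

In this toy design the true coefficient is 1.0, and the naive regression of y on d alone is biased upward because both the factors and a few disturbances drive d and y. In practice the number of factors and the penalty level would be chosen by data-driven rules rather than fixed as they are here.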

Suggested Citation

  • Christian Hansen & Yuan Liao, 2016. "The Factor-Lasso and K-Step Bootstrap Approach for Inference in High-Dimensional Economic Applications," Departmental Working Papers 201610, Rutgers University, Department of Economics.
  • Handle: RePEc:rut:rutres:201610

    Download full text from publisher

    File URL: http://www.sas.rutgers.edu/virtual/snde/wp/2016-10.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Choi, In, 2012. "Efficient Estimation Of Factor Models," Econometric Theory, Cambridge University Press, vol. 28(2), pages 274-308, April.
    2. Eric Gautier & Alexandre Tsybakov, 2011. "High-Dimensional Instrumental Variables Regression and Confidence Sets," Working Papers 2011-13, Center for Research in Economics and Statistics.
    3. Alexandre Belloni & Victor Chernozhukov & Ivan Fernandez-Val & Christian Hansen, 2013. "Program evaluation with high-dimensional data," CeMMAP working papers CWP77/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    4. Alexandre Belloni & Victor Chernozhukov & Christian Hansen & Damian Kozbur, 2016. "Inference in High-Dimensional Panel Models With an Application to Gun Control," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 590-605, October.
    5. Jushan Bai & Serena Ng, 2002. "Determining the Number of Factors in Approximate Factor Models," Econometrica, Econometric Society, vol. 70(1), pages 191-221, January.
    6. Stéphane Bonhomme & Elena Manresa, 2015. "Grouped Patterns of Heterogeneity in Panel Data," Econometrica, Econometric Society, vol. 83(3), pages 1147-1184, May.
    7. Su, Liangjun & Chen, Qihui, 2013. "Testing Homogeneity In Panel Data Models With Interactive Fixed Effects," Econometric Theory, Cambridge University Press, vol. 29(6), pages 1079-1135, December.
    8. Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2013. "Uniform post selection inference for LAD regression and other z-estimation problems," CeMMAP working papers CWP74/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    9. Hyungsik Roger Moon & Martin Weidner, 2015. "Linear Regression for Panel With Unknown Number of Factors as Interactive Fixed Effects," Econometrica, Econometric Society, vol. 83(4), pages 1543-1579, July.
    10. Chernozhukov, Victor & Hansen, Christian, 2008. "The reduced form: A simple approach to inference with weak instruments," Economics Letters, Elsevier, vol. 100(1), pages 68-71, July.
    11. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    12. Donald W. K. Andrews, 2002. "Higher-Order Improvements of a Computationally Attractive "k"-Step Bootstrap for Extremum Estimators," Econometrica, Econometric Society, vol. 70(1), pages 119-162, January.
    13. Seung C. Ahn & Alex R. Horenstein, 2013. "Eigenvalue Ratio Test for the Number of Factors," Econometrica, Econometric Society, vol. 81(3), pages 1203-1227, May.
    14. Alexandre Belloni & Victor Chernozhukov & Ying Wei, 2013. "Honest confidence regions for a regression parameter in logistic regression with a large number of controls," CeMMAP working papers CWP67/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    15. A. Belloni & D. Chen & V. Chernozhukov & C. Hansen, 2012. "Sparse Models and Methods for Optimal Instruments With an Application to Eminent Domain," Econometrica, Econometric Society, vol. 80(6), pages 2369-2429, November.
    16. Hansen, Christian B., 2007. "Asymptotic properties of a robust variance matrix estimator for panel data when T is large," Journal of Econometrics, Elsevier, vol. 141(2), pages 597-620, December.
    17. Damian Kozbur, 2017. "Testing-Based Forward Model Selection," American Economic Review, American Economic Association, vol. 107(5), pages 266-269, May.
    18. NESTEROV, Yu., 2007. "Gradient methods for minimizing composite objective function," LIDAM Discussion Papers CORE 2007076, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
    19. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "Inference on Treatment Effects after Selection among High-Dimensional Controls," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 81(2), pages 608-650.
    20. Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2013. "Uniform post selection inference for LAD regression models," CeMMAP working papers CWP24/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    21. Victor Chernozhukov & Denis Chetverikov & Kengo Kato, 2012. "Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors," Papers 1212.6906, arXiv.org, revised Jan 2018.
    22. Marianne Bertrand & Esther Duflo & Sendhil Mullainathan, 2004. "How Much Should We Trust Differences-In-Differences Estimates?," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 119(1), pages 249-275.
    23. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey, 2016. "Double machine learning for treatment and causal parameters," CeMMAP working papers CWP49/16, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    24. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    25. Jushan Bai, 2003. "Inferential Theory for Factor Models of Large Dimensions," Econometrica, Econometric Society, vol. 71(1), pages 135-171, January.
    26. Jushan Bai & Serena Ng, 2006. "Confidence Intervals for Diffusion Index Forecasts and Inference for Factor-Augmented Regressions," Econometrica, Econometric Society, vol. 74(4), pages 1133-1150, July.
    27. Daron Acemoglu & Simon Johnson & James A. Robinson, 2001. "The Colonial Origins of Comparative Development: An Empirical Investigation," American Economic Review, American Economic Association, vol. 91(5), pages 1369-1401, December.
    28. M. Hashem Pesaran, 2006. "Estimation and Inference in Large Heterogeneous Panels with a Multifactor Error Structure," Econometrica, Econometric Society, vol. 74(4), pages 967-1012, July.
    29. Cook, Philip J. & Ludwig, Jens, 2006. "The social costs of gun ownership," Journal of Public Economics, Elsevier, vol. 90(1-2), pages 379-391, January.
    30. P. Richard Hahn & Carlos M. Carvalho & Sayan Mukherjee, 2013. "Partial Factor Modeling: Predictor-Dependent Shrinkage for Linear Regression," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 108(503), pages 999-1008, September.
    31. Arellano, M, 1987. "Computing Robust Standard Errors for Within-Groups Estimators," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 49(4), pages 431-434, November.
    32. Chatterjee, A. & Lahiri, S. N., 2011. "Bootstrapping Lasso Estimators," Journal of the American Statistical Association, American Statistical Association, vol. 106(494), pages 608-625.
    33. Jushan Bai, 2009. "Panel Data Models With Interactive Fixed Effects," Econometrica, Econometric Society, vol. 77(4), pages 1229-1279, July.
    34. Stock J.H. & Watson M.W., 2002. "Forecasting Using Principal Components From a Large Number of Predictors," Journal of the American Statistical Association, American Statistical Association, vol. 97, pages 1167-1179, December.
    35. Cun-Hui Zhang & Stephanie S. Zhang, 2014. "Confidence intervals for low dimensional parameters in high dimensional linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(1), pages 217-242, January.

    Citations

    Cited by:

    1. Simon Freyaldenhoven & Christian Hansen & Jesse M. Shapiro, 2019. "Pre-event Trends in the Panel Event-Study Design," American Economic Review, American Economic Association, vol. 109(9), pages 3307-3338, September.
    2. Jad Beyhum & Jonas Striaukas, 2023. "Factor-augmented sparse MIDAS regressions with an application to nowcasting," Papers 2306.13362, arXiv.org, revised Nov 2024.
    3. Philippe Goulet Coulombe, 2021. "The Macroeconomy as a Random Forest," Working Papers 21-05, Chair in macroeconomics and forecasting, University of Quebec in Montreal's School of Management.
    4. Victor Chernozhukov & Kaspar Wüthrich & Yinchu Zhu, 2021. "An Exact and Robust Conformal Inference Method for Counterfactual and Synthetic Controls," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 116(536), pages 1849-1864, October.
    5. Harold D. Chiang & Kengo Kato & Yukun Ma & Yuya Sasaki, 2022. "Multiway Cluster Robust Double/Debiased Machine Learning," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 40(3), pages 1046-1056, June.
    6. Vogt, M. & Walsh, C. & Linton, O., 2022. "CCE Estimation of High-Dimensional Panel Data Models with Interactive Fixed Effects," Cambridge Working Papers in Economics 2242, Faculty of Economics, University of Cambridge.
    7. Smeekes, Stephan & Wijler, Etienne, 2018. "Macroeconomic forecasting using penalized regression methods," International Journal of Forecasting, Elsevier, vol. 34(3), pages 408-430.
    8. Philippe Goulet Coulombe, 2020. "The Macroeconomy as a Random Forest," Papers 2006.12724, arXiv.org, revised Mar 2021.
    9. Michael Vogt & Christopher Walsh & Oliver Linton, 2022. "CCE Estimation of High-Dimensional Panel Data Models with Interactive Fixed Effects," Papers 2206.12152, arXiv.org.
    10. Vogt, M. & Walsh, C. & Linton, O., 2022. "CCE Estimation of High-Dimensional Panel Data Models with Interactive Fixed Effects," Janeway Institute Working Papers 2218, Faculty of Economics, University of Cambridge.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bai, Jushan & Liao, Yuan, 2017. "Inferences in panel data with interactive effects using large covariance matrices," Journal of Econometrics, Elsevier, vol. 200(1), pages 59-78.
    2. Yoshimasa Uematsu & Takashi Yamagata, 2019. "Estimation of Weak Factor Models," DSSR Discussion Papers 96, Graduate School of Economics and Management, Tohoku University.
    3. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    4. Lu, Xun & Su, Liangjun, 2016. "Shrinkage estimation of dynamic panel data models with interactive fixed effects," Journal of Econometrics, Elsevier, vol. 190(1), pages 148-175.
    5. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    6. Victor Chernozhukov & Christian Hansen & Martin Spindler, 2015. "Post-Selection and Post-Regularization Inference in Linear Models with Many Controls and Instruments," American Economic Review, American Economic Association, vol. 105(5), pages 486-490, May.
    7. Damian Kozbur, 2017. "Testing-Based Forward Model Selection," American Economic Review, American Economic Association, vol. 107(5), pages 266-269, May.
    8. Stéphane Bonhomme & Elena Manresa, 2015. "Grouped Patterns of Heterogeneity in Panel Data," Econometrica, Econometric Society, vol. 83(3), pages 1147-1184, May.
    9. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2016. "Double/Debiased Machine Learning for Treatment and Causal Parameters," Papers 1608.00060, arXiv.org, revised Nov 2024.
    10. Joakim Westerlund, 2020. "A cross‐section average‐based principal components approach for fixed‐T panels," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 35(6), pages 776-785, September.
    11. Christian Hansen & Damian Kozbur & Sanjog Misra, 2016. "Targeted undersmoothing," ECON - Working Papers 282, Department of Economics - University of Zurich, revised Apr 2018.
    12. Fan, Jianqing & Jiang, Bai & Sun, Qiang, 2022. "Bayesian factor-adjusted sparse regression," Journal of Econometrics, Elsevier, vol. 230(1), pages 3-19.
    13. Alexandre Belloni & Victor Chernozhukov & Christian Hansen & Damian Kozbur, 2016. "Inference in High-Dimensional Panel Models With an Application to Gun Control," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 34(4), pages 590-605, October.
    14. Guowei Cui & Milda Norkutė & Vasilis Sarafidis & Takashi Yamagata, 2022. "Two-stage instrumental variable estimation of linear panel data models with interactive effects [Eigenvalue ratio test for the number of factors]," The Econometrics Journal, Royal Economic Society, vol. 25(2), pages 340-361.
    15. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "High-Dimensional Methods and Inference on Structural and Treatment Effects," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 29-50, Spring.
    16. Dmitry Arkhangelsky & Guido Imbens, 2023. "Causal Models for Longitudinal and Panel Data: A Survey," Papers 2311.15458, arXiv.org, revised Jun 2024.
    17. Georg Keilbar & Juan M. Rodriguez-Poo & Alexandra Soberon & Weining Wang, 2022. "A semiparametric approach for interactive fixed effects panel data models," Papers 2201.11482, arXiv.org, revised Mar 2023.
    18. Guowei Cui & Kazuhiko Hayakawa & Shuichi Nagata & Takashi Yamagata, 2018. "A robust approach to heteroskedasticity, error serial correlation and slope heterogeneity for large linear panel data models with interactive effects," ISER Discussion Paper 1037r, Institute of Social and Economic Research, Osaka University, revised Jun 2019.
    19. Liang Chen & Juan J. Dolado & Jesús Gonzalo, 2021. "Quantile Factor Models," Econometrica, Econometric Society, vol. 89(2), pages 875-910, March.
    20. Wei Shi & Lung-fei Lee, 2018. "The effects of gun control on crimes: a spatial interactive fixed effects approach," Empirical Economics, Springer, vol. 55(1), pages 233-263, August.

    More about this item

    Keywords

    treatment effects; panel data

    JEL classification:

    • C33 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Models with Panel Data; Spatio-temporal Models
