Printed from https://ideas.repec.org/p/arx/papers/1912.12867.html

Adaptive Discrete Smoothing for High-Dimensional and Nonlinear Panel Data

Author

Listed:
  • Xi Chen
  • Ye Luo
  • Martin Spindler

Abstract

In this paper we develop a data-driven smoothing technique for high-dimensional and nonlinear panel data models. We allow for individual-specific (nonlinear) functions and estimate them with econometric or machine learning methods, using weighted observations from other individuals. The weights are determined in a data-driven way: they depend on the similarity between the corresponding functions, measured from initial estimates. The key feature of the procedure is that it clusters individuals based on the distance or similarity between them, estimated in a first stage. Our estimation method can be combined with various statistical estimation procedures, in particular modern machine learning methods, which are especially fruitful in the high-dimensional case and with complex, heterogeneous data. The approach can be interpreted as "soft clustering", in contrast to traditional "hard clustering", which assigns each individual to exactly one group. We conduct a simulation study which shows that prediction can be greatly improved by using our estimator. Finally, we use a large data set from didichuxing.com, a leading company in the transportation industry, to analyze and predict the gap between supply and demand based on a large set of covariates. Our estimator clearly performs much better in out-of-sample prediction than existing linear panel data estimators.
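The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' estimator: the abstract does not specify the weight function or the first-stage learner, so the sketch assumes a linear model per individual, ordinary least squares in the first stage, and Gaussian-kernel weights with a hypothetical bandwidth `h`. The function names (`simulate`, `first_stage`, `similarity_weights`, `smoothed_fit`) are illustrative only.

```python
import numpy as np


def simulate(n=40, t=10, p=3, seed=0):
    """Toy panel: n individuals, t periods, p covariates, two latent groups."""
    rng = np.random.default_rng(seed)
    group = np.arange(n) % 2
    # Assumption: group 0 has coefficients +1, group 1 has coefficients -1.
    beta = np.where(group[:, None] == 0, 1.0, -1.0) * np.ones((n, p))
    X = rng.normal(size=(n, t, p))
    y = np.einsum("ntp,np->nt", X, beta) + rng.normal(size=(n, t))
    return X, y, beta


def first_stage(X, y):
    """Initial estimates: separate OLS for each individual. Returns (n, p)."""
    return np.stack(
        [np.linalg.lstsq(X[i], y[i], rcond=None)[0] for i in range(X.shape[0])]
    )


def similarity_weights(beta_init, h):
    """Assumed Gaussian-kernel weights on pairwise distances; rows sum to 1."""
    diff = beta_init[:, None, :] - beta_init[None, :, :]
    d2 = (diff ** 2).sum(axis=2)
    w = np.exp(-d2 / (2.0 * h ** 2))
    return w / w.sum(axis=1, keepdims=True)


def smoothed_fit(X, y, w):
    """Second stage: for individual i, weighted least squares on the pooled
    data, where every observation of individual j carries weight w[i, j]."""
    n, t, p = X.shape
    X_flat = X.reshape(n * t, p)
    y_flat = y.reshape(n * t)
    out = np.empty((n, p))
    for i in range(n):
        # Scaling rows by sqrt(weight) turns weighted LS into plain LS.
        sw = np.sqrt(np.repeat(w[i], t))
        out[i] = np.linalg.lstsq(X_flat * sw[:, None], y_flat * sw, rcond=None)[0]
    return out


if __name__ == "__main__":
    X, y, beta_true = simulate()
    b0 = first_stage(X, y)
    w = similarity_weights(b0, h=1.0)
    b1 = smoothed_fit(X, y, w)
    print("first-stage MSE:", np.mean((b0 - beta_true) ** 2))
    print("smoothed MSE:   ", np.mean((b1 - beta_true) ** 2))
```

Under these assumptions the bandwidth governs how "soft" the clustering is: as `h` shrinks toward zero each individual's weights concentrate on its own observations, recovering the individual-specific estimate, while a very large `h` gives near-uniform weights and approaches a fully pooled estimate. Hard clustering would correspond to weights that are constant within a group and zero across groups.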

Suggested Citation

  • Xi Chen & Ye Luo & Martin Spindler, 2019. "Adaptive Discrete Smoothing for High-Dimensional and Nonlinear Panel Data," Papers 1912.12867, arXiv.org, revised Jan 2020.
  • Handle: RePEc:arx:papers:1912.12867

    Download full text from publisher

    File URL: http://arxiv.org/pdf/1912.12867
    File Function: Latest version
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Chiang, Harold D. & Rodrigue, Joel & Sasaki, Yuya, 2023. "Post-Selection Inference In Three-Dimensional Panel Data," Econometric Theory, Cambridge University Press, vol. 39(3), pages 623-658, June.
    2. Harold D. Chiang, 2018. "Many Average Partial Effects: with An Application to Text Regression," Papers 1812.09397, arXiv.org, revised Jan 2022.
    3. Oliver Linton & Maximilian Ruecker & Michael Vogt & Christopher Walsh, 2022. "Estimation and Inference in High-Dimensional Panel Data Models with Interactive Fixed Effects," Papers 2206.12152, arXiv.org, revised Nov 2024.
    4. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    5. Alexandre Belloni & Victor Chernozhukov & Ivan Fernandez-Val & Christian Hansen, 2013. "Program evaluation with high-dimensional data," CeMMAP working papers CWP77/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    6. Achim Ahrens & Christian B. Hansen & Mark E. Schaffer, 2020. "lassopack: Model selection and prediction with regularized regression in Stata," Stata Journal, StataCorp LP, vol. 20(1), pages 176-235, March.
    7. Okui, Ryo & Wang, Wendun, 2021. "Heterogeneous structural breaks in panel data models," Journal of Econometrics, Elsevier, vol. 220(2), pages 447-473.
    8. Harold D. Chiang & Kengo Kato & Yukun Ma & Yuya Sasaki, 2022. "Multiway Cluster Robust Double/Debiased Machine Learning," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 40(3), pages 1046-1056, June.
    9. Babii, Andrii & Ball, Ryan T. & Ghysels, Eric & Striaukas, Jonas, 2023. "Machine learning panel data regressions with heavy-tailed dependent data: Theory and application," Journal of Econometrics, Elsevier, vol. 237(2).
    10. A. Belloni & V. Chernozhukov & I. Fernández‐Val & C. Hansen, 2017. "Program Evaluation and Causal Inference With High‐Dimensional Data," Econometrica, Econometric Society, vol. 85, pages 233-298, January.
    11. Ruiqi Liu & Ben Boukai & Zuofeng Shang, 2019. "Statistical Inference on Partially Linear Panel Model under Unobserved Linearity," Papers 1911.08830, arXiv.org.
    12. Mert Hakan Hekimoğlu & Burak Kazaz, 2020. "Analytics for Wine Futures: Realistic Prices," Production and Operations Management, Production and Operations Management Society, vol. 29(9), pages 2096-2120, September.
    13. Guo, Zijian & Kang, Hyunseung & Cai, T. Tony & Small, Dylan S., 2018. "Testing endogeneity with high dimensional covariates," Journal of Econometrics, Elsevier, vol. 207(1), pages 175-187.
    14. Kock, Anders Bredahl, 2016. "Oracle inequalities, variable selection and uniform inference in high-dimensional correlated random effects panel data models," Journal of Econometrics, Elsevier, vol. 195(1), pages 71-85.
    15. Anders Bredahl Kock & Haihan Tang, 2014. "Inference in High-dimensional Dynamic Panel Data Models," CREATES Research Papers 2014-58, Department of Economics and Business Economics, Aarhus University.
    16. Achim Ahrens, 2015. "Civil conflicts in Africa: Climate, economic shocks, nighttime lights and spill-over effects," SEEC Discussion Papers 1501, Spatial Economics and Econometrics Centre, Heriot Watt University.
    17. Lamarche, Carlos & Parker, Thomas, 2023. "Wild bootstrap inference for penalized quantile regression for longitudinal data," Journal of Econometrics, Elsevier, vol. 235(2), pages 1799-1826.
    18. Zhentao Shi & Liangjun Su & Tian Xie, 2020. "L2-Relaxation: With Applications to Forecast Combination and Portfolio Analysis," Papers 2010.09477, arXiv.org, revised Aug 2022.
    19. Vogt, M. & Walsh, C. & Linton, O., 2022. "CCE Estimation of High-Dimensional Panel Data Models with Interactive Fixed Effects," Janeway Institute Working Papers 2218, Faculty of Economics, University of Cambridge.
    20. Kock, Anders Bredahl & Callot, Laurent, 2015. "Oracle inequalities for high dimensional vector autoregressions," Journal of Econometrics, Elsevier, vol. 186(2), pages 325-344.
