IDEAS home Printed from https://ideas.repec.org/p/ehl/lserod/103043.html
   My bibliography  Save this paper

Statistical inference for high-dimensional models via recursive online-score estimation

Author

Listed:
  • Shi, Chengchun
  • Song, Rui
  • Lu, Wenbin
  • Li, Runzi

Abstract

In this article, we develop a new estimation and valid inference method for single or low-dimensional regression coefficients in high-dimensional generalized linear models. The number of the predictors is allowed to grow exponentially fast with respect to the sample size. The proposed estimator is computed by solving a score function. We recursively conduct model selection to reduce the dimensionality from high to a moderate scale and construct the score equation based on the selected variables. The proposed confidence interval (CI) achieves valid coverage without assuming consistency of the model selection procedure. When the selection consistency is achieved, we show the length of the proposed CI is asymptotically the same as the CI of the “oracle” method which works as well as if the support of the control variables were known. In addition, we prove the proposed CI is asymptotically narrower than the CIs constructed based on the desparsified Lasso estimator and the decorrelated score statistic. Simulation studies and real data applications are presented to back up our theoretical findings. Supplementary materials for this article are available online.

Suggested Citation

  • Shi, Chengchun & Song, Rui & Lu, Wenbin & Li, Runzi, 2020. "Statistical inference for high-dimensional models via recursive online-score estimation," LSE Research Online Documents on Economics 103043, London School of Economics and Political Science, LSE Library.
  • Handle: RePEc:ehl:lserod:103043
    as

    Download full text from publisher

    File URL: http://eprints.lse.ac.uk/103043/
    File Function: Open access version.
    Download Restriction: no
    ---><---

    References listed on IDEAS

    as
    1. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    2. Jianqing Fan & Jinchi Lv, 2008. "Sure independence screening for ultrahigh dimensional feature space," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(5), pages 849-911, November.
    3. Peter Bühlmann & Jacopo Mandozzi, 2014. "High-dimensional variable screening and bias in subsequent inference, with an empirical comparison," Computational Statistics, Springer, vol. 29(3), pages 407-430, June.
    4. Cun-Hui Zhang & Stephanie S. Zhang, 2014. "Confidence intervals for low dimensional parameters in high dimensional linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(1), pages 217-242, January.
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.
    as


    Cited by:

    1. Shi, Chengchun & Li, Lexin, 2022. "Testing mediation effects using logic of Boolean matrices," LSE Research Online Documents on Economics 108881, London School of Economics and Political Science, LSE Library.
    2. Alan Agresti & Sabrina Giordano & Anna Gottard, 2022. "A Review of Score-Test-Based Inference for Categorical Data," Journal of Quantitative Economics, Springer;The Indian Econometric Society (TIES), vol. 20(1), pages 31-48, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jingxuan Luo & Lili Yue & Gaorong Li, 2023. "Overview of High-Dimensional Measurement Error Regression Models," Mathematics, MDPI, vol. 11(14), pages 1-22, July.
    2. Chun-Xia Zhang & Jiang-She Zhang & Sang-Woon Kim, 2016. "PBoostGA: pseudo-boosting genetic algorithm for variable ranking and selection," Computational Statistics, Springer, vol. 31(4), pages 1237-1262, December.
    3. Xue Wu & Chixiang Chen & Zheng Li & Lijun Zhang & Vernon M. Chinchilli & Ming Wang, 2024. "A three-stage approach to identify biomarker signatures for cancer genetic data with survival endpoints," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 33(3), pages 863-883, July.
    4. Kock, Anders Bredahl, 2016. "Oracle inequalities, variable selection and uniform inference in high-dimensional correlated random effects panel data models," Journal of Econometrics, Elsevier, vol. 195(1), pages 71-85.
    5. Xiaoyi Zhu & Yuhong Yang, 2015. "Variable selection after screening: with or without data splitting?," Computational Statistics, Springer, vol. 30(1), pages 191-203, March.
    6. Caner, Mehmet & Kock, Anders Bredahl, 2018. "Asymptotically honest confidence regions for high dimensional parameters by the desparsified conservative Lasso," Journal of Econometrics, Elsevier, vol. 203(1), pages 143-168.
    7. Xiaorui Zhu & Yichen Qin & Peng Wang, 2023. "Sparsified Simultaneous Confidence Intervals for High-Dimensional Linear Models," Papers 2307.07574, arXiv.org.
    8. Shi, Chengchun & Li, Lexin, 2022. "Testing mediation effects using logic of Boolean matrices," LSE Research Online Documents on Economics 108881, London School of Economics and Political Science, LSE Library.
    9. Adel Javanmard & Jason D. Lee, 2020. "A flexible framework for hypothesis testing in high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(3), pages 685-718, July.
    10. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    11. Meng An & Haixiang Zhang, 2023. "High-Dimensional Mediation Analysis for Time-to-Event Outcomes with Additive Hazards Model," Mathematics, MDPI, vol. 11(24), pages 1-11, December.
    12. Chenchuan (Mark) Li & Ulrich K. Müller, 2021. "Linear regression with many controls of limited explanatory power," Quantitative Economics, Econometric Society, vol. 12(2), pages 405-442, May.
    13. Shuichi Kawano, 2014. "Selection of tuning parameters in bridge regression models via Bayesian information criterion," Statistical Papers, Springer, vol. 55(4), pages 1207-1223, November.
    14. Zhaoyu Xing & Yang Wan & Juan Wen & Wei Zhong, 2024. "GOLFS: feature selection via combining both global and local information for high dimensional clustering," Computational Statistics, Springer, vol. 39(5), pages 2651-2675, July.
    15. Guo, Xu & Li, Runze & Liu, Jingyuan & Zeng, Mudong, 2023. "Statistical inference for linear mediation models with high-dimensional mediators and application to studying stock reaction to COVID-19 pandemic," Journal of Econometrics, Elsevier, vol. 235(1), pages 166-179.
    16. Shan Luo & Zehua Chen, 2014. "Sequential Lasso Cum EBIC for Feature Selection With Ultra-High Dimensional Feature Space," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(507), pages 1229-1240, September.
    17. Shi Chen & Wolfgang Karl Hardle & Brenda L'opez Cabrera, 2020. "Regularization Approach for Network Modeling of German Power Derivative Market," Papers 2009.09739, arXiv.org.
    18. Wang, Christina Dan & Chen, Zhao & Lian, Yimin & Chen, Min, 2022. "Asset selection based on high frequency Sharpe ratio," Journal of Econometrics, Elsevier, vol. 227(1), pages 168-188.
    19. Laurent Ferrara & Anna Simoni, 2023. "When are Google Data Useful to Nowcast GDP? An Approach via Preselection and Shrinkage," Journal of Business & Economic Statistics, Taylor & Francis Journals, vol. 41(4), pages 1188-1202, October.
    20. Toshio Honda, 2021. "The de-biased group Lasso estimation for varying coefficient models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(1), pages 3-29, February.

    More about this item

    Keywords

    confidence interval; Ultrahigh dimensions; Generalized linear models; online estimations; Online estimation; Confidence interval;
    All these keywords.

    JEL classification:

    • C1 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General

    NEP fields

    This paper has been announced in the following NEP Reports:

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:ehl:lserod:103043. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: LSERO Manager (email available below). General contact details of provider: https://edirc.repec.org/data/lsepsuk.html .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.