
Asymptotic Normality in Linear Regression with Approximately Sparse Structure

Author

Listed:
  • Saulius Jokubaitis

    (Faculty of Mathematics and Informatics, Institute of Applied Mathematics, Vilnius University, Naugarduko 24, LT-03225 Vilnius, Lithuania; these authors contributed equally to this work)

  • Remigijus Leipus

    (Faculty of Mathematics and Informatics, Institute of Applied Mathematics, Vilnius University, Naugarduko 24, LT-03225 Vilnius, Lithuania; these authors contributed equally to this work)

Abstract

In this paper, we study asymptotic normality in high-dimensional linear regression. We focus on the case where the covariance matrix of the regression variables has a Kac-Murdock-Szegő (KMS) structure, in asymptotic settings where the number of predictors, $p$, is proportional to the number of observations, $n$. The main result of the paper is the derivation of the exact asymptotic distribution of the suitably centered and normalized squared norm of the product between the predictor matrix, $X$, and the outcome variable, $Y$, i.e., the statistic $\|X'Y\|_2^2$, under rather unrestrictive assumptions on the model parameters $\beta_j$. We employ the variance-gamma distribution to derive the results, which, along with the asymptotic results, allows us to easily define the exact distribution of the statistic. Additionally, we consider a specific case of approximate sparsity of the model parameter vector $\beta$ and perform a Monte Carlo simulation study. The simulation results suggest that the statistic approaches the limiting distribution fairly quickly even under strong correlation among the predictors and a relatively small number of observations, suggesting possible applications to the construction of statistical testing procedures for real-world data and related problems.
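
To make the simulation setting in the abstract concrete, the following is a minimal Monte Carlo sketch and not the authors' code. It assumes Gaussian predictors with KMS covariance Cov(x_i, x_j) = rho^|i-j| (generated through a stationary AR(1) recursion), a hypothetical approximately sparse coefficient vector beta_j = 1/j^2, and Gaussian noise. The paper's analytic centering and normalization are not reproduced here; the draws of ||X'Y||_2^2 are standardized by their empirical mean and standard deviation instead. All function names and default values are illustrative assumptions.

```python
import math
import random
import statistics


def kms_row(p, rho, rng):
    """Draw one Gaussian row with KMS covariance: Cov(x_i, x_j) = rho**|i-j|."""
    scale = math.sqrt(1.0 - rho * rho)
    row = [rng.gauss(0.0, 1.0)]
    for _ in range(p - 1):
        # Stationary AR(1) step: keeps unit variance and rho**|i-j| correlation.
        row.append(rho * row[-1] + scale * rng.gauss(0.0, 1.0))
    return row


def squared_norm_xty(n, p, rho, beta, sigma, rng):
    """Simulate Y = X beta + eps for n observations and return ||X'Y||_2^2."""
    xty = [0.0] * p
    for _ in range(n):
        x = kms_row(p, rho, rng)
        y = sum(b * xj for b, xj in zip(beta, x)) + rng.gauss(0.0, sigma)
        for j in range(p):
            xty[j] += x[j] * y
    return sum(v * v for v in xty)


def simulate(n=60, p=30, rho=0.5, sigma=1.0, reps=200, seed=0):
    """Return empirically standardized Monte Carlo draws of ||X'Y||_2^2."""
    rng = random.Random(seed)
    # Hypothetical approximately sparse beta: decaying, but no entry exactly zero.
    beta = [1.0 / (j + 1) ** 2 for j in range(p)]
    draws = [squared_norm_xty(n, p, rho, beta, sigma, rng) for _ in range(reps)]
    mean = statistics.fmean(draws)
    sd = statistics.stdev(draws)
    # Empirical standardization stands in for the paper's analytic constants.
    return [(d - mean) / sd for d in draws]


if __name__ == "__main__":
    z = simulate()
    inside = sum(abs(v) <= 1.96 for v in z) / len(z)
    print(f"share of standardized draws within +/-1.96: {inside:.3f}")
```

If the statistic is close to normal at these sample sizes, the printed share should be roughly 0.95. Varying `rho`, `n`, and `p` (with `p/n` held fixed) gives a rough feel for how quickly the approximation improves.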

Suggested Citation

  • Saulius Jokubaitis & Remigijus Leipus, 2022. "Asymptotic Normality in Linear Regression with Approximately Sparse Structure," Mathematics, MDPI, vol. 10(10), pages 1-28, May.
  • Handle: RePEc:gam:jmathe:v:10:y:2022:i:10:p:1657-:d:814173

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/10/10/1657/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/10/10/1657/
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Anders Bredahl Kock & Haihan Tang, 2014. "Inference in High-dimensional Dynamic Panel Data Models," CREATES Research Papers 2014-58, Department of Economics and Business Economics, Aarhus University.
    2. Alain Hecq & Luca Margaritella & Stephan Smeekes, 2023. "Granger Causality Testing in High-Dimensional VARs: A Post-Double-Selection Procedure," Journal of Financial Econometrics, Oxford University Press, vol. 21(3), pages 915-958.
    3. Zhan Gao & Ji Hyung Lee & Ziwei Mei & Zhentao Shi, 2024. "Econometric Inference for High Dimensional Predictive Regressions," Papers 2409.10030, arXiv.org, revised Nov 2024.
    4. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-dimensional econometrics and regularized GMM," CeMMAP working papers CWP35/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    5. Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2019. "Valid Post-Selection Inference in High-Dimensional Approximately Sparse Quantile Regression Models," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 114(526), pages 749-758, April.
    6. Alexandre Belloni & Victor Chernozhukov & Ivan Fernandez-Val & Christian Hansen, 2013. "Program evaluation with high-dimensional data," CeMMAP working papers CWP77/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    7. Toshio Honda, 2021. "The de-biased group Lasso estimation for varying coefficient models," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 73(1), pages 3-29, February.
    8. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey, 2016. "Double machine learning for treatment and causal parameters," CeMMAP working papers 49/16, Institute for Fiscal Studies.
    9. Agboola, Oluwagbenga David & Yu, Han, 2023. "Neighborhood-based cross fitting approach to treatment effects with high-dimensional data," Computational Statistics & Data Analysis, Elsevier, vol. 186(C).
    10. Achim Ahrens & Christian B. Hansen & Mark E. Schaffer, 2020. "lassopack: Model selection and prediction with regularized regression in Stata," Stata Journal, StataCorp LP, vol. 20(1), pages 176-235, March.
    11. Yang Ning & Sida Peng & Jing Tao, 2020. "Doubly Robust Semiparametric Difference-in-Differences Estimators with High-Dimensional Data," Papers 2009.03151, arXiv.org.
    12. Harold D. Chiang, 2018. "Many Average Partial Effects: with An Application to Text Regression," Papers 1812.09397, arXiv.org, revised Jan 2022.
    13. A. Belloni & V. Chernozhukov & I. Fernández‐Val & C. Hansen, 2017. "Program Evaluation and Causal Inference With High‐Dimensional Data," Econometrica, Econometric Society, vol. 85, pages 233-298, January.
    14. Timothy B. Armstrong & Michal Kolesár & Soonwoo Kwon, 2020. "Bias-Aware Inference in Regularized Regression Models," Working Papers 2020-2, Princeton University. Economics Department.
    15. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2018. "Double/debiased machine learning for treatment and structural parameters," Econometrics Journal, Royal Economic Society, vol. 21(1), pages 1-68, February.
    16. Farrell, Max H., 2015. "Robust inference on average treatment effects with possibly more covariates than observations," Journal of Econometrics, Elsevier, vol. 189(1), pages 1-23.
    17. Guo, Zijian & Kang, Hyunseung & Cai, T. Tony & Small, Dylan S., 2018. "Testing endogeneity with high dimensional covariates," Journal of Econometrics, Elsevier, vol. 207(1), pages 175-187.
    18. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney Newey & James Robins, 2016. "Double/Debiased Machine Learning for Treatment and Causal Parameters," Papers 1608.00060, arXiv.org, revised Nov 2024.
    19. Kaspar Wuthrich & Ying Zhu, 2019. "Omitted variable bias of Lasso-based inference methods: A finite sample analysis," Papers 1903.08704, arXiv.org, revised Sep 2021.
    20. Qingliang Fan & Zijian Guo & Ziwei Mei, 2022. "A Heteroskedasticity-Robust Overidentifying Restriction Test with High-Dimensional Covariates," Papers 2205.00171, arXiv.org, revised May 2024.
