Printed from https://ideas.repec.org/p/pra/mprapa/88502.html

Concentration Based Inference in High Dimensional Generalized Regression Models (I: Statistical Guarantees)

Author

Listed:
  • Zhu, Ying

Abstract

We develop simple and non-asymptotically justified methods for hypothesis testing about the coefficients ($\theta^{*}\in\mathbb{R}^{p}$) in high dimensional generalized regression models where $p$ can exceed the sample size. Given a function $h:\,\mathbb{R}^{p}\mapsto\mathbb{R}^{m}$, we consider $H_{0}:\,h(\theta^{*})=\mathbf{0}_{m}$ against $H_{1}:\,h(\theta^{*})\neq\mathbf{0}_{m}$, where $m$ can be any integer in $\left[1,\,p\right]$ and $h$ can be nonlinear in $\theta^{*}$. Our test statistic is based on the sample ``quasi-score'' vector evaluated at an estimate $\hat{\theta}_{\alpha}$ that satisfies $h(\hat{\theta}_{\alpha})=\mathbf{0}_{m}$, where $\alpha$ is the prespecified Type I error. The key component reflecting the dimension complexity in our non-asymptotic thresholds exploits the concentration phenomenon in Lipschitz functions: it uses a Monte-Carlo approximation to mimic the expectation around which the statistic concentrates, and it automatically captures the dependencies between the coordinates. We provide probabilistic guarantees in terms of the Type I and Type II errors for the quasi-score test. Confidence regions are also constructed for the population quasi-score vector evaluated at $\theta^{*}$. The first set of our results is specific to the standard Gaussian linear regression models; the second set allows for reasonably flexible forms of non-Gaussian responses, heteroscedastic noise, and nonlinearity in the regression coefficients, while only requiring the correct specification of $\mathbb{E}\left(Y_{i}|X_{i}\right)$s. The novelty of our methods is that their validity does not rely on good behavior of $\left\Vert \hat{\theta}_{\alpha}-\theta^{*}\right\Vert _{2}$ (or even $n^{-1/2}\left\Vert X\left(\hat{\theta}_{\alpha}-\theta^{*}\right)\right\Vert _{2}$ in the linear regression case), either non-asymptotically or asymptotically.
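The concentration idea named in the abstract can be illustrated with a minimal sketch. This is not the paper's procedure or its thresholds. It only shows, under assumptions made here for illustration, how a Monte Carlo estimate of an expectation can be combined with the standard Gaussian concentration bound for Lipschitz functions. The assumptions are: Gaussian noise with a known standard deviation `sigma`, a fixed design matrix `X`, and the sup-norm of $X^{T}W/n$ as the statistic. The function names `mc_sup_norm_expectation` and `concentration_threshold` are hypothetical, and the sketch ignores the simulation error in the Monte Carlo estimate.

```python
import numpy as np


def mc_sup_norm_expectation(X, sigma=1.0, n_draws=2000, seed=0):
    """Monte Carlo estimate of E || X^T W / n ||_inf for W ~ N(0, sigma^2 I_n).

    Hypothetical helper for illustration; the simulated draws inherit the
    dependence between coordinates induced by the columns of X.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    W = rng.normal(scale=sigma, size=(n, n_draws))  # one noise vector per column
    sup_norms = np.abs(X.T @ W / n).max(axis=0)     # sup-norm for each draw
    return sup_norms.mean()


def concentration_threshold(X, alpha, sigma=1.0, n_draws=2000, seed=0):
    """Illustrative threshold: Monte Carlo expectation plus a deviation term.

    The map w -> || X^T w / n ||_inf is Lipschitz in the Euclidean norm with
    constant L = max_j ||X_j||_2 / n, so Gaussian concentration gives
    P(f(W) - E f(W) >= t) <= exp(-t^2 / (2 sigma^2 L^2)).
    Setting the right-hand side to alpha yields the deviation term below.
    """
    n = X.shape[0]
    lipschitz = np.linalg.norm(X, axis=0).max() / n
    expectation = mc_sup_norm_expectation(X, sigma, n_draws, seed)
    deviation = sigma * lipschitz * np.sqrt(2.0 * np.log(1.0 / alpha))
    return expectation + deviation


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    X = rng.normal(size=(50, 100))  # p = 100 exceeds n = 50
    print(concentration_threshold(X, alpha=0.05))
```

Because the deviation term comes from a worst-case tail bound, a threshold built this way is typically conservative: in simulations under the stated assumptions, the statistic exceeds it far less often than $\alpha$.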

Suggested Citation

  • Zhu, Ying, 2018. "Concentration Based Inference in High Dimensional Generalized Regression Models (I: Statistical Guarantees)," MPRA Paper 88502, University Library of Munich, Germany.
  • Handle: RePEc:pra:mprapa:88502
    Download full text from publisher

    File URL: https://mpra.ub.uni-muenchen.de/88502/1/MPRA_paper_88502.pdf
    File Function: original version
    Download Restriction: no

    File URL: https://mpra.ub.uni-muenchen.de/89281/1/MPRA_paper_89281.pdf
    File Function: revised version
    Download Restriction: no

    File URL: https://mpra.ub.uni-muenchen.de/94645/1/MPRA_paper_94645.pdf
    File Function: revised version
    Download Restriction: no

    References listed on IDEAS

    1. Joel L. Horowitz, 2017. "Non-asymptotic inference in instrumental variables estimation," CeMMAP working papers CWP46/17, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    2. Victor Chernozhukov & Denis Chetverikov & Kengo Kato, 2012. "Gaussian approximations and multiplier bootstrap for maxima of sums of high-dimensional random vectors," Papers 1212.6906, arXiv.org, revised Jan 2018.
    3. Cun-Hui Zhang & Stephanie S. Zhang, 2014. "Confidence intervals for low dimensional parameters in high dimensional linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 76(1), pages 217-242, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hansen, Christian & Liao, Yuan, 2019. "The Factor-Lasso And K-Step Bootstrap Approach For Inference In High-Dimensional Economic Applications," Econometric Theory, Cambridge University Press, vol. 35(3), pages 465-509, June.
    2. Alexandre Belloni & Victor Chernozhukov & Kengo Kato, 2013. "Uniform post selection inference for LAD regression and other z-estimation problems," CeMMAP working papers CWP74/13, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    3. Victor Chernozhukov & Denis Chetverikov & Mert Demirer & Esther Duflo & Christian Hansen & Whitney K. Newey, 2016. "Double machine learning for treatment and causal parameters," CeMMAP working papers 49/16, Institute for Fiscal Studies.
    4. Philipp Bach & Victor Chernozhukov & Malte S. Kurz & Martin Spindler & Sven Klaassen, 2021. "DoubleML -- An Object-Oriented Implementation of Double Machine Learning in R," Papers 2103.09603, arXiv.org, revised Jun 2024.
    5. Victor Chernozhukov & Christian Hansen & Martin Spindler, 2015. "Post-Selection and Post-Regularization Inference in Linear Models with Many Controls and Instruments," American Economic Review, American Economic Association, vol. 105(5), pages 486-490, May.
    6. Victor Chernozhukov & Whitney K Newey & Rahul Singh, 2022. "Debiased machine learning of global and local parameters using regularized Riesz representers [Semiparametric instrumental variable estimation of treatment response models]," The Econometrics Journal, Royal Economic Society, vol. 25(3), pages 576-601.
    7. Jelena Bradic & Victor Chernozhukov & Whitney K. Newey & Yinchu Zhu, 2019. "Minimax Semiparametric Learning With Approximate Sparsity," Papers 1912.12213, arXiv.org, revised Aug 2022.
    8. Timothy B. Armstrong & Michal Kolesár & Soonwoo Kwon, 2020. "Bias-Aware Inference in Regularized Regression Models," Working Papers 2020-2, Princeton University, Economics Department.
    9. Harold D. Chiang, 2018. "Many Average Partial Effects: with An Application to Text Regression," Papers 1812.09397, arXiv.org, revised Jan 2022.
    10. Alexander Giessing & Jianqing Fan, 2020. "Bootstrapping $\ell_p$-Statistics in High Dimensions," Papers 2006.13099, arXiv.org, revised Aug 2020.
    11. Victor Chernozhukov & Wolfgang K. Hardle & Chen Huang & Weining Wang, 2018. "LASSO-Driven Inference in Time and Space," Papers 1806.05081, arXiv.org, revised May 2020.
    12. Victor Chernozhukov & Christian Hansen & Martin Spindler, 2015. "Valid Post-Selection and Post-Regularization Inference: An Elementary, General Approach," Annual Review of Economics, Annual Reviews, vol. 7(1), pages 649-688, August.
    13. Victor Chernozhukov & Whitney K. Newey & Rahul Singh, 2022. "Automatic Debiased Machine Learning of Causal and Structural Effects," Econometrica, Econometric Society, vol. 90(3), pages 967-1027, May.
    14. Christian Hansen & Damian Kozbur & Sanjog Misra, 2016. "Targeted undersmoothing," ECON - Working Papers 282, Department of Economics - University of Zurich, revised Apr 2018.
    15. Philipp Bach & Sven Klaassen & Jannis Kueck & Martin Spindler, 2020. "Estimation and Uniform Inference in Sparse High-Dimensional Additive Models," Papers 2004.01623, arXiv.org, revised Apr 2024.
    16. Alexandre Belloni & Victor Chernozhukov & Denis Chetverikov & Christian Hansen & Kengo Kato, 2018. "High-Dimensional Econometrics and Regularized GMM," Papers 1806.01888, arXiv.org, revised Jun 2018.
    17. Jana Janková & Rajen D. Shah & Peter Bühlmann & Richard J. Samworth, 2020. "Goodness‐of‐fit testing in high dimensional generalized linear models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(3), pages 773-795, July.
    18. Shengchun Kong & Zhuqing Yu & Xianyang Zhang & Guang Cheng, 2021. "High‐dimensional robust inference for Cox regression models using desparsified Lasso," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 48(3), pages 1068-1095, September.
    19. Adamek, Robert & Smeekes, Stephan & Wilms, Ines, 2023. "Lasso inference for high-dimensional time series," Journal of Econometrics, Elsevier, vol. 235(2), pages 1114-1143.
    20. Victor Chernozhukov & Whitney K. Newey & James Robins, 2018. "Double/de-biased machine learning using regularized Riesz representers," CeMMAP working papers CWP15/18, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.

    More about this item

    Keywords

    Nonasymptotic inference; concentration inequalities; high dimensional inference; hypothesis testing; confidence sets;

    JEL classification:

    • C1 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General
    • C12 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Hypothesis Testing: General
    • C2 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables
    • C21 - Mathematical and Quantitative Methods - - Single Equation Models; Single Variables - - - Cross-Sectional Models; Spatial Models; Treatment Effect Models

