
Penalised robust estimators for sparse and high-dimensional linear models

Author

Listed:
  • Umberto Amato

    (Italian National Research Council)

  • Anestis Antoniadis

    (Université Joseph Fourier; University of Cape Town)

  • Italia De Feis

    (Italian National Research Council)

  • Irene Gijbels

    (KU Leuven)

Abstract

We introduce a new class of robust M-estimators for performing simultaneous parameter estimation and variable selection in high-dimensional regression models. We first explain the motivation for the key ingredient of our procedures, which is inspired by regularization methods used in wavelet thresholding for noisy signal processing. The derived penalized estimation procedures are shown theoretically to enjoy the oracle property, both in the classical finite-dimensional case and in the high-dimensional case where the number of variables p is not fixed but can grow with the sample size n, and to achieve optimal asymptotic rates of convergence. A fast accelerated proximal gradient algorithm of coordinate descent type is proposed and implemented for computing the estimates; it appears to be surprisingly efficient in solving the corresponding regularization problems, including the ultra-high-dimensional case where $p \gg n$. Finally, a very extensive simulation study and some real data analysis compare several recent M-estimation procedures with the ones proposed in the paper and demonstrate their utility and advantages.
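
To make the idea of a penalised robust M-estimator computed by a proximal gradient scheme more concrete, here is a minimal sketch in pure Python. It is not the authors' algorithm. It assumes a Huber loss with tuning constant 1.345, swaps in a plain L1 penalty (whose proximal operator is the soft-thresholding rule familiar from wavelet shrinkage) for the penalties studied in the paper, and uses a plain, unaccelerated proximal gradient iteration. The function names (`huber_psi`, `soft_threshold`, `penalized_huber`, `simulate`) and all parameter values are hypothetical and chosen only for illustration.

```python
import random


def huber_psi(r, delta=1.345):
    """Derivative of the Huber loss: the residual r clipped to [-delta, delta]."""
    return max(-delta, min(delta, r))


def soft_threshold(z, t):
    """Proximal operator of t * |z|: shrink z towards zero by t."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def penalized_huber(X, y, lam, delta=1.345, n_iter=500):
    """Illustrative only: minimise (1/n) * sum_i huber(y_i - x_i.b) + lam * ||b||_1
    with a plain (unaccelerated) proximal gradient iteration."""
    n, p = len(X), len(X[0])
    # The Huber derivative is 1-Lipschitz, so the gradient of the smooth part
    # has Lipschitz constant at most lambda_max(X'X)/n <= trace(X'X)/n.
    lipschitz = sum(v * v for row in X for v in row) / n
    step = 1.0 / lipschitz
    b = [0.0] * p
    for _ in range(n_iter):
        resid = [y[i] - sum(X[i][j] * b[j] for j in range(p)) for i in range(n)]
        psi = [huber_psi(r, delta) for r in resid]
        grad = [-sum(X[i][j] * psi[i] for i in range(n)) / n for j in range(p)]
        # Gradient step on the robust loss, then soft thresholding for the penalty.
        b = [soft_threshold(b[j] - step * grad[j], step * lam) for j in range(p)]
    return b


def simulate(n=200, p=5, seed=0):
    """Hypothetical toy data: two active coefficients, every tenth response shifted."""
    rng = random.Random(seed)
    beta = [3.0, -2.0] + [0.0] * (p - 2)
    X = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    y = [sum(x * b for x, b in zip(row, beta)) + rng.gauss(0.0, 0.5) for row in X]
    for i in range(0, n, 10):
        y[i] += 20.0  # gross outliers in the response
    return X, y, beta


X, y, beta_true = simulate()
beta_hat = penalized_huber(X, y, lam=0.1)
print("true:", beta_true)
print("estimate:", [round(v, 3) for v in beta_hat])
```

In this sketch the clipping in `huber_psi` bounds the influence of each outlying response, while the thresholding step can set small coefficients exactly to zero, which is how estimation and variable selection happen simultaneously. The L1 penalty shrinks large coefficients as well, which is one reason non-convex penalties are often preferred when the oracle property is the goal.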

Suggested Citation

  • Umberto Amato & Anestis Antoniadis & Italia De Feis & Irene Gijbels, 2021. "Penalised robust estimators for sparse and high-dimensional linear models," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(1), pages 1-48, March.
  • Handle: RePEc:spr:stmapp:v:30:y:2021:i:1:d:10.1007_s10260-020-00511-z
    DOI: 10.1007/s10260-020-00511-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10260-020-00511-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s10260-020-00511-z?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. Jelena Bradic & Jianqing Fan & Weiwei Wang, 2011. "Penalized composite quasi‐likelihood for ultrahigh dimensional variable selection," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 73(3), pages 325-349, June.
    2. Anestis Antoniadis & Irène Gijbels & Mila Nikolova, 2011. "Penalized likelihood regression for generalized linear models with non-quadratic penalties," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 63(3), pages 585-615, June.
    3. NESTEROV, Yu., 2007. "Gradient methods for minimizing composite objective function," LIDAM Discussion Papers CORE 2007076, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE).
    4. Pace, R Kelley & Gilley, Otis W, 1997. "Using the Spatial Configuration of the Data to Improve Estimation," The Journal of Real Estate Finance and Economics, Springer, vol. 14(3), pages 333-340, May.
    5. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    6. A. Belloni & V. Chernozhukov & L. Wang, 2011. "Square-root lasso: pivotal recovery of sparse signals via conic programming," Biometrika, Biometrika Trust, vol. 98(4), pages 791-806.
    7. Wang, Hansheng & Li, Guodong & Jiang, Guohua, 2007. "Robust Regression Shrinkage and Consistent Variable Selection Through the LAD-Lasso," Journal of Business & Economic Statistics, American Statistical Association, vol. 25, pages 347-355, July.
    8. Chang, Xiao-Wen & Qu, Leming, 2004. "Wavelet estimation of partially linear models," Computational Statistics & Data Analysis, Elsevier, vol. 47(1), pages 31-48, August.
    9. Ziqi Chen & Man-Lai Tang & Wei Gao & Ning-Zhong Shi, 2014. "New Robust Variable Selection Methods for Linear Regression Models," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 41(3), pages 725-741, September.
    10. Andrea Cerioli & Marco Riani & Anthony C. Atkinson & Aldo Corbellini, 2018. "The power of monitoring: how to make the most of a contaminated multivariate sample," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 27(4), pages 559-587, December.
    11. Jianqing Fan & Quefeng Li & Yuyan Wang, 2017. "Estimation of high dimensional mean regression in the absence of symmetry and light tail assumptions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(1), pages 247-265, January.
    12. Antoniadis A. & Fan J., 2001. "Regularization of Wavelet Approximations," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 939-967, September.
    13. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    14. She, Yiyuan & Owen, Art B., 2011. "Outlier Detection Using Nonconvex Penalized Regression," Journal of the American Statistical Association, American Statistical Association, vol. 106(494), pages 626-639.
    15. Gijbels, I. & Vrinssen, I., 2015. "Robust nonnegative garrote variable selection in linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 85(C), pages 1-22.
    16. Wang, Lie, 2013. "The L1 penalized LAD estimator for high dimensional linear regression," Journal of Multivariate Analysis, Elsevier, vol. 120(C), pages 135-151.
    17. Smucler, Ezequiel & Yohai, Victor J., 2017. "Robust and sparse estimators for linear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 111(C), pages 116-130.
    18. Harrison, David Jr. & Rubinfeld, Daniel L., 1978. "Hedonic housing prices and the demand for clean air," Journal of Environmental Economics and Management, Elsevier, vol. 5(1), pages 81-102, March.
    19. Arslan, Olcay, 2012. "Weighted LAD-LASSO method for robust parameter estimation and variable selection in regression," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1952-1965.
    20. Tingni Sun & Cun-Hui Zhang, 2012. "Scaled sparse linear regression," Biometrika, Biometrika Trust, vol. 99(4), pages 879-898.
    21. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    22. Wang, Hansheng & Leng, Chenlei, 2007. "Unified LASSO Estimation by Least Squares Approximation," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 1039-1048, September.
    23. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.
    24. Khan, Jafar A. & Van Aelst, Stefan & Zamar, Ruben H., 2007. "Robust Linear Model Selection Based on Least Angle Regression," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 1289-1299, December.

    Citations



    Cited by:

    1. Umberto Amato & Anestis Antoniadis & Italia Feis & Irène Gijbels, 2022. "Penalized wavelet estimation and robust denoising for irregular spaced data," Computational Statistics, Springer, vol. 37(4), pages 1621-1651, September.
    2. Thompson, Ryan, 2022. "Robust subset selection," Computational Statistics & Data Analysis, Elsevier, vol. 169(C).
    3. Daniela De Canditiis & Italia De Feis, 2021. "Anomaly Detection in Multichannel Data Using Sparse Representation in RADWT Frames," Mathematics, MDPI, vol. 9(11), pages 1-26, June.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mingqiu Wang & Guo-Liang Tian, 2016. "Robust group non-convex estimations for high-dimensional partially linear models," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 28(1), pages 49-67, March.
    2. Xiaofei Wu & Rongmei Liang & Hu Yang, 2022. "Penalized and constrained LAD estimation in fixed and high dimension," Statistical Papers, Springer, vol. 63(1), pages 53-95, February.
    3. Kepplinger, David, 2023. "Robust variable selection and estimation via adaptive elastic net S-estimators for linear regression," Computational Statistics & Data Analysis, Elsevier, vol. 183(C).
    4. Diego Vidaurre & Concha Bielza & Pedro Larrañaga, 2013. "A Survey of L1 Regression," International Statistical Review, International Statistical Institute, vol. 81(3), pages 361-387, December.
    5. Sophie Lambert-Lacroix & Laurent Zwald, 2016. "The adaptive BerHu penalty in robust regression," Journal of Nonparametric Statistics, Taylor & Francis Journals, vol. 28(3), pages 487-514, September.
    6. Abdul Wahid & Dost Muhammad Khan & Ijaz Hussain, 2017. "Robust Adaptive Lasso method for parameter’s estimation and variable selection in high-dimensional sparse models," PLOS ONE, Public Library of Science, vol. 12(8), pages 1-17, August.
    7. Sermpinis, Georgios & Tsoukas, Serafeim & Zhang, Ping, 2018. "Modelling market implied ratings using LASSO variable selection techniques," Journal of Empirical Finance, Elsevier, vol. 48(C), pages 19-35.
    8. Fan, Rui & Lee, Ji Hyung & Shin, Youngki, 2023. "Predictive quantile regression with mixed roots and increasing dimensions: The ALQR approach," Journal of Econometrics, Elsevier, vol. 237(2).
    9. Xuan Liu & Jianbao Chen, 2021. "Variable Selection for the Spatial Autoregressive Model with Autoregressive Disturbances," Mathematics, MDPI, vol. 9(12), pages 1-20, June.
    10. Barbato, Michele & Ceselli, Alberto, 2024. "Mathematical programming for simultaneous feature selection and outlier detection under l1 norm," European Journal of Operational Research, Elsevier, vol. 316(3), pages 1070-1084.
    11. Zemin Zheng & Jie Zhang & Yang Li, 2022. "L 0 -Regularized Learning for High-Dimensional Additive Hazards Regression," INFORMS Journal on Computing, INFORMS, vol. 34(5), pages 2762-2775, September.
    12. Bartosz Uniejewski, 2024. "Regularization for electricity price forecasting," Papers 2404.03968, arXiv.org.
    13. Peter Bühlmann & Jacopo Mandozzi, 2014. "High-dimensional variable screening and bias in subsequent inference, with an empirical comparison," Computational Statistics, Springer, vol. 29(3), pages 407-430, June.
    14. Victor Chernozhukov & Christian Hansen & Yuan Liao, 2015. "A lava attack on the recovery of sums of dense and sparse signals," CeMMAP working papers CWP56/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    15. Yongjin Li & Qingzhao Zhang & Qihua Wang, 2017. "Penalized estimation equation for an extended single-index model," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 69(1), pages 169-187, February.
    16. Yinjun Chen & Hao Ming & Hu Yang, 2024. "Efficient variable selection for high-dimensional multiplicative models: a novel LPRE-based approach," Statistical Papers, Springer, vol. 65(6), pages 3713-3737, August.
    17. Zanhua Yin, 2020. "Variable selection for sparse logistic regression," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 83(7), pages 821-836, October.
    18. Umberto Amato & Anestis Antoniadis & Italia Feis & Irène Gijbels, 2022. "Penalized wavelet estimation and robust denoising for irregular spaced data," Computational Statistics, Springer, vol. 37(4), pages 1621-1651, September.
    19. T. Cai & J. Huang & L. Tian, 2009. "Regularized Estimation for the Accelerated Failure Time Model," Biometrics, The International Biometric Society, vol. 65(2), pages 394-404, June.
    20. Schneider Ulrike & Wagner Martin, 2012. "Catching Growth Determinants with the Adaptive Lasso," German Economic Review, De Gruyter, vol. 13(1), pages 71-85, February.
