
Robust adaptive LASSO in high-dimensional logistic regression

Author

Listed:
  • Ayanendranath Basu

    (Indian Statistical Institute)

  • Abhik Ghosh

    (Indian Statistical Institute)

  • Maria Jaenada

    (Statistics and O.R., Complutense University of Madrid)

  • Leandro Pardo

    (Statistics and O.R., Complutense University of Madrid)

Abstract

Penalized logistic regression is extremely useful for binary classification with a large number of covariates (larger than the sample size) and has several real-life applications, including genomic disease classification. However, existing methods based on the likelihood loss function are sensitive to data contamination and other noise; hence, robust methods are needed for stable and more accurate inference. In this paper, we propose a family of robust estimators for sparse logistic models utilizing the popular density power divergence based loss function and general adaptively weighted LASSO penalties. We study the local robustness of the proposed estimators through their influence function and also derive their oracle properties and asymptotic distribution. With extensive empirical illustrations, we demonstrate the significantly improved performance of our proposed estimators over existing ones, with particular gain in robustness. Our proposal is finally applied to analyse four different real datasets for cancer classification, obtaining robust and accurate models that simultaneously perform gene selection and patient classification.
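
As a rough illustration of the kind of objective the abstract describes, the sketch below combines the standard density power divergence (DPD) loss for Bernoulli responses with an adaptively weighted LASSO penalty. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the tuning values (alpha, lam, gamma) and the weight form 1/|beta_init|^gamma (the classical adaptive LASSO choice of Zou, 2006) are all assumptions made here, and the paper's general weights may differ.

```python
import numpy as np


def sigmoid(z):
    """Logistic function mapping linear predictors to probabilities."""
    return 1.0 / (1.0 + np.exp(-z))


def dpd_loss(beta, X, y, alpha):
    """Average DPD loss for a logistic model, valid for alpha > 0.

    Per observation: sum over y' in {0, 1} of f(y')^(1 + alpha),
    minus (1 + 1/alpha) * f(y_obs)^alpha, with f the Bernoulli pmf.
    """
    pi = sigmoid(X @ beta)
    model_term = pi ** (1 + alpha) + (1 - pi) ** (1 + alpha)
    # Model probability assigned to the response actually observed.
    f_obs = np.where(y == 1, pi, 1 - pi)
    return np.mean(model_term - (1 + 1 / alpha) * f_obs ** alpha)


def adaptive_lasso_penalty(beta, beta_init, lam, gamma=1.0, eps=1e-8):
    """Weighted L1 penalty; weights come from an initial estimate (assumed form)."""
    weights = 1.0 / (np.abs(beta_init) + eps) ** gamma
    return lam * np.sum(weights * np.abs(beta))


def objective(beta, X, y, beta_init, alpha=0.5, lam=0.1):
    """Penalized robust objective: DPD loss plus adaptive LASSO penalty."""
    return dpd_loss(beta, X, y, alpha) + adaptive_lasso_penalty(beta, beta_init, lam)
```

Larger values of alpha downweight observations that the model considers unlikely, which is the source of the robustness, while the weights shrink coefficients with small initial estimates more strongly. Minimizing this non-smooth objective needs a suitable solver (for example a proximal or coordinate-wise scheme), which is not shown here.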

Suggested Citation

  • Ayanendranath Basu & Abhik Ghosh & Maria Jaenada & Leandro Pardo, 2024. "Robust adaptive LASSO in high-dimensional logistic regression," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 33(5), pages 1217-1249, November.
  • Handle: RePEc:spr:stmapp:v:33:y:2024:i:5:d:10.1007_s10260-024-00760-2
    DOI: 10.1007/s10260-024-00760-2

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10260-024-00760-2
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Ghosh, Abhik & Mandal, Abhijit & Martín, Nirian & Pardo, Leandro, 2016. "Influence analysis of robust Wald-type tests," Journal of Multivariate Analysis, Elsevier, vol. 147(C), pages 102-126.
    2. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    3. A. Basu & A. Mandal & N. Martin & L. Pardo, 2013. "Testing statistical hypotheses based on the density power divergence," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 65(2), pages 319-348, April.
    4. A. Basu & A. Mandal & N. Martin & L. Pardo, 2015. "Robust tests for the equality of two normal means based on the density power divergence," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 78(5), pages 611-634, July.
    5. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    6. Filzmoser, Peter & Maronna, Ricardo & Werner, Mark, 2008. "Outlier identification in high dimensions," Computational Statistics & Data Analysis, Elsevier, vol. 52(3), pages 1694-1711, January.
    7. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    8. Pi Guo & Fangfang Zeng & Xiaomin Hu & Dingmei Zhang & Shuming Zhu & Yu Deng & Yuantao Hao, 2015. "Improved Variable Selection Algorithm Using a LASSO-Type Penalty, with an Application to Assessing Hepatitis B Infection Relevant Factors in Community Residents," PLOS ONE, Public Library of Science, vol. 10(7), pages 1-23, July.
    9. Marco Avella-Medina & Elvezio Ronchetti, 2018. "Robust and consistent variable selection in high-dimensional generalized linear models," Biometrika, Biometrika Trust, vol. 105(1), pages 31-44.
    10. Yingying Fan & Cheng Yong Tang, 2013. "Tuning parameter selection in high dimensional penalized likelihood," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 75(3), pages 531-552, June.
    11. Xueqin Wang & Yunlu Jiang & Mian Huang & Heping Zhang, 2013. "Robust Variable Selection With Exponential Squared Loss," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 108(502), pages 632-643, June.
    12. S. le Cessie & J. C. van Houwelingen, 1992. "Ridge Estimators in Logistic Regression," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 41(1), pages 191-201, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Elena McDonald & Xin Wang, 2024. "Generalized regression estimators with concave penalties and a comparison to lasso type estimators," METRON, Springer;Sapienza Università di Roma, vol. 82(2), pages 213-239, August.
    2. Hui Xiao & Yiguo Sun, 2019. "On Tuning Parameter Selection in Model Selection and Model Averaging: A Monte Carlo Study," JRFM, MDPI, vol. 12(3), pages 1-16, June.
    3. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    4. Hui Xiao & Yiguo Sun, 2020. "Forecasting the Returns of Cryptocurrency: A Model Averaging Approach," JRFM, MDPI, vol. 13(11), pages 1-15, November.
    5. Christopher J Greenwood & George J Youssef & Primrose Letcher & Jacqui A Macdonald & Lauryn J Hagg & Ann Sanson & Jenn Mcintosh & Delyse M Hutchinson & John W Toumbourou & Matthew Fuller-Tyszkiewicz &, 2020. "A comparison of penalised regression methods for informing the selection of predictive markers," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-14, November.
    6. Naimoli, Antonio, 2022. "Modelling the persistence of Covid-19 positivity rate in Italy," Socio-Economic Planning Sciences, Elsevier, vol. 82(PA).
    7. Camila Epprecht & Dominique Guegan & Álvaro Veiga & Joel Correa da Rosa, 2017. "Variable selection and forecasting via automated methods for linear models: LASSO/adaLASSO and Autometrics," Post-Print halshs-00917797, HAL.
    8. Peter Bühlmann & Jacopo Mandozzi, 2014. "High-dimensional variable screening and bias in subsequent inference, with an empirical comparison," Computational Statistics, Springer, vol. 29(3), pages 407-430, June.
    9. Peter Martey Addo & Dominique Guegan & Bertrand Hassani, 2018. "Credit Risk Analysis Using Machine and Deep Learning Models," Risks, MDPI, vol. 6(2), pages 1-20, April.
    10. Capanu, Marinela & Giurcanu, Mihai & Begg, Colin B. & Gönen, Mithat, 2023. "Subsampling based variable selection for generalized linear models," Computational Statistics & Data Analysis, Elsevier, vol. 184(C).
    11. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    12. Yunxiao Chen & Xiaoou Li & Jingchen Liu & Zhiliang Ying, 2017. "Regularized Latent Class Analysis with Application in Cognitive Diagnosis," Psychometrika, Springer;The Psychometric Society, vol. 82(3), pages 660-692, September.
    13. Zeyu Bian & Erica E. M. Moodie & Susan M. Shortreed & Sahir Bhatnagar, 2023. "Variable selection in regression‐based estimation of dynamic treatment regimes," Biometrics, The International Biometric Society, vol. 79(2), pages 988-999, June.
    14. Li, Xinjue & Zboňáková, Lenka & Wang, Weining & Härdle, Wolfgang Karl, 2019. "Combining Penalization and Adaption in High Dimension with Application in Bond Risk Premia Forecasting," IRTG 1792 Discussion Papers 2019-030, Humboldt University of Berlin, International Research Training Group 1792 "High Dimensional Nonstationary Time Series".
    15. Jianqing Fan & Yang Feng & Jiancheng Jiang & Xin Tong, 2016. "Feature Augmentation via Nonparametrics and Selection (FANS) in High-Dimensional Classification," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(513), pages 275-287, March.
    16. Jingxuan Luo & Lili Yue & Gaorong Li, 2023. "Overview of High-Dimensional Measurement Error Regression Models," Mathematics, MDPI, vol. 11(14), pages 1-22, July.
    17. Alexandre Belloni & Victor Chernozhukov & Christian Hansen, 2014. "High-Dimensional Methods and Inference on Structural and Treatment Effects," Journal of Economic Perspectives, American Economic Association, vol. 28(2), pages 29-50, Spring.
    18. Zhang, Tonglin, 2024. "Variables selection using L0 penalty," Computational Statistics & Data Analysis, Elsevier, vol. 190(C).
    19. Zhixuan Fu & Shuangge Ma & Haiqun Lin & Chirag R. Parikh & Bingqing Zhou, 2017. "Penalized Variable Selection for Multi-center Competing Risks Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 9(2), pages 379-405, December.
