Printed from https://ideas.repec.org/a/eee/csdana/v152y2020ics0167947320301225.html

An adapted linear discriminant analysis with variable selection for the classification in high-dimension, and an application to medical data

Author

Listed:
  • Le, Khuyen T.
  • Chaux, Caroline
  • Richard, Frédéric J.P.
  • Guedj, Eric

Abstract

The classification of normally distributed data is considered in a high-dimensional setting where variables outnumber observations. Under the assumption that the inverse covariance matrices (the precision matrices) are the same across all groups, linear discriminant analysis (LDA) is adapted by plugging in a sparse estimate of this common matrix. Furthermore, a variable selection procedure is developed based on the graph associated with the estimated precision matrix: a discriminant capacity is defined for each connected component of the graph, and the variables of the most discriminant components are kept. The adapted LDA and the variable selection procedure are both evaluated on synthetic data and applied to real data from PET brain images for the classification of patients with Alzheimer’s disease.
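
To make the method concrete, a minimal Python sketch of the pipeline is given below. It is an illustration under assumptions, not the authors' implementation: scikit-learn's GraphicalLasso stands in for the paper's sparse precision estimator, and the per-component discriminant capacity is a simple Fisher-type proxy (between-class over within-class variance, summed over the component's variables), which need not match the paper's definition.

import numpy as np
from scipy.sparse.csgraph import connected_components
from sklearn.covariance import GraphicalLasso

def sparse_lda_sketch(X, y, alpha=0.1, n_keep=2):
    """Sketch: sparse common precision matrix -> variable graph ->
    component-wise variable selection -> LDA rule with a sparse plug-in."""
    classes = np.unique(y)
    means = {c: X[y == c].mean(axis=0) for c in classes}
    priors = {c: float((y == c).mean()) for c in classes}

    # 1. Common sparse precision matrix, estimated from within-class centred data
    #    (GraphicalLasso is an assumed stand-in for the paper's estimator).
    X_centred = np.vstack([X[y == c] - means[c] for c in classes])
    Omega = GraphicalLasso(alpha=alpha).fit(X_centred).precision_

    # 2. Graph on the variables: an edge wherever an off-diagonal entry is nonzero.
    adj = (np.abs(Omega) > 1e-8).astype(int)
    np.fill_diagonal(adj, 0)
    n_comp, labels = connected_components(adj, directed=False)

    # 3. Proxy "discriminant capacity" of each connected component:
    #    sum over its variables of between-class / within-class variance.
    grand = X.mean(axis=0)
    between = sum((y == c).sum() * (means[c] - grand) ** 2 for c in classes)
    within = sum(((X[y == c] - means[c]) ** 2).sum(axis=0) for c in classes)
    var_score = between / (within + 1e-12)
    capacity = np.array([var_score[labels == k].sum() for k in range(n_comp)])

    # 4. Keep the variables of the n_keep most discriminant components.
    kept = np.where(np.isin(labels, np.argsort(capacity)[::-1][:n_keep]))[0]

    # 5. Linear discriminant rule on the kept variables, plugging in the sparse
    #    precision estimate restricted to those variables.
    Om = Omega[np.ix_(kept, kept)]

    def predict(X_new):
        scores = np.column_stack([
            X_new[:, kept] @ Om @ means[c][kept]
            - 0.5 * means[c][kept] @ Om @ means[c][kept]
            + np.log(priors[c])
            for c in classes
        ])
        return classes[scores.argmax(axis=1)]

    return predict, kept

Because the kept variables are whole connected components, the estimated precision matrix is block-diagonal with respect to them, so its restriction in step 5 is exactly the precision of their marginal distribution. Typical use would be predict, kept = sparse_lda_sketch(X_train, y_train) followed by y_hat = predict(X_test) (hypothetical variable names).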

Suggested Citation

  • Le, Khuyen T. & Chaux, Caroline & Richard, Frédéric J.P. & Guedj, Eric, 2020. "An adapted linear discriminant analysis with variable selection for the classification in high-dimension, and an application to medical data," Computational Statistics & Data Analysis, Elsevier, vol. 152(C).
  • Handle: RePEc:eee:csdana:v:152:y:2020:i:c:s0167947320301225
    DOI: 10.1016/j.csda.2020.107031

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947320301225
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2020.107031?utm_source=ideas
    LibKey link: If access is restricted and your library uses this service, LibKey will redirect you to a page where you can use your library subscription to access this item.

    As access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Daniela M. Witten & Robert Tibshirani, 2009. "Covariance‐regularized regression and classification for high dimensional problems," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 71(3), pages 615-636, June.
    2. Jianqing Fan & Yang Feng & Xin Tong, 2012. "A road to classification in high dimensional space: the regularized optimal affine discriminant," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 74(4), pages 745-771, September.
    3. Ming Yuan & Yi Lin, 2007. "Model selection and estimation in the Gaussian graphical model," Biometrika, Biometrika Trust, vol. 94(1), pages 19-35.
    4. Patrick L. Combettes & Jean-Christophe Pesquet, 2011. "Proximal Splitting Methods in Signal Processing," Springer Optimization and Its Applications, in: Heinz H. Bauschke & Regina S. Burachik & Patrick L. Combettes & Veit Elser & D. Russell Luke & Henry Wolkowicz (ed.), Fixed-Point Algorithms for Inverse Problems in Science and Engineering, chapter 0, pages 185-212, Springer.
    5. Qing Mai & Hui Zou & Ming Yuan, 2012. "A direct approach to sparse discriminant analysis in ultra-high dimensions," Biometrika, Biometrika Trust, vol. 99(1), pages 29-42.
    6. Wang, Cheng & Jiang, Binyan, 2020. "An efficient ADMM algorithm for high dimensional precision matrix estimation via penalized quadratic loss," Computational Statistics & Data Analysis, Elsevier, vol. 142(C).
    Full references (including those not matched with items on IDEAS)

    Citations

    Citations are extracted by the CitEc Project; subscribe to its RSS feed for this item.


    Cited by:

    1. Michael Fop & Pierre-Alexandre Mattei & Charles Bouveyron & Thomas Brendan Murphy, 2022. "Unobserved classes and extra variables in high-dimensional discriminant analysis," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 16(1), pages 55-92, March.
    2. Rasoul Lotfi & Davood Shahsavani & Mohammad Arashi, 2022. "Classification in High Dimension Using the Ledoit–Wolf Shrinkage Method," Mathematics, MDPI, vol. 10(21), pages 1-13, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zeyu Wu & Cheng Wang & Weidong Liu, 2023. "A unified precision matrix estimation framework via sparse column-wise inverse operator under weak sparsity," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 75(4), pages 619-648, August.
    2. Aaron J Molstad & Adam J Rothman, 2018. "Shrinking characteristics of precision matrix estimators," Biometrika, Biometrika Trust, vol. 105(3), pages 563-574.
    3. Liu, Jianyu & Yu, Guan & Liu, Yufeng, 2019. "Graph-based sparse linear discriminant analysis for high-dimensional classification," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 250-269.
    4. Pan, Yuqing & Mai, Qing, 2020. "Efficient computation for differential network analysis with applications to quadratic discriminant analysis," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    5. Guan Yu & Yufeng Liu, 2016. "Sparse Regression Incorporating Graphical Structure Among Predictors," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(514), pages 707-720, April.
    6. He, Yong & Zhang, Xinsheng & Wang, Pingping, 2016. "Discriminant analysis on high dimensional Gaussian copula model," Statistics & Probability Letters, Elsevier, vol. 117(C), pages 100-112.
    7. Gao, Zhenguo & Wang, Xinye & Kang, Xiaoning, 2023. "Ensemble LDA via the modified Cholesky decomposition," Computational Statistics & Data Analysis, Elsevier, vol. 188(C).
    8. Sihai Dave Zhao, 2017. "Integrative genetic risk prediction using non-parametric empirical Bayes classification," Biometrics, The International Biometric Society, vol. 73(2), pages 582-592, June.
    9. Vahe Avagyan, 2022. "Precision matrix estimation using penalized Generalized Sylvester matrix equation," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(4), pages 950-967, December.
    10. van Wieringen, Wessel N. & Peeters, Carel F.W., 2016. "Ridge estimation of inverse covariance matrices from high-dimensional data," Computational Statistics & Data Analysis, Elsevier, vol. 103(C), pages 284-303.
    11. Oda, Ryoya & Suzuki, Yuya & Yanagihara, Hirokazu & Fujikoshi, Yasunori, 2020. "A consistent variable selection method in high-dimensional canonical discriminant analysis," Journal of Multivariate Analysis, Elsevier, vol. 175(C).
    12. Jianqing Fan & Yang Feng & Jiancheng Jiang & Xin Tong, 2016. "Feature Augmentation via Nonparametrics and Selection (FANS) in High-Dimensional Classification," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(513), pages 275-287, March.
    13. Luo, Shan & Chen, Zehua, 2020. "A procedure of linear discrimination analysis with detected sparsity structure for high-dimensional multi-class classification," Journal of Multivariate Analysis, Elsevier, vol. 179(C).
    14. L. A. Stefanski & Yichao Wu & Kyle White, 2014. "Variable Selection in Nonparametric Classification Via Measurement Error Model Selection Likelihoods," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(506), pages 574-589, June.
    15. Irina Gaynanova & James G. Booth & Martin T. Wells, 2016. "Simultaneous Sparse Estimation of Canonical Vectors in the p ≫ N Setting," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(514), pages 696-706, April.
    16. Sheng, Ying & Wang, Qihua, 2019. "Simultaneous variable selection and class fusion with penalized distance criterion based classifiers," Computational Statistics & Data Analysis, Elsevier, vol. 133(C), pages 138-152.
    17. Guillaume Sagnol & Edouard Pauwels, 2019. "An unexpected connection between Bayes A-optimal designs and the group lasso," Statistical Papers, Springer, vol. 60(2), pages 565-584, April.
    18. Byrd, Michael & Nghiem, Linh H. & McGee, Monnie, 2021. "Bayesian regularization of Gaussian graphical models with measurement error," Computational Statistics & Data Analysis, Elsevier, vol. 156(C).
    19. Ernest K. Ryu & Yanli Liu & Wotao Yin, 2019. "Douglas–Rachford splitting and ADMM for pathological convex optimization," Computational Optimization and Applications, Springer, vol. 74(3), pages 747-778, December.
    20. Duo Jiang & Thomas Sharpton & Yuan Jiang, 2021. "Microbial Interaction Network Estimation via Bias-Corrected Graphical Lasso," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 13(2), pages 329-350, July.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:csdana:v:152:y:2020:i:c:s0167947320301225. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do so here. This allows you to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form.

    If you know of missing items citing this one, you can help us create those links by adding the relevant references in the same way as above, for each referring item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: http://www.elsevier.com/locate/csda.

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.