
A flexible shrinkage operator for fussy grouped variable selection

Author

Listed:
  • Xiaoli Gao

Abstract

Existing grouped variable selection methods rely heavily on prior group information, so they may be unreliable if an incorrect group assignment is used. In this paper, we propose a family of shrinkage variable selection operators that control the k-th largest norm (KAN). The proposed KAN method naturally exhibits flexible group-wise variable selection even when no correct prior group information is available. We also construct a group KAN shrinkage operator using a composite of KAN constraints. Neither ignoring nor relying completely on prior group information, the group KAN method has the flexibility to control within-group strength and can therefore reduce the effect of incorrect group information. Finally, we investigate an unbiased estimator of the degrees of freedom for (group) KAN estimates in the framework of Stein’s unbiased risk estimation. Extensive simulation studies and real data analysis are performed to demonstrate the advantage of KAN and group KAN over the LASSO and group LASSO, respectively.
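
To make the abstract's opening point concrete, the following is a minimal sketch of the two baseline shrinkage operators it names: coordinate-wise soft-thresholding (LASSO) and block soft-thresholding (group LASSO). The abstract does not define the KAN operator, so it is not implemented here. The function names, the toy vector, and the threshold value are illustrative assumptions, not taken from the paper.

```python
import math
from typing import Hashable, List, Sequence


def soft_threshold(z: Sequence[float], lam: float) -> List[float]:
    """LASSO shrinkage: move each coordinate toward zero by lam."""
    out = []
    for v in z:
        if v > lam:
            out.append(v - lam)
        elif v < -lam:
            out.append(v + lam)
        else:
            out.append(0.0)
    return out


def group_soft_threshold(
    z: Sequence[float], groups: Sequence[Hashable], lam: float
) -> List[float]:
    """Group LASSO shrinkage: scale each group by max(0, 1 - lam / ||z_g||_2)."""
    out = [0.0] * len(z)
    for label in set(groups):
        idx = [i for i, g in enumerate(groups) if g == label]
        norm = math.sqrt(sum(z[i] ** 2 for i in idx))
        scale = max(0.0, 1.0 - lam / norm) if norm > 0 else 0.0
        for i in idx:
            out[i] = scale * z[i]
    return out


# Toy input (assumed): two strong coordinates followed by two weak ones.
z = [3.0, 2.5, 0.2, -0.1]
lam = 1.0

lasso_fit = soft_threshold(z, lam)
# Grouping that matches the signal pattern: the weak group is removed as a block.
matched_fit = group_soft_threshold(z, [0, 0, 1, 1], lam)
# Mismatched grouping: each weak coordinate is paired with a strong one,
# so the group norm stays above lam and neither weak coordinate is zeroed.
mismatched_fit = group_soft_threshold(z, [0, 1, 0, 1], lam)

print(lasso_fit, matched_fit, mismatched_fit)
```

In this toy case the group operator zeroes the weak coordinates only when the supplied grouping matches the signal pattern, which is the sensitivity to group assignment that the abstract describes.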

Suggested Citation

  • Xiaoli Gao, 2018. "A flexible shrinkage operator for fussy grouped variable selection," Statistical Papers, Springer, vol. 59(3), pages 985-1008, September.
  • Handle: RePEc:spr:stpapr:v:59:y:2018:i:3:d:10.1007_s00362-016-0799-y
    DOI: 10.1007/s00362-016-0799-y

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00362-016-0799-y
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s00362-016-0799-y?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    Citations

    Cited by:

    1. Reetika Sarkar & Sithija Manage & Xiaoli Gao, 2024. "Stable Variable Selection for High-Dimensional Genomic Data with Strong Correlations," Annals of Data Science, Springer, vol. 11(4), pages 1139-1164, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Liu, Jianyu & Yu, Guan & Liu, Yufeng, 2019. "Graph-based sparse linear discriminant analysis for high-dimensional classification," Journal of Multivariate Analysis, Elsevier, vol. 171(C), pages 250-269.
    2. Qingliang Fan & Yaqian Wu, 2020. "Endogenous Treatment Effect Estimation with some Invalid and Irrelevant Instruments," Papers 2006.14998, arXiv.org.
    3. Luke Mosley & Idris A. Eckley & Alex Gibberd, 2022. "Sparse temporal disaggregation," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 2203-2233, October.
    4. David Degras, 2021. "Sparse group fused lasso for model segmentation: a hybrid approach," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 15(3), pages 625-671, September.
    5. Zhang, Shucong & Zhou, Yong, 2018. "Variable screening for ultrahigh dimensional heterogeneous data via conditional quantile correlations," Journal of Multivariate Analysis, Elsevier, vol. 165(C), pages 1-13.
    6. Bang, Sungwan & Jhun, Myoungshic, 2012. "Simultaneous estimation and factor selection in quantile regression via adaptive sup-norm regularization," Computational Statistics & Data Analysis, Elsevier, vol. 56(4), pages 813-826.
    7. Gabriel E Hoffman & Benjamin A Logsdon & Jason G Mezey, 2013. "PUMA: A Unified Framework for Penalized Multiple Regression Analysis of GWAS Data," PLOS Computational Biology, Public Library of Science, vol. 9(6), pages 1-19, June.
    8. Hirose, Kei & Tateishi, Shohei & Konishi, Sadanori, 2013. "Tuning parameter selection in sparse regression modeling," Computational Statistics & Data Analysis, Elsevier, vol. 59(C), pages 28-40.
    9. Yawei He & Zehua Chen, 2016. "The EBIC and a sequential procedure for feature selection in interactive linear models with high-dimensional data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 68(1), pages 155-180, February.
    10. Ting‐Huei Chen & Hanaa Boughal, 2021. "A penalized structural equation modeling method accounting for secondary phenotypes for variable selection on genetically regulated expression from PrediXcan for Alzheimer's disease," Biometrics, The International Biometric Society, vol. 77(1), pages 362-371, March.
    11. Akira Shinkyu, 2023. "Forward Selection for Feature Screening and Structure Identification in Varying Coefficient Models," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 85(1), pages 485-511, February.
    12. She, Yiyuan, 2012. "An iterative algorithm for fitting nonconvex penalized generalized linear models with grouped predictors," Computational Statistics & Data Analysis, Elsevier, vol. 56(10), pages 2976-2990.
    13. Minami, Kentaro, 2020. "Degrees of freedom in submodular regularization: A computational perspective of Stein’s unbiased risk estimate," Journal of Multivariate Analysis, Elsevier, vol. 175(C).
    14. Weihua Zhao & Riquan Zhang & Jicai Liu, 2013. "Robust variable selection for the varying coefficient model based on composite L 1 -- L 2 regression," Journal of Applied Statistics, Taylor & Francis Journals, vol. 40(9), pages 2024-2040, September.
    15. Kaixu Yang & Tapabrata Maiti, 2022. "Ultrahigh‐dimensional generalized additive model: Unified theory and methods," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(3), pages 917-942, September.
    16. Lian, Heng, 2014. "Semiparametric Bayesian information criterion for model selection in ultra-high dimensional additive models," Journal of Multivariate Analysis, Elsevier, vol. 123(C), pages 304-310.
    17. Yanhang Zhang & Junxian Zhu & Jin Zhu & Xueqin Wang, 2023. "A Splicing Approach to Best Subset of Groups Selection," INFORMS Journal on Computing, INFORMS, vol. 35(1), pages 104-119, January.
    18. Zhihua Sun & Yi Liu & Kani Chen & Gang Li, 2022. "Broken adaptive ridge regression for right-censored survival data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 74(1), pages 69-91, February.
    19. Zhou Yu & Yuexiao Dong & Li-Xing Zhu, 2016. "Trace Pursuit: A General Framework for Model-Free Variable Selection," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(514), pages 813-821, April.
    20. Jiang, He & Luo, Shihua & Dong, Yao, 2021. "Simultaneous feature selection and clustering based on square root optimization," European Journal of Operational Research, Elsevier, vol. 289(1), pages 214-231.
