
Modified versions of the Bayesian Information Criterion for sparse Generalized Linear Models

Author

Listed:
  • Zak-Szatkowska, Malgorzata
  • Bogdan, Malgorzata

Abstract

Classical model selection criteria, such as the Bayesian Information Criterion (BIC) and the Akaike Information Criterion (AIC), have a strong tendency to overestimate the number of regressors when the search is performed over a large number of potential explanatory variables. To address this overestimation, several modifications of the BIC have been proposed. These versions supplement the original BIC with prior distributions on the class of possible models. Three such modifications are presented and compared in the context of sparse Generalized Linear Models (GLMs). The related choices of priors are discussed, and conditions for the asymptotic equivalence of these criteria are provided. The performance of the modified versions of the BIC is illustrated with an extensive simulation study and a real data analysis. Simplified versions of the modified BIC, based on least squares regression, are also investigated.
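
To make the idea of adding a model prior to the BIC concrete, here is a minimal sketch using the extended BIC (EBIC) of Chen and Chen (2008), which adds the term 2γ log C(p, k) to the classical BIC for a model with k of p candidate regressors. It is shown only as a representative of this family, not as the article's own definitions. The function names and the log-likelihood values below are illustrative assumptions, not results from the article.

```python
import math


def bic(loglik, k, n):
    """Classical BIC: -2 * log-likelihood + k * log(n)."""
    return -2.0 * loglik + k * math.log(n)


def log_binom(p, k):
    """Natural log of the binomial coefficient C(p, k), via log-gamma."""
    return math.lgamma(p + 1) - math.lgamma(k + 1) - math.lgamma(p - k + 1)


def ebic(loglik, k, n, p, gamma=1.0):
    """EBIC (Chen and Chen, 2008): BIC + 2 * gamma * log C(p, k).

    The extra term corresponds to a prior that down-weights model sizes
    containing many candidate models; gamma = 0 recovers the classical BIC.
    """
    return bic(loglik, k, n) + 2.0 * gamma * log_binom(p, k)


# Hypothetical comparison: n = 200 observations, p = 1000 candidate regressors.
# The log-likelihoods are made-up numbers chosen for illustration only.
n, p = 200, 1000
small = {"k": 2, "loglik": -300.0}
large = {"k": 5, "loglik": -290.0}

for name, m in (("small", small), ("large", large)):
    print(name,
          "BIC =", round(bic(m["loglik"], m["k"], n), 2),
          "EBIC =", round(ebic(m["loglik"], m["k"], n, p), 2))
```

With these made-up inputs the classical BIC is lower for the larger model, while the EBIC is lower for the smaller one, which is the kind of correction for overestimation the abstract describes.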

Suggested Citation

  • Zak-Szatkowska, Malgorzata & Bogdan, Malgorzata, 2011. "Modified versions of the Bayesian Information Criterion for sparse Generalized Linear Models," Computational Statistics & Data Analysis, Elsevier, vol. 55(11), pages 2908-2924, November.
  • Handle: RePEc:eee:csdana:v:55:y:2011:i:11:p:2908-2924

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947311001459
    Download Restriction: Full text for ScienceDirect subscribers only.

    References listed on IDEAS

    1. Jiahua Chen & Zehua Chen, 2008. "Extended Bayesian information criteria for model selection with large model spaces," Biometrika, Biometrika Trust, vol. 95(3), pages 759-771.
    2. Ye, Gui-Bo & Xie, Xiaohui, 2011. "Split Bregman method for large scale fused Lasso," Computational Statistics & Data Analysis, Elsevier, vol. 55(4), pages 1552-1569, April.
    3. Crews, Hugh B. & Boos, Dennis D. & Stefanski, Leonard A., 2011. "FSR methods for second-order regression models," Computational Statistics & Data Analysis, Elsevier, vol. 55(6), pages 2026-2037, June.
    4. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    5. Kapetanios, George, 2007. "Variable selection in regression models using nonstandard optimisation of information criteria," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 4-15, September.
    6. Erhardt Vinzenz & Bogdan Malgorzata & Czado Claudia, 2010. "Locating Multiple Interacting Quantitative Trait Loci with the Zero-Inflated Generalized Poisson Regression," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-27, June.
    7. Małgorzata Bogdan & Florian Frommlet & Przemysław Biecek & Riyan Cheng & Jayanta K. Ghosh & R.W. Doerge, 2008. "Extending the Modified Bayesian Information Criterion (mBIC) to Dense Markers and Multiple Interval Mapping," Biometrics, The International Biometric Society, vol. 64(4), pages 1162-1169, December.
    8. Baierl, Andreas & Futschik, Andreas & Bogdan, Malgorzata & Biecek, Przemyslaw, 2007. "Locating multiple interacting quantitative trait loci using robust model selection," Computational Statistics & Data Analysis, Elsevier, vol. 51(12), pages 6423-6434, August.
    9. Marra, Giampiero & Wood, Simon N., 2011. "Practical variable selection for generalized additive models," Computational Statistics & Data Analysis, Elsevier, vol. 55(7), pages 2372-2387, July.
    10. Karl W. Broman & Terence P. Speed, 2002. "A model selection approach for the identification of quantitative trait loci in experimental crosses," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 64(4), pages 641-656, October.

    Citations

    Cited by:

    1. Frommlet Florian & Ljubic Ivana & Arnardóttir Helga Björk & Bogdan Malgorzata, 2012. "QTL Mapping Using a Memetic Algorithm with Modifications of BIC as Fitness Function," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(4), pages 1-26, May.
    2. Jian Huang & Yuling Jiao & Lican Kang & Jin Liu & Yanyan Liu & Xiliang Lu, 2022. "GSDAR: a fast Newton algorithm for $\ell_0$ regularized generalized linear models with statistical guarantee," Computational Statistics, Springer, vol. 37(1), pages 507-533, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Frommlet, Florian & Ruhaltinger, Felix & Twaróg, Piotr & Bogdan, Małgorzata, 2012. "Modified versions of Bayesian Information Criterion for genome-wide association studies," Computational Statistics & Data Analysis, Elsevier, vol. 56(5), pages 1038-1051.
    2. Guo, Jie & Tang, Manlai & Tian, Maozai & Zhu, Kai, 2013. "Variable selection in high-dimensional partially linear additive models for composite quantile regression," Computational Statistics & Data Analysis, Elsevier, vol. 65(C), pages 56-67.
    3. Ryan A. Peterson & Joseph E. Cavanaugh, 2022. "Ranked sparsity: a cogent regularization framework for selecting and estimating feature interactions and polynomials," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 106(3), pages 427-454, September.
    4. Yawei He & Zehua Chen, 2016. "The EBIC and a sequential procedure for feature selection in interactive linear models with high-dimensional data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 68(1), pages 155-180, February.
    5. Chun Wang, 2021. "Using Penalized EM Algorithm to Infer Learning Trajectories in Latent Transition CDM," Psychometrika, Springer;The Psychometric Society, vol. 86(1), pages 167-189, March.
    6. Kaixu Yang & Tapabrata Maiti, 2022. "Ultrahigh‐dimensional generalized additive model: Unified theory and methods," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 49(3), pages 917-942, September.
    7. Wang, Tao & Zhu, Lixing, 2011. "Consistent tuning parameter selection in high dimensional sparse linear regression," Journal of Multivariate Analysis, Elsevier, vol. 102(7), pages 1141-1151, August.
    8. Frommlet Florian & Ljubic Ivana & Arnardóttir Helga Björk & Bogdan Malgorzata, 2012. "QTL Mapping Using a Memetic Algorithm with Modifications of BIC as Fitness Function," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 11(4), pages 1-26, May.
    9. Erhardt Vinzenz & Bogdan Malgorzata & Czado Claudia, 2010. "Locating Multiple Interacting Quantitative Trait Loci with the Zero-Inflated Generalized Poisson Regression," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 9(1), pages 1-27, June.
    10. Xiaotong Shen & Wei Pan & Yunzhang Zhu & Hui Zhou, 2013. "On constrained and regularized high-dimensional regression," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 65(5), pages 807-832, October.
    11. Shan Luo & Zehua Chen, 2014. "Sequential Lasso Cum EBIC for Feature Selection With Ultra-High Dimensional Feature Space," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(507), pages 1229-1240, September.
    12. Li, Xinyi & Wang, Li & Nettleton, Dan, 2019. "Sparse model identification and learning for ultra-high-dimensional additive partially linear models," Journal of Multivariate Analysis, Elsevier, vol. 173(C), pages 204-228.
    13. Chenchen Ma & Jing Ouyang & Gongjun Xu, 2023. "Learning Latent and Hierarchical Structures in Cognitive Diagnosis Models," Psychometrika, Springer;The Psychometric Society, vol. 88(1), pages 175-207, March.
    14. Sakyajit Bhattacharya & Paul McNicholas, 2014. "A LASSO-penalized BIC for mixture model selection," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 8(1), pages 45-61, March.
    15. Qingliang Fan & Yaqian Wu, 2020. "Endogenous Treatment Effect Estimation with some Invalid and Irrelevant Instruments," Papers 2006.14998, arXiv.org.
    16. Zhao, Bangxin & Liu, Xin & He, Wenqing & Yi, Grace Y., 2021. "Dynamic tilted current correlation for high dimensional variable screening," Journal of Multivariate Analysis, Elsevier, vol. 182(C).
    17. Luke Mosley & Idris A. Eckley & Alex Gibberd, 2022. "Sparse temporal disaggregation," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 185(4), pages 2203-2233, October.
    18. Yunquan Song & Zitong Li & Minglu Fang, 2022. "Robust Variable Selection Based on Penalized Composite Quantile Regression for High-Dimensional Single-Index Models," Mathematics, MDPI, vol. 10(12), pages 1-17, June.
    19. Zhang, Shucong & Zhou, Yong, 2018. "Variable screening for ultrahigh dimensional heterogeneous data via conditional quantile correlations," Journal of Multivariate Analysis, Elsevier, vol. 165(C), pages 1-13.
    20. Luoying Yang & Tong Tong Wu, 2023. "Model‐based clustering of high‐dimensional longitudinal data via regularization," Biometrics, The International Biometric Society, vol. 79(2), pages 761-774, June.
