Printed from https://ideas.repec.org/a/oup/biomet/v101y2014i1p37-55..html

Information criteria for variable selection under sparsity

Author

Listed:
  • Maarten Jansen

Abstract

The optimization of an information criterion in a variable selection procedure leads to an additional bias, which can be substantial for sparse, high-dimensional data. One can compensate for the bias by applying shrinkage while estimating within the selected models. This paper presents modified information criteria for use in variable selection and estimation without shrinkage. The analysis motivating the modified criteria follows two routes. The first, which we explore for signal-plus-noise observations only, proceeds by comparing estimators with and without shrinkage. The second, discussed for general regression models, describes the optimization or selection bias as a double-sided effect, which we call a mirror effect: among the numerous insignificant variables, those with large, noisy values appear more valuable than an arbitrary variable, while in fact they carry more noise than an arbitrary variable. The mirror effect is investigated for Akaike’s information criterion and for Mallows’ Cp, with special attention paid to the latter criterion as a stopping rule in a least-angle regression routine. The result is a new stopping rule, which focuses not on the quality of a lasso shrinkage selection but on the least-squares estimator without shrinkage within the same selection.
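
The selection bias described in the abstract can be illustrated numerically. The following is a minimal sketch, not the paper's derivation or its corrected criterion: it assumes a signal-plus-noise model with known noise level, selects the k largest observations in absolute value, keeps them without shrinkage, and compares a naive Cp-style estimate (one common normalisation, chosen here as an assumption) with the true squared error. The function name `cp_versus_true_error`, the sample size, and the signal values are all hypothetical choices made for illustration.

```python
import numpy as np


def cp_versus_true_error(y, beta, k, sigma=1.0):
    """Compare a naive Cp-style estimate with the true error.

    The k largest |y_i| are selected and estimated by y_i itself
    (no shrinkage); all other coefficients are set to zero.
    """
    n = y.size
    order = np.argsort(-np.abs(y))        # indices sorted by decreasing |y_i|
    selected = np.zeros(n, dtype=bool)
    selected[order[:k]] = True
    beta_hat = np.where(selected, y, 0.0)

    # Naive Cp-style estimate: residual sum of squares plus a 2k penalty,
    # which ignores that the selection itself was driven by the noise.
    rss = np.sum((y - beta_hat) ** 2)
    cp = rss / n + 2.0 * k * sigma**2 / n - sigma**2

    # True mean squared error, available here only because beta is simulated.
    true_err = np.mean((beta_hat - beta) ** 2)
    return cp, true_err


# Sparse signal (assumed values): 10 nonzero coefficients out of 1000.
rng = np.random.default_rng(0)
n = 1000
beta = np.zeros(n)
beta[:10] = 5.0
y = beta + rng.standard_normal(n)

for k in (0, 10, 50, 100):
    cp, err = cp_versus_true_error(y, beta, k)
    print(f"k={k:4d}  naive Cp={cp:7.3f}  true error={err:7.3f}")
```

Once k exceeds the number of true signals, the selected noise values are exactly the large ones, so removing them makes the residuals look small while the estimates themselves carry above-average noise. Under these assumptions the naive estimate should fall well below the true error, which is the double-sided gap the abstract calls the mirror effect.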

Suggested Citation

  • Maarten Jansen, 2014. "Information criteria for variable selection under sparsity," Biometrika, Biometrika Trust, vol. 101(1), pages 37-55.
  • Handle: RePEc:oup:biomet:v:101:y:2014:i:1:p:37-55.

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1093/biomet/ast055
    Download Restriction: Access to full text is restricted to subscribers.


    Citations

    Cited by:

    1. Wei, Yuting & Wang, Qihua & Duan, Xiaogang & Qin, Jing, 2021. "Bias-corrected Kullback–Leibler distance criterion based model selection with covariables missing at random," Computational Statistics & Data Analysis, Elsevier, vol. 160(C).
    2. Ali Charkhi & Gerda Claeskens, 2018. "Asymptotic post-selection inference for the Akaike information criterion," Biometrika, Biometrika Trust, vol. 105(3), pages 645-664.
    3. Bastien Marquis & Maarten Jansen, 2022. "Information criteria bias correction for group selection," Statistical Papers, Springer, vol. 63(5), pages 1387-1414, October.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Deming Lin & Tianhui Gong & Wenbin Liu & Martin Meyer, 2020. "An entropy-based measure for the evolution of h index research," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 2283-2298, December.
    2. Gaviria-Marin, Magaly & Merigó, José M. & Baier-Fuentes, Hugo, 2019. "Knowledge management: A global examination based on bibliometric analysis," Technological Forecasting and Social Change, Elsevier, vol. 140(C), pages 194-220.
    3. David L. Anderson & John Tressler, 2013. "The Relevance of the “h-” and “g-” Index to Economics in the Context of A Nation-Wide Research Evaluation Scheme: The New Zealand Case," Economic Papers, The Economic Society of Australia, vol. 32(1), pages 81-94, March.
    4. Ash Mohammad Abbas, 2011. "Weighted indices for evaluating the quality of research with multiple authorship," Scientometrics, Springer;Akadémiai Kiadó, vol. 88(1), pages 107-131, July.
    5. Soutar, Geoffrey N. & Murphy, Jamie, 2009. "Journal quality: A Google Scholar analysis," Australasian marketing journal, Elsevier, vol. 17(3), pages 150-153.
    6. Thor, Andreas & Marx, Werner & Leydesdorff, Loet & Bornmann, Lutz, 2016. "Introducing CitedReferencesExplorer (CRExplorer): A program for reference publication year spectroscopy with cited references standardization," Journal of Informetrics, Elsevier, vol. 10(2), pages 503-515.
    7. Perc, Matjaž, 2010. "Zipf’s law and log-normal distributions in measures of scientific output across fields and institutions: 40 years of Slovenia’s research as an example," Journal of Informetrics, Elsevier, vol. 4(3), pages 358-364.
    8. Aniruddha Maiti & Sai Shi & Slobodan Vucetic, 2023. "An ablation study on the use of publication venue quality to rank computer science departments," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(8), pages 4197-4218, August.
    9. Lathabai, Hiran H., 2020. "ψ-index: A new overall productivity index for actors of science and technology," Journal of Informetrics, Elsevier, vol. 14(4).
    10. Hui-Zhen Fu & Yuh-Shan Ho, 2013. "Comparison of independent research of China’s top universities using bibliometric indicators," Scientometrics, Springer;Akadémiai Kiadó, vol. 96(1), pages 259-276, July.
    11. Ying Huang & Wolfgang Glänzel & Lin Zhang, 2021. "Tracing the development of mapping knowledge domains," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 6201-6224, July.
    12. Aditi S. Saha & Rakesh D. Raut & Vinay Surendra Yadav & Abhijit Majumdar, 2022. "Blockchain Changing the Outlook of the Sustainable Food Supply Chain to Achieve Net Zero?," Sustainability, MDPI, vol. 14(24), pages 1-21, December.
    13. Serge Galam, 2011. "Tailor based allocations for multiple authorship: a fractional gh-index," Scientometrics, Springer;Akadémiai Kiadó, vol. 89(1), pages 365-379, October.
    14. Nikolaos A. Kazakis, 2014. "Bibliometric evaluation of the research performance of the Greek civil engineering departments in National and European context," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(1), pages 505-525, October.
    15. Paul Handro & Bogdan Dima, 2024. "Analyzing Financial Markets Efficiency: Insights from a Bibliometric and Content Review," Journal of Financial Studies, Institute of Financial Studies, vol. 16(9), pages 119-175, May.
    16. Parul Khurana & Kiran Sharma, 2022. "Impact of h-index on author’s rankings: an improvement to the h-index for lower-ranked authors," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(8), pages 4483-4498, August.
    17. Suddin Lada & Brahim Chekima & Rudy Ansar & Mohamad Isa Abdul Jalil & Lim Ming Fook & Caroline Geetha & Mohamed Bouteraa & Mohd Rahimie Abdul Karim, 2023. "Islamic Economy and Sustainability: A Bibliometric Analysis Using R," Sustainability, MDPI, vol. 15(6), pages 1-21, March.
    18. Leo Egghe, 2014. "Comments on “year-based h-type indicators”," Scientometrics, Springer;Akadémiai Kiadó, vol. 98(3), pages 2369-2370, March.
    19. Judit Bar-Ilan & Mark Levene, 2015. "The hw-rank: an h-index variant for ranking web pages," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(3), pages 2247-2253, March.
    20. Rey-Martí, Andrea & Ribeiro-Soriano, Domingo & Palacios-Marqués, Daniel, 2016. "A bibliometric analysis of social entrepreneurship," Journal of Business Research, Elsevier, vol. 69(5), pages 1651-1655.
