IDEAS home Printed from https://ideas.repec.org/a/eee/csdana/v56y2012i6p1319-1332.html
   My bibliography  Save this article

On efficient calculations for Bayesian variable selection

Author

Listed:
  • Ruggieri, Eric
  • Lawrence, Charles E.

Abstract

We describe an efficient, exact Bayesian algorithm applicable to both variable selection and model averaging problems. A fully Bayesian approach provides a more complete characterization of the posterior ensemble of possible sub-models, but presents a computational challenge as the number of candidate variables increases. While several approximation techniques have been developed to deal with problems that contain a large numbers of candidate variables, including BMA, IBMA, MCMC and Gibbs Sampling approaches, here we focus on improving the time complexity of exact inference using a recursive algorithm (Exact Bayesian Inference in Regression, or EBIR) that uses components of one sub-model to rapidly generate another and prove that its time complexity is O(m2), where m is the number candidate variables. Testing against simulated data shows that EBIR significantly reduces compute time without sacrificing accuracy, while comparisons to the results obtained by MCMC approaches on the Crime and Punishment data set show that model averaging yields improved predictive performance over two model selection approaches. In addition, we show that finite mixtures of centroid solutions provide a means to better characterize the shape of multimodal posterior spaces than any individual model. Finally, we describe how the BIC approximations employed in the BMA and IBMA algorithms can be replaced by an EBIR calculation of equal time complexity and illustrate the departure of the BIC approximation from the exact Bayesian inference of EBIR.

Suggested Citation

  • Ruggieri, Eric & Lawrence, Charles E., 2012. "On efficient calculations for Bayesian variable selection," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1319-1332.
  • Handle: RePEc:eee:csdana:v:56:y:2012:i:6:p:1319-1332
    DOI: 10.1016/j.csda.2011.09.026
    as

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947311003574
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2011.09.026?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    as
    1. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    2. Fernandez, Carmen & Ley, Eduardo & Steel, Mark F. J., 2001. "Benchmark priors for Bayesian model averaging," Journal of Econometrics, Elsevier, vol. 100(2), pages 381-427, February.
    3. Jiahua Chen & Zehua Chen, 2008. "Extended Bayesian information criteria for model selection with large model spaces," Biometrika, Biometrika Trust, vol. 95(3), pages 759-771.
    4. Park, Trevor & Casella, George, 2008. "The Bayesian Lasso," Journal of the American Statistical Association, American Statistical Association, vol. 103, pages 681-686, June.
    5. Gary S. Becker, 1974. "Crime and Punishment: An Economic Approach," NBER Chapters, in: Essays in the Economics of Crime and Punishment, pages 1-54, National Bureau of Economic Research, Inc.
    6. Ehrlich, Isaac, 1975. "The Deterrent Effect of Capital Punishment: A Question of Life and Death," American Economic Review, American Economic Association, vol. 65(3), pages 397-417, June.
    7. George J. Stigler, 1974. "The Optimum Enforcement of Laws," NBER Chapters, in: Essays in the Economics of Crime and Punishment, pages 55-67, National Bureau of Economic Research, Inc.
    8. Eicher, Theo S. & Papageorgiou, Chris & Roehn, Oliver, 2007. "Unraveling the fortunes of the fortunate: An Iterative Bayesian Model Averaging (IBMA) approach," Journal of Macroeconomics, Elsevier, vol. 29(3), pages 494-514, September.
    9. Carmen Fernandez & Eduardo Ley & Mark F. J. Steel, 2001. "Model uncertainty in cross-country growth regressions," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 16(5), pages 563-576.
    10. Smith, Michael & Kohn, Robert, 1996. "Nonparametric regression using Bayesian variable selection," Journal of Econometrics, Elsevier, vol. 75(2), pages 317-343, December.
    11. Efron, Bradley, 2004. "Large-Scale Simultaneous Hypothesis Testing: The Choice of a Null Hypothesis," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 96-104, January.
    12. Jianqing Fan & Jinchi Lv, 2008. "Sure independence screening for ultrahigh dimensional feature space," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 70(5), pages 849-911, November.
    13. Chris Hans, 2009. "Bayesian lasso regression," Biometrika, Biometrika Trust, vol. 96(4), pages 835-845.
    14. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    15. Wang, Hansheng, 2009. "Forward Regression for Ultra-High Dimensional Variable Screening," Journal of the American Statistical Association, American Statistical Association, vol. 104(488), pages 1512-1524.
    16. Ehrlich, Isaac, 1973. "Participation in Illegitimate Activities: A Theoretical and Empirical Investigation," Journal of Political Economy, University of Chicago Press, vol. 81(3), pages 521-565, May-June.
    17. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.
    18. Fan J. & Li R., 2001. "Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 1348-1360, December.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Sweata Sen & Damitri Kundu & Kiranmoy Das, 2023. "Variable selection for categorical response: a comparative study," Computational Statistics, Springer, vol. 38(2), pages 809-826, June.
    2. Wei Sun & Lexin Li, 2012. "Multiple Loci Mapping via Model-free Variable Selection," Biometrics, The International Biometric Society, vol. 68(1), pages 12-22, March.
    3. Ricardo P. Masini & Marcelo C. Medeiros & Eduardo F. Mendes, 2023. "Machine learning advances for time series forecasting," Journal of Economic Surveys, Wiley Blackwell, vol. 37(1), pages 76-111, February.
    4. Dai, Linlin & Chen, Kani & Sun, Zhihua & Liu, Zhenqiu & Li, Gang, 2018. "Broken adaptive ridge regression and its asymptotic properties," Journal of Multivariate Analysis, Elsevier, vol. 168(C), pages 334-351.
    5. Xiangyu Wang & Chenlei Leng, 2016. "High dimensional ordinary least squares projection for screening variables," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 78(3), pages 589-611, June.
    6. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    7. Gilles Celeux & Mohammed El Anbari & Jean-Michel Marin & Christian P. Robert, 2010. "Regularization in Regression : Comparing Bayesian and Frequentist Methods in a Poorly Informative Situation," Working Papers 2010-43, Center for Research in Economics and Statistics.
    8. Philip Kostov & Thankom Arun & Samuel Annim, 2014. "Financial Services to the Unbanked: the case of the Mzansi intervention in South Africa," Contemporary Economics, University of Economics and Human Sciences in Warsaw., vol. 8(2), June.
    9. Li, Xingxiang & Cheng, Guosheng & Wang, Liming & Lai, Peng & Song, Fengli, 2017. "Ultrahigh dimensional feature screening via projection," Computational Statistics & Data Analysis, Elsevier, vol. 114(C), pages 88-104.
    10. Liming Wang & Xingxiang Li & Xiaoqing Wang & Peng Lai, 2022. "Unified mean-variance feature screening for ultrahigh-dimensional regression," Computational Statistics, Springer, vol. 37(4), pages 1887-1918, September.
    11. Posch, Konstantin & Arbeiter, Maximilian & Pilz, Juergen, 2020. "A novel Bayesian approach for variable selection in linear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    12. Huiwen Wang & Ruiping Liu & Shanshan Wang & Zhichao Wang & Gilbert Saporta, 2020. "Ultra-high dimensional variable screening via Gram–Schmidt orthogonalization," Computational Statistics, Springer, vol. 35(3), pages 1153-1170, September.
    13. Christis Katsouris, 2023. "High Dimensional Time Series Regression Models: Applications to Statistical Learning Methods," Papers 2308.16192, arXiv.org.
    14. Bergersen Linn Cecilie & Glad Ingrid K. & Lyng Heidi, 2011. "Weighted Lasso with Data Integration," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-29, August.
    15. van Erp, Sara & Oberski, Daniel L. & Mulder, Joris, 2018. "Shrinkage priors for Bayesian penalized regression," OSF Preprints cg8fq, Center for Open Science.
    16. Matthew Gentzkow & Bryan T. Kelly & Matt Taddy, 2017. "Text as Data," NBER Working Papers 23276, National Bureau of Economic Research, Inc.
    17. Cox Lwaka Tamba & Yuan-Li Ni & Yuan-Ming Zhang, 2017. "Iterative sure independence screening EM-Bayesian LASSO algorithm for multi-locus genome-wide association studies," PLOS Computational Biology, Public Library of Science, vol. 13(1), pages 1-20, January.
    18. Chen Xu & Jiahua Chen, 2014. "The Sparse MLE for Ultrahigh-Dimensional Feature Screening," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 109(507), pages 1257-1269, September.
    19. Howard D. Bondell & Brian J. Reich, 2012. "Consistent High-Dimensional Bayesian Variable Selection via Penalized Credible Regions," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(500), pages 1610-1624, December.
    20. Zhihua Sun & Yi Liu & Kani Chen & Gang Li, 2022. "Broken adaptive ridge regression for right-censored survival data," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 74(1), pages 69-91, February.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:eee:csdana:v:56:y:2012:i:6:p:1319-1332. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Catherine Liu (email available below). General contact details of provider: http://www.elsevier.com/locate/csda .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.