
Bayesian Variable Selection Under Collinearity

Author

Listed:
  • Joyee Ghosh
  • Andrew E. Ghattas

Abstract

In this article, we highlight some interesting facts about Bayesian variable selection methods for linear regression models in settings where the design matrix exhibits strong collinearity. We first demonstrate, via real data analysis and simulation studies, that summaries of the posterior distribution based on marginal and joint distributions may give conflicting results when assessing the importance of strongly correlated covariates. The natural question is which one should be used in practice. The simulation studies suggest that posterior inclusion probabilities and Bayes factors that evaluate the importance of correlated covariates jointly are more appropriate, and that some priors may be more adversely affected than others in such a setting. To better understand this phenomenon, we study some toy examples with Zellner's g-prior. The results show that strong collinearity may lead to a multimodal posterior distribution over models, in which case joint summaries are more appropriate than marginal summaries. Thus, we recommend a routine examination of the correlation matrix and calculation of the joint inclusion probabilities for correlated covariates, in addition to marginal inclusion probabilities, for assessing the importance of covariates in Bayesian variable selection.
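
The contrast between marginal and joint summaries described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes a fixed-g Zellner g-prior with g = n, a uniform prior over models, and the standard closed-form Bayes factor against the intercept-only model. It also takes "joint inclusion probability" to mean the posterior probability that at least one of the two correlated covariates is in the model. The data, the variable names, and the function `log_bf_g_prior` are all hypothetical.

```python
import itertools
import numpy as np


def log_bf_g_prior(y, X, gamma, g):
    """Log Bayes factor of model `gamma` vs. the intercept-only model
    under a fixed-g Zellner g-prior (assumed form, see lead-in)."""
    n = len(y)
    p_gamma = int(sum(gamma))
    if p_gamma == 0:
        return 0.0  # the null model compared with itself
    yc = y - y.mean()
    Xg = X[:, np.array(gamma, dtype=bool)]
    Xg = Xg - Xg.mean(axis=0)
    beta, *_ = np.linalg.lstsq(Xg, yc, rcond=None)
    resid = yc - Xg @ beta
    r2 = 1.0 - (resid @ resid) / (yc @ yc)
    return (0.5 * (n - 1 - p_gamma) * np.log1p(g)
            - 0.5 * (n - 1) * np.log1p(g * (1.0 - r2)))


# Simulated data: x1 and x2 are nearly collinear, x3 is pure noise.
rng = np.random.default_rng(0)
n = 100
z = rng.normal(size=n)
x1 = z + 0.05 * rng.normal(size=n)
x2 = z + 0.05 * rng.normal(size=n)
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])
y = 2.0 * z + rng.normal(size=n)

# Enumerate all 2^3 models and normalise under a uniform model prior.
models = list(itertools.product([0, 1], repeat=X.shape[1]))
log_bf = np.array([log_bf_g_prior(y, X, m, g=n) for m in models])
post = np.exp(log_bf - log_bf.max())  # subtract the max for numerical stability
post /= post.sum()

G = np.array(models)
marginal = G.T @ post  # marginal inclusion probability of each covariate
joint_x1_x2 = post[(G[:, 0] == 1) | (G[:, 1] == 1)].sum()  # x1 or x2 included

print("correlation(x1, x2):", np.corrcoef(x1, x2)[0, 1])
print("marginal inclusion probabilities:", marginal)
print("joint inclusion probability (x1 or x2):", joint_x1_x2)
```

In a setup like this, the posterior mass tends to split between models containing x1 and models containing x2, so each marginal inclusion probability can look modest even though the joint probability is close to one. By construction the joint probability can never fall below either marginal.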

Suggested Citation

  • Joyee Ghosh & Andrew E. Ghattas, 2015. "Bayesian Variable Selection Under Collinearity," The American Statistician, Taylor & Francis Journals, vol. 69(3), pages 165-173, August.
  • Handle: RePEc:taf:amstat:v:69:y:2015:i:3:p:165-173
    DOI: 10.1080/00031305.2015.1031827

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1080/00031305.2015.1031827
    Download Restriction: Access to full text is restricted to subscribers.

    References listed on IDEAS

    1. James O. Berger & German Molina, 2005. "Posterior model probabilities via path‐based pairwise priors," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 59(1), pages 3-15, February.
    2. Ghosh, Joyee & Clyde, Merlise A., 2011. "Rao–Blackwellization for Bayesian Variable Selection and Model Averaging in Linear and Binary Regression: A Novel Data Augmentation Approach," Journal of the American Statistical Association, American Statistical Association, vol. 106(495), pages 1041-1052.
    3. Fernandez, Carmen & Ley, Eduardo & Steel, Mark F. J., 2001. "Benchmark priors for Bayesian model averaging," Journal of Econometrics, Elsevier, vol. 100(2), pages 381-427, February.
    4. Brown P.J. & Fearn T & Vannucci M, 2001. "Bayesian Wavelet Regression on Curves With Application to a Spectroscopic Calibration Problem," Journal of the American Statistical Association, American Statistical Association, vol. 96, pages 398-408, June.
    5. Liang, Feng & Paulo, Rui & Molina, German & Clyde, Merlise A. & Berger, Jim O., 2008. "Mixtures of g Priors for Bayesian Variable Selection," Journal of the American Statistical Association, American Statistical Association, vol. 103, pages 410-423, March.
    6. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    7. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.

    Citations

    Cited by:

    1. Fouskakis, Dimitris & Ntzoufras, Ioannis & Perrakis, Konstantinos, 2020. "Variations of power-expected-posterior priors in normal regression models," Computational Statistics & Data Analysis, Elsevier, vol. 143(C).
    2. Sarsen Zhanabekov, 2022. "Robust determinants of the shadow economy," Bulletin of Economic Research, Wiley Blackwell, vol. 74(4), pages 1017-1052, October.
    3. Mark F. J. Steel, 2020. "Model Averaging and Its Use in Economics," Journal of Economic Literature, American Economic Association, vol. 58(3), pages 644-719, September.
    4. Drachal, Krzysztof, 2016. "Forecasting spot oil price in a dynamic model averaging framework — Have the determinants changed over time?," Energy Economics, Elsevier, vol. 60(C), pages 35-46.
    5. Camba-Méndez, Gonzalo & Werner, Thomas, 2017. "The inflation risk premium in the post-Lehman period," Working Paper Series 2033, European Central Bank.
    6. Lanzafame, Matteo & Felipe, Jesus & Sotocinal, Noli & Bayudan-Dacuycuy, Connie, 2016. "The Pillars of Potential Growth and the Role of Policy: A Panel Data Approach," ADB Economics Working Paper Series 482, Asian Development Bank.
    7. Benjamin Heuclin & Frédéric Mortier & Catherine Trottier & Marie Denis, 2021. "Bayesian varying coefficient model with selection: An application to functional mapping," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(1), pages 24-50, January.
    8. Vicente Rios & Lisa Gianmoena, 2021. "On the link between temperature and regional COVID‐19 severity: Evidence from Italy," Regional Science Policy & Practice, Wiley Blackwell, vol. 13(S1), pages 109-137, November.
    9. Peter Congdon, 2016. "Assessing Impacts on Unplanned Hospitalisations of Care Quality and Access Using a Structural Equation Method: With a Case Study of Diabetes," IJERPH, MDPI, vol. 13(9), pages 1-19, September.
    10. Li, Hanning & Pati, Debdeep, 2017. "Variable selection using shrinkage priors," Computational Statistics & Data Analysis, Elsevier, vol. 107(C), pages 107-119.
    11. Heyard, Rachel & Held, Leonhard, 2019. "The quantile probability model," Computational Statistics & Data Analysis, Elsevier, vol. 132(C), pages 84-99.
    12. Kuo-Jung Lee & Yi-Chi Chen, 2018. "Of needles and haystacks: revisiting growth determinants by robust Bayesian variable selection," Empirical Economics, Springer, vol. 54(4), pages 1517-1547, June.
    13. Lee, Kuo-Jung & Feldkircher, Martin & Chen, Yi-Chi, 2021. "Variable selection in finite mixture of regression models with an unknown number of components," Computational Statistics & Data Analysis, Elsevier, vol. 158(C).
    14. Matteo Lanzafame, 2016. "Potential Growth in Asia and Its Determinants: An Empirical Investigation," Asian Development Review, MIT Press, vol. 33(2), pages 1-27, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Gilles Celeux & Mohammed El Anbari & Jean-Michel Marin & Christian P. Robert, 2010. "Regularization in Regression : Comparing Bayesian and Frequentist Methods in a Poorly Informative Situation," Working Papers 2010-43, Center for Research in Economics and Statistics.
    2. Korobilis, Dimitris, 2013. "Hierarchical shrinkage priors for dynamic regressions with many predictors," International Journal of Forecasting, Elsevier, vol. 29(1), pages 43-59.
    3. Posch, Konstantin & Arbeiter, Maximilian & Pilz, Juergen, 2020. "A novel Bayesian approach for variable selection in linear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    4. Li Ma, 2015. "Scalable Bayesian Model Averaging Through Local Information Propagation," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 110(510), pages 795-809, June.
    5. David Kaplan, 2021. "On the Quantification of Model Uncertainty: A Bayesian Perspective," Psychometrika, Springer;The Psychometric Society, vol. 86(1), pages 215-238, March.
    6. Kim, Hyun Hak & Swanson, Norman R., 2014. "Forecasting financial and macroeconomic variables using data reduction methods: New empirical evidence," Journal of Econometrics, Elsevier, vol. 178(P2), pages 352-367.
    7. Ruggieri, Eric & Lawrence, Charles E., 2012. "On efficient calculations for Bayesian variable selection," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1319-1332.
    8. Dimitris Korobilis & Kenichi Shimizu, 2022. "Bayesian Approaches to Shrinkage and Sparse Estimation," Foundations and Trends(R) in Econometrics, now publishers, vol. 11(4), pages 230-354, June.
    9. Nott, David J. & Leng, Chenlei, 2010. "Bayesian projection approaches to variable selection in generalized linear models," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 3227-3241, December.
    10. Latouche, Pierre & Mattei, Pierre-Alexandre & Bouveyron, Charles & Chiquet, Julien, 2016. "Combining a relaxed EM algorithm with Occam’s razor for Bayesian variable selection in high-dimensional regression," Journal of Multivariate Analysis, Elsevier, vol. 146(C), pages 177-190.
    11. Dimitris Korobilis, 2018. "Machine Learning Macroeconometrics: A Primer," Working Paper series 18-30, Rimini Centre for Economic Analysis.
    12. Elliott, Graham & Gargano, Antonio & Timmermann, Allan, 2013. "Complete subset regressions," Journal of Econometrics, Elsevier, vol. 177(2), pages 357-373.
    13. Sweata Sen & Damitri Kundu & Kiranmoy Das, 2023. "Variable selection for categorical response: a comparative study," Computational Statistics, Springer, vol. 38(2), pages 809-826, June.
    14. Matthias Pelster & Johannes Vilsmeier, 2018. "The determinants of CDS spreads: evidence from the model space," Review of Derivatives Research, Springer, vol. 21(1), pages 63-118, April.
    15. Baragatti, M. & Pommeret, D., 2012. "A study of variable selection using g-prior distribution with ridge parameter," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1920-1934.
    16. Narayan, Seema & Smyth, Russell, 2015. "The financial econometrics of price discovery and predictability," International Review of Financial Analysis, Elsevier, vol. 42(C), pages 380-393.
    17. Jiawen Luo & Qun Zhang, 2024. "Air pollution, weather factors, and realized volatility forecasts of agricultural commodity futures," Journal of Futures Markets, John Wiley & Sons, Ltd., vol. 44(2), pages 151-217, February.
    18. Luo, Ruiyan & Qi, Xin, 2015. "Sparse wavelet regression with multiple predictive curves," Journal of Multivariate Analysis, Elsevier, vol. 134(C), pages 33-49.
    19. Yen-Shiu Chin & Ting-Li Chen, 2016. "Minimizing variable selection criteria by Markov chain Monte Carlo," Computational Statistics, Springer, vol. 31(4), pages 1263-1286, December.
    20. Howard D. Bondell & Brian J. Reich, 2012. "Consistent High-Dimensional Bayesian Variable Selection via Penalized Credible Regions," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 107(500), pages 1610-1624, December.
