Bayesian variable selection with sparse and correlation priors for high-dimensional data analysis

Author

Listed:
  • Aijun Yang

    (Nanjing Forestry University; Southeast University)

  • Xuejun Jiang

    (South University of Science and Technology of China)

  • Lianjie Shu

    (University of Macau)

  • Jinguan Lin

    (Southeast University)

Abstract

The main challenge in working with gene expression microarrays is that the sample size is small relative to the large number of variables (genes). In many studies, the main focus is on finding a small subset of genes that are the most important for differentiating between types of cancer, which allows simpler and cheaper diagnostic arrays. In this paper, a sparse Bayesian variable selection method in a probit model is proposed for gene selection and classification. We assign a sparse prior to the regression parameters and perform variable selection by indexing the covariates of the model with a binary vector. The correlation prior assigned to the binary vector in this paper is able to distinguish between models of the same size. The performance of the proposed method is demonstrated on one simulated data set and two well-known real data sets, and the results show that our method is comparable with other existing methods in variable selection and classification.
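
The two ideas in the abstract, indexing covariates with a binary vector and using a prior that separates models of equal size, can be sketched in a few lines of Python. This is only an illustration and not the authors' implementation: the abstract does not state the form of the correlation prior, so the determinant-based form used here (log prior proportional to the log determinant of the correlation matrix of the selected covariates, with a hypothetical exponent `d`) is an assumption, as are all function and variable names.

```python
import math
import numpy as np


def probit_prob(X, gamma, beta):
    """P(y_i = 1) = Phi(x_i[gamma]' beta[gamma]).

    gamma is a 0/1 vector; covariates with gamma_j = 0 are dropped.
    """
    active = np.flatnonzero(gamma)
    eta = X[:, active] @ beta[active]
    # Standard normal CDF written with math.erf to avoid extra dependencies.
    return np.array([0.5 * (1.0 + math.erf(e / math.sqrt(2.0))) for e in eta])


def log_correlation_prior(X, gamma, d=1.0):
    """ASSUMED form: log p(gamma) = d * log det(R_gamma) + const,

    where R_gamma is the sample correlation matrix of the selected columns.
    Strongly correlated selections get a smaller determinant, hence a
    lower prior weight.
    """
    active = np.flatnonzero(gamma)
    if active.size <= 1:
        return 0.0  # a 0x0 or 1x1 correlation matrix has determinant 1
    R = np.corrcoef(X[:, active], rowvar=False)
    _, logdet = np.linalg.slogdet(R)
    return d * logdet


# Toy data: columns 0 and 1 are near-duplicates, column 2 is independent.
rng = np.random.default_rng(0)
n = 50
z = rng.standard_normal(n)
X = np.column_stack([z, z + 0.1 * rng.standard_normal(n), rng.standard_normal(n)])

gamma_redundant = np.array([1, 1, 0])  # two highly correlated covariates
gamma_diverse = np.array([1, 0, 1])    # two weakly correlated covariates

lp_redundant = log_correlation_prior(X, gamma_redundant)
lp_diverse = log_correlation_prior(X, gamma_diverse)

beta = np.array([1.0, 0.0, -0.5])
p_diverse = probit_prob(X, gamma_diverse, beta)
p_empty = probit_prob(X, np.array([0, 0, 0]), beta)
```

Both candidate models select two covariates, so a prior that depends only on model size would weight them equally. Under the assumed determinant-based prior, `lp_redundant` is lower than `lp_diverse`, which is one way a prior can tell apart models of the same size.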

Suggested Citation

  • Aijun Yang & Xuejun Jiang & Lianjie Shu & Jinguan Lin, 2017. "Bayesian variable selection with sparse and correlation priors for high-dimensional data analysis," Computational Statistics, Springer, vol. 32(1), pages 127-143, March.
  • Handle: RePEc:spr:compst:v:32:y:2017:i:1:d:10.1007_s00180-016-0665-3
    DOI: 10.1007/s00180-016-0665-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00180-016-0665-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Bernardi, Mauro & Costola, Michele, 2019. "High-dimensional sparse financial networks through a regularised regression model," SAFE Working Paper Series 244, Leibniz Institute for Financial Research SAFE.
    2. Lee, Kuo-Jung & Chen, Ray-Bing & Wu, Ying Nian, 2016. "Bayesian variable selection for finite mixture model of linear regressions," Computational Statistics & Data Analysis, Elsevier, vol. 95(C), pages 1-16.
    3. Yang Aijun & Xiang Ju & Yang Hongqiang & Lin Jinguan, 2018. "Sparse Bayesian Variable Selection in Probit Model for Forecasting U.S. Recessions Using a Large Set of Predictors," Computational Economics, Springer;Society for Computational Economics, vol. 51(4), pages 1123-1138, April.
    4. Posch, Konstantin & Arbeiter, Maximilian & Pilz, Juergen, 2020. "A novel Bayesian approach for variable selection in linear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 144(C).
    5. Aijun Yang & Yuzhu Tian & Yunxian Li & Jinguan Lin, 2020. "Sparse Bayesian variable selection in kernel probit model for analyzing high-dimensional data," Computational Statistics, Springer, vol. 35(1), pages 245-258, March.
    6. Aijun Yang & Ju Xiang & Lianjie Shu & Hongqiang Yang, 2018. "Sparse Bayesian Variable Selection with Correlation Prior for Forecasting Macroeconomic Variable using Highly Correlated Predictors," Computational Economics, Springer;Society for Computational Economics, vol. 51(2), pages 323-338, February.
    7. Baragatti, M. & Pommeret, D., 2012. "A study of variable selection using g-prior distribution with ridge parameter," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1920-1934.
    8. Aijun Yang & Yunxian Li & Niansheng Tang & Jinguan Lin, 2015. "Bayesian variable selection in multinomial probit model for classifying high-dimensional data," Computational Statistics, Springer, vol. 30(2), pages 399-418, June.
    9. Min Wang & Xiaoqian Sun & Tao Lu, 2015. "Bayesian structured variable selection in linear regression models," Computational Statistics, Springer, vol. 30(1), pages 205-229, March.
    10. Alberto Cassese & Michele Guindani & Philipp Antczak & Francesco Falciani & Marina Vannucci, 2015. "A Bayesian model for the identification of differentially expressed genes in Daphnia magna exposed to munition pollutants," Biometrics, The International Biometric Society, vol. 71(3), pages 803-811, September.
    11. Zhao, Kaifeng & Lian, Heng, 2016. "The Expectation–Maximization approach for Bayesian quantile regression," Computational Statistics & Data Analysis, Elsevier, vol. 96(C), pages 1-11.
    12. Dimitris Korobilis & Kenichi Shimizu, 2022. "Bayesian Approaches to Shrinkage and Sparse Estimation," Foundations and Trends(R) in Econometrics, now publishers, vol. 11(4), pages 230-354, June.
    13. Wang, Xiaoqing & Feng, Xiangnan & Song, Xinyuan, 2020. "Joint analysis of semicontinuous data with latent variables," Computational Statistics & Data Analysis, Elsevier, vol. 151(C).
    14. Nott, David J. & Leng, Chenlei, 2010. "Bayesian projection approaches to variable selection in generalized linear models," Computational Statistics & Data Analysis, Elsevier, vol. 54(12), pages 3227-3241, December.
    15. Stefan Lang & Nikolaus Umlauf & Peter Wechselberger & Kenneth Harttgen & Thomas Kneib, 2012. "Multilevel structured additive regression," Working Papers 2012-07, Faculty of Economics and Statistics, Universität Innsbruck.
    16. Fabian Scheipl & Thomas Kneib & Ludwig Fahrmeir, 2013. "Penalized likelihood and Bayesian function selection in regression models," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 97(4), pages 349-385, October.
    17. Chakraborty, Sounak & Guo, Ruixin, 2011. "A Bayesian hybrid Huberized support vector machine and its applications in high-dimensional medical data," Computational Statistics & Data Analysis, Elsevier, vol. 55(3), pages 1342-1356, March.
    18. Diego Vidaurre & Concha Bielza & Pedro Larrañaga, 2013. "A Survey of L1 Regression," International Statistical Review, International Statistical Institute, vol. 81(3), pages 361-387, December.
    19. R. Alhamzawi & K. Yu & D. F. Benoit, 2011. "Bayesian adaptive Lasso quantile regression," Working Papers of Faculty of Economics and Business Administration, Ghent University, Belgium 11/728, Ghent University, Faculty of Economics and Business Administration.
    20. Chenlei Leng & Minh-Ngoc Tran & David Nott, 2014. "Bayesian adaptive Lasso," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 66(2), pages 221-244, April.
