
ordinalbayes: Fitting Ordinal Bayesian Regression Models to High-Dimensional Data Using R

Author

Listed:
  • Kellie J. Archer

    (Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, OH 43210, USA)

  • Anna Eames Seffernick

    (Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, OH 43210, USA)

  • Shuai Sun

    (Division of Biostatistics, College of Public Health, The Ohio State University, Columbus, OH 43210, USA)

  • Yiran Zhang

    (Amgen Inc., 1 Amgen Center Dr, Thousand Oaks, CA 91320, USA)

Abstract

The stage of cancer is a discrete ordinal response that indicates the aggressiveness of disease and is often used by physicians to determine the type and intensity of treatment to be administered. For example, the FIGO stage in cervical cancer is based on the size and depth of the tumor as well as the level of spread. It may be of clinical relevance to identify molecular features from high-throughput genomic assays that are associated with the stage of cervical cancer to elucidate pathways related to tumor aggressiveness, identify improved molecular features that may be useful for staging, and identify therapeutic targets. High-throughput RNA-Seq data and corresponding clinical data (including stage) for cervical cancer patients have been made available through The Cancer Genome Atlas Project (TCGA). We recently described penalized Bayesian ordinal response models that can be used for variable selection for over-parameterized datasets, such as the TCGA-CESC dataset. Herein, we describe our ordinalbayes R package, available from the Comprehensive R Archive Network (CRAN), which enhances the runjags R package by enabling users to easily fit cumulative logit models when the outcome is ordinal and the number of predictors exceeds the sample size, P > N, such as for TCGA and other high-throughput genomic data. We demonstrate the use of this package by applying it to the TCGA cervical cancer dataset. Our ordinalbayes package can be used to fit models to high-dimensional datasets, and it effectively performs variable selection.
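
For readers unfamiliar with the cumulative logit model named in the abstract, the following is a minimal illustrative sketch of how such a model turns a linear predictor into ordinal class probabilities. It is written in plain Python rather than R and does not use or reproduce the ordinalbayes API. The function names, the example coefficients, and the sign convention logit P(Y <= k) = alpha_k - x'beta are all assumptions chosen for illustration, and the package's own parameterization may differ.

```python
import math


def sigmoid(z):
    """Logistic function, the inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-z))


def cumulative_logit_probs(x, beta, alpha):
    """Class probabilities under a cumulative logit model.

    Assumed convention (hypothetical, for illustration only):
        logit P(Y <= k | x) = alpha[k] - sum_j x[j] * beta[j]
    with increasing thresholds alpha[0] < alpha[1] < ...
    """
    eta = sum(xj * bj for xj, bj in zip(x, beta))
    # Cumulative probabilities for the first K-1 classes; the last is 1.
    cum = [sigmoid(a - eta) for a in alpha] + [1.0]
    # Successive differences give the per-class probabilities.
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]


# Made-up example: three predictors, four ordered classes (e.g. stages).
# A coefficient of exactly zero mimics a feature dropped by variable selection.
probs = cumulative_logit_probs(
    x=[0.5, -1.2, 0.0],
    beta=[0.8, 0.0, 0.3],
    alpha=[-1.0, 0.5, 2.0],
)
print(probs)
```

In this sketch a predictor whose coefficient is zero has no effect on the class probabilities, which is the sense in which a penalized prior that shrinks most coefficients toward zero acts as variable selection when P > N.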

Suggested Citation

  • Kellie J. Archer & Anna Eames Seffernick & Shuai Sun & Yiran Zhang, 2022. "ordinalbayes: Fitting Ordinal Bayesian Regression Models to High-Dimensional Data Using R," Stats, MDPI, vol. 5(2), pages 1-14, April.
  • Handle: RePEc:gam:jstats:v:5:y:2022:i:2:p:21-384:d:794755

    Download full text from publisher

    File URL: https://www.mdpi.com/2571-905X/5/2/21/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2571-905X/5/2/21/
    Download Restriction: no

    References listed on IDEAS

    1. Chris Hans, 2009. "Bayesian lasso regression," Biometrika, Biometrika Trust, vol. 96(4), pages 835-845.
    2. Denwood, Matthew J., 2016. "runjags: An R Package Providing Interface Utilities, Model Templates, Parallel Computing Methods and Additional Distributions for MCMC Models in JAGS," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 71(i09).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Merkle, Edgar C. & Steyvers, Mark & Mellers, Barbara & Tetlock, Philip E., 2017. "A neglected dimension of good forecasting judgment: The questions we choose also matter," International Journal of Forecasting, Elsevier, vol. 33(4), pages 817-832.
    2. Bai, Jushan & Ando, Tomohiro, 2013. "Multifactor asset pricing with a large number of observable risk factors and unobservable common and group-specific factors," MPRA Paper 52785, University Library of Munich, Germany, revised Dec 2013.
    3. Johnson, Fred A. & Zimmerman, Guthrie S. & Jensen, Gitte H. & Clausen, Kevin K. & Frederiksen, Morten & Madsen, Jesper, 2020. "Using integrated population models for insights into monitoring programs: An application using pink-footed geese," Ecological Modelling, Elsevier, vol. 415(C).
    4. Dexen DZ. Xi & C.B. Dean & Stephen W. Taylor, 2020. "Modeling the duration and size of extended attack wildfires as dependent outcomes," Environmetrics, John Wiley & Sons, Ltd., vol. 31(5), August.
    5. Ji, Yonggang & Lin, Nan & Zhang, Baoxue, 2012. "Model selection in binary and tobit quantile regression using the Gibbs sampler," Computational Statistics & Data Analysis, Elsevier, vol. 56(4), pages 827-839.
    6. Enwei Zhu & Stanislav Sobolevsky, 2018. "House Price Modeling with Digital Census," Papers 1809.03834, arXiv.org.
    7. Ng'ombe, John, 2019. "Economics of the Greenseeder Hand Planter, Discrete Choice Modeling, and On-Farm Field Experimentation," Thesis Commons jckt7, Center for Open Science.
    8. Laura Melissa Guzman & Elizabeth Elle & Lora A. Morandin & Neil S. Cobb & Paige R. Chesshire & Lindsie M. McCabe & Alice Hughes & Michael Orr & Leithen K. M’Gonigle, 2024. "Impact of pesticide use on wild bee distributions across the United States," Nature Sustainability, Nature, vol. 7(10), pages 1324-1334, October.
    9. Aghabazaz, Zeynab & Kazemi, Iraj, 2023. "Under-reported time-varying MINAR(1) process for modeling multivariate count series," Computational Statistics & Data Analysis, Elsevier, vol. 188(C).
    10. Guangbao Guo & Guoqi Qian & Lu Lin & Wei Shao, 2021. "Parallel inference for big data with the group Bayesian method," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 84(2), pages 225-243, February.
    11. Alkhaleel, Basem A., 2024. "Machine learning applications in the resilience of interdependent critical infrastructure systems—A systematic literature review," International Journal of Critical Infrastructure Protection, Elsevier, vol. 44(C).
    12. Ricardo P. Masini & Marcelo C. Medeiros & Eduardo F. Mendes, 2023. "Machine learning advances for time series forecasting," Journal of Economic Surveys, Wiley Blackwell, vol. 37(1), pages 76-111, February.
    13. Badri Padhukasahasram & Chandan K Reddy & Yan Li & David E Lanfear, 2015. "Joint Impact of Clinical and Behavioral Variables on the Risk of Unplanned Readmission and Death after a Heart Failure Hospitalization," PLOS ONE, Public Library of Science, vol. 10(6), pages 1-11, June.
    14. Mahdiyeh, Zahra & Kazemi, Iraj, 2019. "An innovative strategy on the construction of multivariate multimodal linear mixed-effects models," Journal of Multivariate Analysis, Elsevier, vol. 174(C).
    15. Philip Kostov & Thankom Arun & Samuel Annim, 2014. "Financial Services to the Unbanked: the case of the Mzansi intervention in South Africa," Contemporary Economics, University of Economics and Human Sciences in Warsaw., vol. 8(2), June.
    16. Fangpo Wang & Anirban Bhattacharya & Alan E. Gelfand, 2018. "Process modeling for slope and aspect with application to elevation data maps," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 27(4), pages 749-772, December.
    17. Ruggieri, Eric & Lawrence, Charles E., 2012. "On efficient calculations for Bayesian variable selection," Computational Statistics & Data Analysis, Elsevier, vol. 56(6), pages 1319-1332.
    18. Adam N. Smith & Jim E. Griffin, 2023. "Shrinkage priors for high-dimensional demand estimation," Quantitative Marketing and Economics (QME), Springer, vol. 21(1), pages 95-146, March.
    19. Larissa S. Melo & Veber A. F. Costa & Wilson S. Fernandes, 2023. "Assessing the Anthropogenic and Climatic Components in Runoff Changes of the São Francisco River Catchment," Water Resources Management: An International Journal, Published for the European Water Resources Association (EWRA), Springer;European Water Resources Association (EWRA), vol. 37(9), pages 3615-3629, July.
    20. Bernardi, Mauro & Bottone, Marco & Petrella, Lea, 2018. "Bayesian quantile regression using the skew exponential power distribution," Computational Statistics & Data Analysis, Elsevier, vol. 126(C), pages 92-111.
