Printed from https://ideas.repec.org/a/bla/biomet/v77y2021i4p1431-1444.html

A Bayesian approach to restricted latent class models for scientifically structured clustering of multivariate binary outcomes

Author

Listed:
  • Zhenke Wu
  • Livia Casciola‐Rosen
  • Antony Rosen
  • Scott L. Zeger

Abstract

This paper presents a model‐based method for clustering multivariate binary observations that incorporates constraints consistent with the scientific context. The approach is motivated by the precision medicine problem of identifying autoimmune disease patient subsets or classes who may require different treatments. We start with a family of restricted latent class models or RLCMs. However, in the motivating example and many others like it, the unknown number of classes and the definition of classes using binary states are among the targets of inference. We use a Bayesian approach to RLCMs in order to use informative prior assumptions on the number and definitions of latent classes to be consistent with scientific knowledge so that the posterior distribution tends to concentrate on smaller numbers of clusters and sparser binary patterns. The paper derives a posterior sampling algorithm based on Markov chain Monte Carlo with split‐merge updates to efficiently explore the space of clustering allocations. Through simulations under the assumed model and realistic deviations from it, we demonstrate greater interpretability of results and superior finite‐sample clustering performance for our method compared to common alternatives. The methods are illustrated with an analysis of protein data to detect clusters representing autoantibody classes among scleroderma patients.
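To make the data structure behind a restricted latent class model concrete, here is a minimal simulation sketch in Python. It is not the authors' model or sampler: the function name `simulate_rlcm`, the "any measured state is active" activation rule, and the item-level true and false positive rates `tpr` and `fpr` are illustrative assumptions. It only shows how classes defined by binary state patterns can generate multivariate binary outcomes.

```python
import random


def simulate_rlcm(n, class_probs, class_states, q_matrix, tpr, fpr, seed=0):
    """Draw n binary response vectors from a toy restricted latent class model.

    Assumptions (illustrative, not taken from the paper):
      - class_states[k] is the binary state pattern that defines class k;
      - q_matrix[j][m] == 1 means item j measures latent state m;
      - item j is "active" if the subject has any state that item measures;
      - an active item is positive with probability tpr[j], else fpr[j].
    """
    rng = random.Random(seed)
    n_items = len(q_matrix)
    labels, responses = [], []
    for _ in range(n):
        # Draw the latent class membership for this subject.
        k = rng.choices(range(len(class_probs)), weights=class_probs)[0]
        eta = class_states[k]
        row = []
        for j in range(n_items):
            # The item responds at its true positive rate only when active.
            active = any(q and e for q, e in zip(q_matrix[j], eta))
            p = tpr[j] if active else fpr[j]
            row.append(1 if rng.random() < p else 0)
        labels.append(k)
        responses.append(row)
    return labels, responses


# Hypothetical example: three classes built from two binary states, four items.
class_states = [(0, 0), (1, 0), (1, 1)]
q_matrix = [(1, 0), (1, 0), (0, 1), (0, 1)]
labels, y = simulate_rlcm(
    200, [0.5, 0.3, 0.2], class_states, q_matrix,
    tpr=[0.9] * 4, fpr=[0.1] * 4, seed=1,
)
```

In the setting the abstract describes, the number of classes and their state patterns are unknown and are inferred from `y`; in this sketch they are fixed inputs purely for illustration.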

Suggested Citation

  • Zhenke Wu & Livia Casciola‐Rosen & Antony Rosen & Scott L. Zeger, 2021. "A Bayesian approach to restricted latent class models for scientifically structured clustering of multivariate binary outcomes," Biometrics, The International Biometric Society, vol. 77(4), pages 1431-1444, December.
  • Handle: RePEc:bla:biomet:v:77:y:2021:i:4:p:1431-1444
    DOI: 10.1111/biom.13388

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13388
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Guanhua Fang & Jingchen Liu & Zhiliang Ying, 2019. "On the Identifiability of Diagnostic Classification Models," Psychometrika, Springer;The Psychometric Society, vol. 84(1), pages 19-40, March.
    2. Chenchen Ma & Jing Ouyang & Gongjun Xu, 2023. "Learning Latent and Hierarchical Structures in Cognitive Diagnosis Models," Psychometrika, Springer;The Psychometric Society, vol. 88(1), pages 175-207, March.
    3. Steven Andrew Culpepper, 2023. "A Note on Weaker Conditions for Identifying Restricted Latent Class Models for Binary Responses," Psychometrika, Springer;The Psychometric Society, vol. 88(1), pages 158-174, March.
    4. Chun Wang & Jing Lu, 2021. "Learning Attribute Hierarchies From Data: Two Exploratory Approaches," Journal of Educational and Behavioral Statistics, vol. 46(1), pages 58-84, February.
    5. Motonori Oka & Kensuke Okada, 2023. "Scalable Bayesian Approach for the DINA Q-Matrix Estimation Combining Stochastic Optimization and Variational Inference," Psychometrika, Springer;The Psychometric Society, vol. 88(1), pages 302-331, March.
    6. Yuqi Gu, 2023. "Generic Identifiability of the DINA Model and Blessing of Latent Dependence," Psychometrika, Springer;The Psychometric Society, vol. 88(1), pages 117-131, March.
    7. Chengcheng Li & Chenchen Ma & Gongjun Xu, 2022. "Learning Large Q-Matrix by Restricted Boltzmann Machines," Psychometrika, Springer;The Psychometric Society, vol. 87(3), pages 1010-1041, September.
    8. James Joseph Balamuta & Steven Andrew Culpepper, 2022. "Exploratory Restricted Latent Class Models with Monotonicity Requirements under Pólya–Gamma Data Augmentation," Psychometrika, Springer;The Psychometric Society, vol. 87(3), pages 903-945, September.
    9. Xin Xu & Guanhua Fang & Jinxin Guo & Zhiliang Ying & Susu Zhang, 2024. "Diagnostic Classification Models for Testlets: Methods and Theory," Psychometrika, Springer;The Psychometric Society, vol. 89(3), pages 851-876, September.
    10. Peida Zhan & Wen-Chung Wang & Xiaomin Li, 2020. "A Partial Mastery, Higher-Order Latent Structural Model for Polytomous Attributes in Cognitive Diagnostic Assessments," Journal of Classification, Springer;The Classification Society, vol. 37(2), pages 328-351, July.
    11. Yinghan Chen & Ying Liu & Steven Andrew Culpepper & Yuguo Chen, 2021. "Inferring the Number of Attributes for the Exploratory DINA Model," Psychometrika, Springer;The Psychometric Society, vol. 86(1), pages 30-64, March.
    12. Hans Friedrich Köhn & Chia-Yi Chiu, 2021. "A Unified Theory of the Completeness of Q-Matrices for the DINA Model," Journal of Classification, Springer;The Classification Society, vol. 38(3), pages 500-518, October.
    13. Jing Ouyang & Gongjun Xu, 2022. "Identifiability of Latent Class Models with Covariates," Psychometrika, Springer;The Psychometric Society, vol. 87(4), pages 1343-1360, December.
    14. Yinghan Chen & Steven Andrew Culpepper & Yuguo Chen, 2023. "Bayesian Inference for an Unknown Number of Attributes in Restricted Latent Class Models," Psychometrika, Springer;The Psychometric Society, vol. 88(2), pages 613-635, June.
    15. Chia-Yi Chiu & Hans Friedrich Köhn & Wenchao Ma, 2023. "Commentary on “Extending the Basic Local Independence Model to Polytomous Data” by Stefanutti, de Chiusole, Anselmi, and Spoto," Psychometrika, Springer;The Psychometric Society, vol. 88(2), pages 656-671, June.
    16. Chenchen Ma & Jimmy de la Torre & Gongjun Xu, 2023. "Bridging Parametric and Nonparametric Methods in Cognitive Diagnosis," Psychometrika, Springer;The Psychometric Society, vol. 88(1), pages 51-75, March.
    17. Yuqi Gu & Jingchen Liu & Gongjun Xu & Zhiliang Ying, 2018. "Hypothesis Testing of the Q-matrix," Psychometrika, Springer;The Psychometric Society, vol. 83(3), pages 515-537, September.
    18. David Arthur & Hua-Hua Chang, 2024. "DINA-BAG: A Bagging Algorithm for DINA Model Parameter Estimation in Small Samples," Journal of Educational and Behavioral Statistics, vol. 49(3), pages 342-367, June.
    19. Chia-Yi Chiu & Yan Sun & Yanhong Bian, 2018. "Cognitive Diagnosis for Small Educational Programs: The General Nonparametric Classification Method," Psychometrika, Springer;The Psychometric Society, vol. 83(2), pages 355-375, June.
    20. Kazuhiro Yamaguchi, 2023. "Bayesian Analysis Methods for Two-Level Diagnosis Classification Models," Journal of Educational and Behavioral Statistics, vol. 48(6), pages 773-809, December.
