IDEAS home Printed from https://ideas.repec.org/a/taf/jnlasa/v108y2013i503p775-788.html

A Nonparametric Bayesian Model for Local Clustering With Application to Proteomics

Author

Listed:
  • Juhee Lee
  • Peter Müller
  • Yitan Zhu
  • Yuan Ji

Abstract

We propose a nonparametric Bayesian local clustering (NoB-LoC) approach for heterogeneous data. NoB-LoC implements inference for nested clusters as posterior inference under a Bayesian model. Using protein expression data as an example, the NoB-LoC model defines a protein (column) cluster as a set of proteins that give rise to the same partition of the samples (rows). In other words, the sample partitions are nested within protein clusters. The common clustering of the samples gives meaning to the protein clusters. Any pair of samples might belong to the same cluster for one protein set but to different clusters for another protein set. These local features are different from features obtained by global clustering approaches such as hierarchical clustering, which create only one partition of samples that applies for all the proteins in the dataset. In addition, the NoB-LoC model is different from most other local or nested clustering methods, which define clusters based on common parameters in the sampling model. As an added and important feature, the NoB-LoC method probabilistically excludes sets of irrelevant proteins and samples that do not meaningfully cocluster with other proteins and samples, thus improving the inference on the clustering of the remaining proteins and samples. Inference is guided by a joint probability model for all the random elements. We provide a simulation study and a motivating example to demonstrate the unique features of the NoB-LoC model. Supplementary materials for this article are available online.
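
The nested structure described in the abstract can be pictured with a small toy example. The sketch below is only an illustration of the bookkeeping, not the authors' model or its posterior inference. The labels, the data sizes, the function name `co_clustered`, and the convention that label 0 marks an excluded ("inactive") protein or sample are all assumptions made for this example.

```python
from typing import Dict, List

# Hypothetical toy configuration: 6 proteins (columns) and 5 samples (rows).
# protein_cluster[g] is the cluster label of protein g.
# Assumed convention: label 0 marks a protein excluded from all clusters.
protein_cluster: List[int] = [1, 1, 2, 2, 2, 0]

# sample_cluster[s][i] is the cluster of sample i within protein cluster s.
# Each protein cluster carries its own partition of the samples.
# Assumed convention: label 0 marks a sample excluded for that protein set.
sample_cluster: Dict[int, List[int]] = {
    1: [1, 1, 2, 2, 0],
    2: [1, 2, 2, 1, 1],
}


def co_clustered(i: int, j: int, g: int) -> bool:
    """Return True if samples i and j share a cluster with respect to protein g."""
    s = protein_cluster[g]
    if s == 0:
        # An excluded protein induces no partition of the samples.
        return False
    ci, cj = sample_cluster[s][i], sample_cluster[s][j]
    # Excluded samples never co-cluster with anything.
    return ci != 0 and ci == cj


# Samples 0 and 1 cluster together under protein 0 (protein set 1)
# but fall into different clusters under protein 2 (protein set 2).
print(co_clustered(0, 1, 0), co_clustered(0, 1, 2))
```

In this toy setting the same pair of samples is co-clustered for one protein set and separated for another, which is the "local" feature that a single global partition cannot express.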

Suggested Citation

  • Juhee Lee & Peter Müller & Yitan Zhu & Yuan Ji, 2013. "A Nonparametric Bayesian Model for Local Clustering With Application to Proteomics," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 108(503), pages 775-788, September.
  • Handle: RePEc:taf:jnlasa:v:108:y:2013:i:503:p:775-788
    DOI: 10.1080/01621459.2013.784705

    Download full text from publisher

    File URL: http://hdl.handle.net/10.1080/01621459.2013.784705
    Download Restriction: Access to full text is restricted to subscribers.


    Citations



    Cited by:

    1. Daiane Aparecida Zuanetti & Peter Müller & Yitan Zhu & Shengjie Yang & Yuan Ji, 2018. "Clustering distributions with the marginalized nested Dirichlet process," Biometrics, The International Biometric Society, vol. 74(2), pages 584-594, June.
    2. Peter Müller & Fernando A. Quintana & Garritt Page, 2018. "Nonparametric Bayesian inference in applications," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 27(2), pages 175-206, June.
    3. Subharup Guha & Rex Jung & David Dunson, 2022. "Predicting phenotypes from brain connection structure," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(3), pages 639-668, June.

