A Statistical Method for Association Analysis of Cell Type Compositions

Author

Listed:
  • Licai Huang

    (Fred Hutchinson Cancer Research Center; University of Texas MD Anderson Cancer Center)

  • Paul Little

    (Fred Hutchinson Cancer Research Center)

  • Jeroen R. Huyghe

    (Fred Hutchinson Cancer Research Center)

  • Qian Shi

    (Mayo Clinic)

  • Tabitha A. Harrison

    (Fred Hutchinson Cancer Research Center)

  • Greg Yothers

    (University of Pittsburgh)

  • Thomas J. George

    (University of Florida Health Cancer Center)

  • Ulrike Peters

    (Fred Hutchinson Cancer Research Center)

  • Andrew T. Chan

    (Massachusetts General Hospital and Harvard Medical School)

  • Polly A. Newcomb

    (Fred Hutchinson Cancer Research Center)

  • Wei Sun

    (Fred Hutchinson Cancer Research Center)

Abstract

Gene expression data are often collected from tissue samples that are composed of multiple cell types. Studies of cell type composition based on gene expression data from tissue samples have recently attracted increasing research interest and led to the development of new methods for estimating cell type composition. This new information on cell type composition can be associated with individual characteristics (e.g., genetic variants) or clinical outcomes (e.g., survival time). Such association analysis can be conducted for each cell type separately, followed by multiple testing correction. An alternative approach is to evaluate this association using the composition of all the cell types, thus aggregating association signals across cell types. A key challenge of this approach is to account for the dependence across cell types. We propose a new method to quantify the distances between cell types while accounting for their dependencies, and use this information for association analysis. We demonstrate our method in two applied examples, assessing the association of immune cell type composition in tumor samples from colorectal cancer patients with survival time and with SNP genotypes. We found that immune cell composition has prognostic value, and that our distance metric leads to more accurate survival time prediction than other distance metrics that ignore cell type dependencies. In addition, survival time-associated SNPs are enriched among the SNPs associated with immune cell composition.
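
The baseline strategy mentioned in the abstract, testing each cell type separately and then correcting for multiple testing, can be sketched as follows. This is a minimal illustration and not the authors' method: the cell type names, the simulated data, the permutation test on the Pearson correlation, and the Bonferroni correction are all assumptions chosen for brevity, and the helper names (`simulate`, `per_cell_type_tests`, and so on) are hypothetical.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def permutation_pvalue(x, y, n_perm=2000, rng=None):
    """Two-sided permutation p-value for the correlation between x and y."""
    rng = rng or random.Random(0)
    observed = abs(pearson(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(pearson(x, y_perm)) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


def bonferroni(pvalues):
    """Bonferroni-adjusted p-values, capped at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def per_cell_type_tests(compositions, trait, n_perm=2000, seed=0):
    """Test each cell type separately; return {cell_type: (raw_p, adjusted_p)}."""
    rng = random.Random(seed)
    names = list(compositions)
    raw = [permutation_pvalue(compositions[n], trait, n_perm, rng) for n in names]
    adjusted = bonferroni(raw)
    return {n: (p, q) for n, p, q in zip(names, raw, adjusted)}


def simulate(n_samples=60, seed=1):
    """Simulated proportions for three hypothetical cell types (rows sum to 1)
    and a trait that depends only on the first cell type."""
    rng = random.Random(seed)
    comps = {"T_cell": [], "B_cell": [], "Macrophage": []}
    trait = []
    for _ in range(n_samples):
        weights = [rng.gammavariate(2.0, 1.0) for _ in range(3)]
        total = sum(weights)
        props = [w / total for w in weights]
        for name, p in zip(comps, props):
            comps[name].append(p)
        trait.append(5.0 * props[0] + rng.gauss(0.0, 0.3))
    return comps, trait


if __name__ == "__main__":
    comps, trait = simulate()
    for name, (raw_p, adj_p) in per_cell_type_tests(comps, trait).items():
        print(f"{name}: raw p = {raw_p:.4f}, Bonferroni p = {adj_p:.4f}")
```

Because the proportions in each sample sum to one, the cell types in this simulation are negatively correlated with one another, so cell types that do not drive the trait may still appear associated with it. This is one simple instance of the dependence across cell types that the abstract identifies as the key challenge for methods that aggregate signals over the whole composition.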

Suggested Citation

  • Licai Huang & Paul Little & Jeroen R. Huyghe & Qian Shi & Tabitha A. Harrison & Greg Yothers & Thomas J. George & Ulrike Peters & Andrew T. Chan & Polly A. Newcomb & Wei Sun, 2021. "A Statistical Method for Association Analysis of Cell Type Compositions," Statistics in Biosciences, Springer; International Chinese Statistical Association, vol. 13(3), pages 373-385, December.
  • Handle: RePEc:spr:stabio:v:13:y:2021:i:3:d:10.1007_s12561-020-09293-0
    DOI: 10.1007/s12561-020-09293-0