Printed from https://ideas.repec.org/a/nat/natcom/v15y2024i1d10.1038_s41467-024-50612-6.html

MOCHA’s advanced statistical modeling of scATAC-seq data enables functional genomic inference in large human cohorts

Authors

  • Samir Rachid Zaim (Allen Institute for Immunology)
  • Mark-Phillip Pebworth (Allen Institute for Immunology)
  • Imran McGrath (Allen Institute for Immunology)
  • Lauren Okada (Allen Institute for Immunology)
  • Morgan Weiss (Allen Institute for Immunology)
  • Julian Reading (Allen Institute for Immunology)
  • Julie L. Czartoski (Fred Hutchinson Cancer Research Center)
  • Troy R. Torgerson (Allen Institute for Immunology)
  • M. Juliana McElrath (Fred Hutchinson Cancer Research Center)
  • Thomas F. Bumol (Allen Institute for Immunology)
  • Peter J. Skene (Allen Institute for Immunology)
  • Xiao-jun Li (Allen Institute for Immunology)

Abstract

Single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) is being increasingly used to study gene regulation. However, major analytical gaps limit its utility in studying gene regulatory programs in complex diseases. In response, MOCHA (Model-based single cell Open CHromatin Analysis) presents major advances over existing analysis tools, including: 1) improving identification of sample-specific open chromatin, 2) statistical modeling of technical drop-out with zero-inflated methods, 3) mitigation of false positives in single cell analysis, 4) identification of alternative transcription-starting-site regulation, and 5) modules for inferring temporal gene regulatory networks from longitudinal data. These advances, in addition to open chromatin analyses, provide a robust framework after quality control and cell labeling to study gene regulatory programs in human disease. We benchmark MOCHA with four state-of-the-art tools to demonstrate its advances. We also construct cross-sectional and longitudinal gene regulatory networks, identifying potential mechanisms of COVID-19 response. MOCHA provides researchers with a robust analytical tool for functional genomic inference from scATAC-seq data.

Suggested Citation

  • Samir Rachid Zaim & Mark-Phillip Pebworth & Imran McGrath & Lauren Okada & Morgan Weiss & Julian Reading & Julie L. Czartoski & Troy R. Torgerson & M. Juliana McElrath & Thomas F. Bumol & Peter J. Skene & Xiao-jun Li, 2024. "MOCHA’s advanced statistical modeling of scATAC-seq data enables functional genomic inference in large human cohorts," Nature Communications, Nature, vol. 15(1), pages 1-24, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-50612-6
    DOI: 10.1038/s41467-024-50612-6

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-50612-6
    File Function: Abstract
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Songming Tang & Xuejian Cui & Rongxiang Wang & Sijie Li & Siyu Li & Xin Huang & Shengquan Chen, 2024. "scCASE: accurate and interpretable enhancement for single-cell chromatin accessibility sequencing data," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    2. Zhijian Li & Christoph Kuppe & Susanne Ziegler & Mingbo Cheng & Nazanin Kabgani & Sylvia Menzel & Martin Zenke & Rafael Kramann & Ivan G. Costa, 2021. "Chromatin-accessibility estimation from single-cell ATAC-seq data with scOpen," Nature Communications, Nature, vol. 12(1), pages 1-14, December.
    3. Suhas V. Vasaikar & Adam K. Savage & Qiuyu Gong & Elliott Swanson & Aarthi Talla & Cara Lord & Alexander T. Heubeck & Julian Reading & Lucas T. Graybuck & Paul Meijer & Troy R. Torgerson & Peter J. Sk, 2023. "A comprehensive platform for analyzing longitudinal multi-omics data," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    4. Eloise Berson & Anjali Sreenivas & Thanaphong Phongpreecha & Amalia Perna & Fiorella C. Grandi & Lei Xue & Neal G. Ravindra & Neelufar Payrovnaziri & Samson Mataraso & Yeasul Kim & Camilo Espinosa & A, 2023. "Whole genome deconvolution unveils Alzheimer’s resilient epigenetic signature," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    5. Lei Xiong & Kang Tian & Yuzhe Li & Weixi Ning & Xin Gao & Qiangfeng Cliff Zhang, 2022. "Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    6. Milton Pividori & Sumei Lu & Binglan Li & Chun Su & Matthew E. Johnson & Wei-Qi Wei & Qiping Feng & Bahram Namjou & Krzysztof Kiryluk & Iftikhar J. Kullo & Yuan Luo & Blair D. Sullivan & Benjamin F. V, 2023. "Projecting genetic associations through gene expression patterns highlights disease etiology and drug mechanisms," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    7. Alan E. Murphy & Nathan G. Skene, 2022. "A balanced measure shows superior performance of pseudobulk methods in single-cell RNA-sequencing analysis," Nature Communications, Nature, vol. 13(1), pages 1-4, December.
    8. Parker C. Wilson & Yoshiharu Muto & Haojia Wu & Anil Karihaloo & Sushrut S. Waikar & Benjamin D. Humphreys, 2022. "Multimodal single cell sequencing implicates chromatin accessibility and genetic background in diabetic kidney disease progression," Nature Communications, Nature, vol. 13(1), pages 1-20, December.
    9. Shengen Shawn Hu & Lin Liu & Qi Li & Wenjing Ma & Michael J. Guertin & Clifford A. Meyer & Ke Deng & Tingting Zhang & Chongzhi Zang, 2022. "Intrinsic bias estimation for improved analysis of bulk and single-cell chromatin accessibility profiles using SELMA," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    10. Yuki Matsushita & Jialin Liu & Angel Ka Yan Chu & Chiaki Tsutsumi-Arai & Mizuki Nagata & Yuki Arai & Wanida Ono & Kouhei Yamamoto & Thomas L. Saunders & Joshua D. Welch & Noriaki Ono, 2023. "Bone marrow endosteal stem cells dictate active osteogenesis and aggressive tumorigenesis," Nature Communications, Nature, vol. 14(1), pages 1-23, December.
    11. Alan Selewa & Kaixuan Luo & Michael Wasney & Linsin Smith & Xiaotong Sun & Chenwei Tang & Heather Eckart & Ivan P. Moskowitz & Anindita Basu & Xin He & Sebastian Pott, 2023. "Single-cell genomics improves the discovery of risk variants and genes of atrial fibrillation," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    12. Youtao Lu & Jaehee Lee & Jifen Li & Srinivasa Rao Allu & Jinhui Wang & HyunBum Kim & Kevin L. Bullaughey & Stephen A. Fisher & C. Erik Nordgren & Jean G. Rosario & Stewart A. Anderson & Alexandra V. U, 2023. "CHEX-seq detects single-cell genomic single-stranded DNA with catalytical potential," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    13. Guillermo Serrano & Nerea Berastegui & Aintzane Díaz-Mazkiaran & Paula García-Olloqui & Carmen Rodriguez-Res & Sofia Huerga-Dominguez & Marina Ainciburu & Amaia Vilas-Zornoza & Patxi San Martin-Uriz &, 2024. "Single-cell transcriptional profile of CD34+ hematopoietic progenitor cells from del(5q) myelodysplastic syndromes and impact of lenalidomide," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    14. Wen Shi & Xi Chen & Jennifer Shang, 2019. "An Efficient Morris Method-Based Framework for Simulation Factor Screening," INFORMS Journal on Computing, INFORMS, vol. 31(4), pages 745-770, October.
    15. Jianqing Fan & Xu Han, 2017. "Estimation of the false discovery proportion with unknown dependence," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(4), pages 1143-1164, September.
    16. Zhiyuan Luo & Jiacheng Zhang & Jingyi Fei & Shengdong Ke, 2022. "Deep learning modeling m6A deposition reveals the importance of downstream cis-element sequences," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    17. Shigeyuki Matsui & Hisashi Noma, 2011. "Estimating Effect Sizes of Differentially Expressed Genes for Power and Sample-Size Assessments in Microarray Experiments," Biometrics, The International Biometric Society, vol. 67(4), pages 1225-1235, December.
    18. Lianming Wang & David B. Dunson, 2010. "Semiparametric Bayes Multiple Testing: Applications to Tumor Data," Biometrics, The International Biometric Society, vol. 66(2), pages 493-501, June.
    19. B. Moerkerke & E. Goetghebeur & J. De Riek & I. Roldán‐Ruiz, 2006. "Significance and impotence: towards a balanced view of the null and the alternative hypotheses in marker selection for plant breeding," Journal of the Royal Statistical Society Series A, Royal Statistical Society, vol. 169(1), pages 61-79, January.
    20. Zaili Fang & Inyoung Kim & Jeesun Jung, 2018. "Semiparametric Kernel-Based Regression for Evaluating Interaction Between Pathway Effect and Covariate," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 23(1), pages 129-152, March.
