
MOCHA’s advanced statistical modeling of scATAC-seq data enables functional genomic inference in large human cohorts

Author

Listed:
  • Samir Rachid Zaim

    (Allen Institute for Immunology)

  • Mark-Phillip Pebworth

    (Allen Institute for Immunology)

  • Imran McGrath

    (Allen Institute for Immunology)

  • Lauren Okada

    (Allen Institute for Immunology)

  • Morgan Weiss

    (Allen Institute for Immunology)

  • Julian Reading

    (Allen Institute for Immunology)

  • Julie L. Czartoski

    (Fred Hutchinson Cancer Research Center)

  • Troy R. Torgerson

    (Allen Institute for Immunology)

  • M. Juliana McElrath

    (Fred Hutchinson Cancer Research Center)

  • Thomas F. Bumol

    (Allen Institute for Immunology)

  • Peter J. Skene

    (Allen Institute for Immunology)

  • Xiao-jun Li

    (Allen Institute for Immunology)

Abstract

Single-cell assay for transposase-accessible chromatin using sequencing (scATAC-seq) is being increasingly used to study gene regulation. However, major analytical gaps limit its utility in studying gene regulatory programs in complex diseases. In response, MOCHA (Model-based single cell Open CHromatin Analysis) presents major advances over existing analysis tools, including: 1) improving identification of sample-specific open chromatin, 2) statistical modeling of technical drop-out with zero-inflated methods, 3) mitigation of false positives in single cell analysis, 4) identification of alternative transcription-starting-site regulation, and 5) modules for inferring temporal gene regulatory networks from longitudinal data. These advances, in addition to open chromatin analyses, provide a robust framework after quality control and cell labeling to study gene regulatory programs in human disease. We benchmark MOCHA with four state-of-the-art tools to demonstrate its advances. We also construct cross-sectional and longitudinal gene regulatory networks, identifying potential mechanisms of COVID-19 response. MOCHA provides researchers with a robust analytical tool for functional genomic inference from scATAC-seq data.
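As a rough illustration of what "zero-inflated" modeling of drop-out means in item 2 above, the sketch below fits a generic zero-inflated Poisson to per-cell counts with a textbook EM loop. It is not MOCHA's implementation; the function names `zip_pmf` and `fit_zip`, the Poisson count model, and the toy counts are all assumptions made for illustration only.

```python
import math


def zip_pmf(k, pi, lam):
    """P(X = k) for a zero-inflated Poisson: with probability pi the count is a
    structural (drop-out) zero, otherwise it is drawn from Poisson(lam)."""
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return (pi if k == 0 else 0.0) + (1.0 - pi) * poisson


def fit_zip(counts, iters=200):
    """Estimate (pi, lam) by EM. Generic textbook fit (an assumption for this
    sketch), not the procedure used in the paper."""
    if not counts:
        raise ValueError("counts must be non-empty")
    n = len(counts)
    n_zero = sum(1 for c in counts if c == 0)
    total = sum(counts)
    pi, lam = 0.5, max(total / n, 1e-8)
    for _ in range(iters):
        # E-step: probability that an observed zero is a structural zero.
        p0 = math.exp(-lam)
        denom = pi + (1.0 - pi) * p0
        w = pi / denom if denom > 0 else 0.0
        # M-step: update the mixing weight and the Poisson mean.
        pi = w * n_zero / n
        lam = total / (n - w * n_zero)
    return pi, lam


# Hypothetical toy data: six zeros and four non-zero counts.
toy_counts = [0, 0, 0, 0, 0, 0, 3, 4, 5, 4]
pi_hat, lam_hat = fit_zip(toy_counts)
```

On the toy data most of the zeros are attributed to drop-out, because a Poisson with a mean near 4 would rarely produce a zero on its own; separating the two sources of zeros is the point of a zero-inflated model.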

Suggested Citation

  • Samir Rachid Zaim & Mark-Phillip Pebworth & Imran McGrath & Lauren Okada & Morgan Weiss & Julian Reading & Julie L. Czartoski & Troy R. Torgerson & M. Juliana McElrath & Thomas F. Bumol & Peter J. Ske, 2024. "MOCHA’s advanced statistical modeling of scATAC-seq data enables functional genomic inference in large human cohorts," Nature Communications, Nature, vol. 15(1), pages 1-24, December.
  • Handle: RePEc:nat:natcom:v:15:y:2024:i:1:d:10.1038_s41467-024-50612-6
    DOI: 10.1038/s41467-024-50612-6

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-50612-6
    File Function: Abstract
    Download Restriction: no

    File URL: https://libkey.io/10.1038/s41467-024-50612-6?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Wouter Meuleman & Alexander Muratov & Eric Rynes & Jessica Halow & Kristen Lee & Daniel Bates & Morgan Diegel & Douglas Dunn & Fidencio Neri & Athanasios Teodosiadis & Alex Reynolds & Eric Haugen & Je, 2020. "Index and biological spectrum of human DNase I hypersensitive sites," Nature, Nature, vol. 584(7820), pages 244-251, August.
    2. Alexander Lachmann & Denis Torre & Alexandra B. Keenan & Kathleen M. Jagodnik & Hoyjin J. Lee & Lily Wang & Moshe C. Silverstein & Avi Ma’ayan, 2018. "Massive mining of publicly available RNA-seq data from human and mouse," Nature Communications, Nature, vol. 9(1), pages 1-10, December.
    3. Avantika Lal & Zachary D. Chiang & Nikolai Yakovenko & Fabiana M. Duarte & Johnny Israeli & Jason D. Buenrostro, 2021. "Deep learning-based enhancement of epigenomics data with AtacWorks," Nature Communications, Nature, vol. 12(1), pages 1-11, December.
    4. Suhas V. Vasaikar & Adam K. Savage & Qiuyu Gong & Elliott Swanson & Aarthi Talla & Cara Lord & Alexander T. Heubeck & Julian Reading & Lucas T. Graybuck & Paul Meijer & Troy R. Torgerson & Peter J. Sk, 2023. "A comprehensive platform for analyzing longitudinal multi-omics data," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    5. Jordan W. Squair & Matthieu Gautier & Claudia Kathe & Mark A. Anderson & Nicholas D. James & Thomas H. Hutson & Rémi Hudelle & Taha Qaiser & Kaya J. E. Matson & Quentin Barraud & Ariel J. Levine & Gio, 2021. "Confronting false discoveries in single-cell differential expression," Nature Communications, Nature, vol. 12(1), pages 1-15, December.
    6. Kip D. Zimmerman & Mark A. Espeland & Carl D. Langefeld, 2021. "A practical solution to pseudoreplication bias in single-cell studies," Nature Communications, Nature, vol. 12(1), pages 1-9, December.
    7. Michael Lawrence & Wolfgang Huber & Hervé Pagès & Patrick Aboyoun & Marc Carlson & Robert Gentleman & Martin T Morgan & Vincent J Carey, 2013. "Software for Computing and Annotating Genomic Ranges," PLOS Computational Biology, Public Library of Science, vol. 9(8), pages 1-10, August.
    8. Sandra Taylor & Katherine Pollard, 2009. "Hypothesis Tests for Point-Mass Mixture Data with Application to 'Omics Data with Many Zero Values," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 8(1), pages 1-45, February.
    9. Rongxin Fang & Sebastian Preissl & Yang Li & Xiaomeng Hou & Jacinta Lucero & Xinxin Wang & Amir Motamedi & Andrew K. Shiau & Xinzhu Zhou & Fangming Xie & Eran A. Mukamel & Kai Zhang & Yanxiao Zhang & , 2021. "Comprehensive analysis of single cell ATAC-seq data with SnapATAC," Nature Communications, Nature, vol. 12(1), pages 1-15, December.
    10. Jason D. Buenrostro & Beijing Wu & Ulrike M. Litzenburger & Dave Ruff & Michael L. Gonzales & Michael P. Snyder & Howard Y. Chang & William J. Greenleaf, 2015. "Single-cell chromatin accessibility reveals principles of regulatory variation," Nature, Nature, vol. 523(7561), pages 486-490, July.
    11. Zhijian Li & Christoph Kuppe & Susanne Ziegler & Mingbo Cheng & Nazanin Kabgani & Sylvia Menzel & Martin Zenke & Rafael Kramann & Ivan G. Costa, 2021. "Chromatin-accessibility estimation from single-cell ATAC-seq data with scOpen," Nature Communications, Nature, vol. 12(1), pages 1-14, December.
    12. Lei Xiong & Kui Xu & Kang Tian & Yanqiu Shao & Lei Tang & Ge Gao & Michael Zhang & Tao Jiang & Qiangfeng Cliff Zhang, 2019. "SCALE method for single-cell ATAC-seq analysis via latent feature extraction," Nature Communications, Nature, vol. 10(1), pages 1-10, December.
    13. Mariano I. Gabitto & Anders Rasmussen & Orly Wapinski & Kathryn Allaway & Nicholas Carriero & Gordon J. Fishell & Richard Bonneau, 2020. "Characterizing chromatin landscape from aggregate and single-cell genomic assays using flexible duration modeling," Nature Communications, Nature, vol. 11(1), pages 1-10, December.
    14. John D. Storey, 2002. "A direct approach to false discovery rates," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 64(3), pages 479-498, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Songming Tang & Xuejian Cui & Rongxiang Wang & Sijie Li & Siyu Li & Xin Huang & Shengquan Chen, 2024. "scCASE: accurate and interpretable enhancement for single-cell chromatin accessibility sequencing data," Nature Communications, Nature, vol. 15(1), pages 1-16, December.
    2. Zhijian Li & Christoph Kuppe & Susanne Ziegler & Mingbo Cheng & Nazanin Kabgani & Sylvia Menzel & Martin Zenke & Rafael Kramann & Ivan G. Costa, 2021. "Chromatin-accessibility estimation from single-cell ATAC-seq data with scOpen," Nature Communications, Nature, vol. 12(1), pages 1-14, December.
    3. Eloise Berson & Anjali Sreenivas & Thanaphong Phongpreecha & Amalia Perna & Fiorella C. Grandi & Lei Xue & Neal G. Ravindra & Neelufar Payrovnaziri & Samson Mataraso & Yeasul Kim & Camilo Espinosa & A, 2023. "Whole genome deconvolution unveils Alzheimer’s resilient epigenetic signature," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    4. Suhas V. Vasaikar & Adam K. Savage & Qiuyu Gong & Elliott Swanson & Aarthi Talla & Cara Lord & Alexander T. Heubeck & Julian Reading & Lucas T. Graybuck & Paul Meijer & Troy R. Torgerson & Peter J. Sk, 2023. "A comprehensive platform for analyzing longitudinal multi-omics data," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    5. Lei Xiong & Kang Tian & Yuzhe Li & Weixi Ning & Xin Gao & Qiangfeng Cliff Zhang, 2022. "Online single-cell data integration through projecting heterogeneous datasets into a common cell-embedding space," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    6. Alan Yue Yang Teo & Jordan W. Squair & Gregoire Courtine & Michael A. Skinnider, 2024. "Best practices for differential accessibility analysis in single-cell epigenomics," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    7. Yuki Matsushita & Jialin Liu & Angel Ka Yan Chu & Chiaki Tsutsumi-Arai & Mizuki Nagata & Yuki Arai & Wanida Ono & Kouhei Yamamoto & Thomas L. Saunders & Joshua D. Welch & Noriaki Ono, 2023. "Bone marrow endosteal stem cells dictate active osteogenesis and aggressive tumorigenesis," Nature Communications, Nature, vol. 14(1), pages 1-23, December.
    8. Milton Pividori & Sumei Lu & Binglan Li & Chun Su & Matthew E. Johnson & Wei-Qi Wei & Qiping Feng & Bahram Namjou & Krzysztof Kiryluk & Iftikhar J. Kullo & Yuan Luo & Blair D. Sullivan & Benjamin F. V, 2023. "Projecting genetic associations through gene expression patterns highlights disease etiology and drug mechanisms," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    9. Alan E. Murphy & Nathan G. Skene, 2022. "A balanced measure shows superior performance of pseudobulk methods in single-cell RNA-sequencing analysis," Nature Communications, Nature, vol. 13(1), pages 1-4, December.
    10. Parker C. Wilson & Yoshiharu Muto & Haojia Wu & Anil Karihaloo & Sushrut S. Waikar & Benjamin D. Humphreys, 2022. "Multimodal single cell sequencing implicates chromatin accessibility and genetic background in diabetic kidney disease progression," Nature Communications, Nature, vol. 13(1), pages 1-20, December.
    11. Alan Selewa & Kaixuan Luo & Michael Wasney & Linsin Smith & Xiaotong Sun & Chenwei Tang & Heather Eckart & Ivan P. Moskowitz & Anindita Basu & Xin He & Sebastian Pott, 2023. "Single-cell genomics improves the discovery of risk variants and genes of atrial fibrillation," Nature Communications, Nature, vol. 14(1), pages 1-18, December.
    12. Youtao Lu & Jaehee Lee & Jifen Li & Srinivasa Rao Allu & Jinhui Wang & HyunBum Kim & Kevin L. Bullaughey & Stephen A. Fisher & C. Erik Nordgren & Jean G. Rosario & Stewart A. Anderson & Alexandra V. U, 2023. "CHEX-seq detects single-cell genomic single-stranded DNA with catalytical potential," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    13. Shengen Shawn Hu & Lin Liu & Qi Li & Wenjing Ma & Michael J. Guertin & Clifford A. Meyer & Ke Deng & Tingting Zhang & Chongzhi Zang, 2022. "Intrinsic bias estimation for improved analysis of bulk and single-cell chromatin accessibility profiles using SELMA," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    14. Youngchao Ge & Sandrine Dudoit & Terence Speed, 2003. "Resampling-based multiple testing for microarray data analysis," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 12(1), pages 1-77, June.
    15. Guillermo Serrano & Nerea Berastegui & Aintzane Díaz-Mazkiaran & Paula García-Olloqui & Carmen Rodriguez-Res & Sofia Huerga-Dominguez & Marina Ainciburu & Amaia Vilas-Zornoza & Patxi San Martin-Uriz &, 2024. "Single-cell transcriptional profile of CD34+ hematopoietic progenitor cells from del(5q) myelodysplastic syndromes and impact of lenalidomide," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    16. Pierre Bajgrowicz & Olivier Scaillet, 2012. "Technical trading revisited: False discoveries, persistence tests, and transaction costs," Journal of Financial Economics, Elsevier, vol. 106(3), pages 473-491.
    17. Kip D. Zimmerman & Ciaran Evans & Carl D. Langefeld, 2022. "Reply to: A balanced measure shows superior performance of pseudobulk methods in single-cell RNA-sequencing analysis," Nature Communications, Nature, vol. 13(1), pages 1-2, December.
    18. Wen Shi & Xi Chen & Jennifer Shang, 2019. "An Efficient Morris Method-Based Framework for Simulation Factor Screening," INFORMS Journal on Computing, INFORMS, vol. 31(4), pages 745-770, October.
    19. Guro Dørum & Lars Snipen & Margrete Solheim & Solve Saebo, 2011. "Smoothing Gene Expression Data with Network Information Improves Consistency of Regulated Genes," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-26, August.
    20. Jianqing Fan & Xu Han, 2017. "Estimation of the false discovery proportion with unknown dependence," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(4), pages 1143-1164, September.
