Statistical framework for calling allelic imbalance in high-throughput sequencing data

Author

Listed:
  • Andrey Buyan

    (Russian Academy of Sciences
    Life Improvement by Future Technologies (LIFT) Center)

  • Georgy Meshcheryakov

    (Russian Academy of Sciences)

  • Viacheslav Safronov

    (Lomonosov Moscow State University)

  • Sergey Abramov

    (Russian Academy of Sciences
    Altius Institute for Biomedical Sciences
    Moscow Center for Advanced Studies)

  • Alexandr Boytsov

    (Russian Academy of Sciences
    Altius Institute for Biomedical Sciences
    Moscow Center for Advanced Studies)

  • Vladimir Nozdrin

    (Lomonosov Moscow State University)

  • Eugene F. Baulin

    (Moscow Center for Advanced Studies
    International Institute of Molecular and Cell Biology in Warsaw)

  • Semyon Kolmykov

    (Sirius University of Science and Technology)

  • Jeff Vierstra

    (Altius Institute for Biomedical Sciences)

  • Fedor Kolpakov

    (Sirius University of Science and Technology
    Federal Research Center for Information and Computational Technologies)

  • Vsevolod J. Makeev

    (Russian Academy of Sciences
    Moscow Center for Advanced Studies
    Ufa Federal Research Centre of the Russian Academy of Sciences
    University of Manchester)

  • Ivan V. Kulakovskiy

    (Russian Academy of Sciences
    Life Improvement by Future Technologies (LIFT) Center
    Russian Academy of Sciences)

Abstract

High-throughput sequencing facilitates large-scale studies of gene regulation and allows tracing the associations of individual genomic variants with changes in gene regulation and expression. Compared to classic association studies, the assessment of an allelic imbalance at heterozygous variants captures functional variant effects with smaller sample sizes, higher sensitivity, and better resolution. Yet, identification of allele-specific variants from allelic read counts remains challenging due to data-dependent biases and overdispersion arising from technical and biological variability. We present MIXALIME, a novel computational framework for calling allele-specific variants in diverse omics data with a repertoire of statistical models accounting for read mapping bias and copy number variation. We benchmark MIXALIME with DNase-Seq, ATAC-Seq, and CAGE-Seq data, and we demonstrate that the allelic imbalance highlights causal variants in GWAS results. Finally, as a showcase of the large-scale practical application of MIXALIME, we present an atlas of variants exhibiting allele-specific chromatin accessibility, built from thousands of available datasets obtained from diverse cell types.
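To make the overdispersion problem concrete, the following is a minimal sketch, not the MIXALIME models themselves (the abstract does not specify them). It assumes a generic setup: reference and alternative read counts at one heterozygous variant are scored against a plain binomial null and against a beta-binomial null that allows extra variance. The function names, the `concentration` parameter, and its default value are hypothetical choices made for illustration. The `expected_ref_fraction` parameter stands in for any shift of the null away from 0.5, such as reference mapping bias or a copy number difference between alleles.

```python
import math


def _log_choose(n, k):
    # log of the binomial coefficient C(n, k)
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def _log_beta(a, b):
    # log of the Beta function B(a, b)
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def binom_pmf(k, n, p):
    # Binomial probability of k reference reads out of n
    return math.exp(_log_choose(n, k) + k * math.log(p) + (n - k) * math.log(1.0 - p))


def betabinom_pmf(k, n, a, b):
    # Beta-binomial probability: a binomial whose success rate is Beta(a, b) distributed
    return math.exp(_log_choose(n, k) + _log_beta(k + a, n - k + b) - _log_beta(a, b))


def two_sided_pvalue(pmf, observed):
    # Sum the probabilities of all outcomes no more likely than the observed one
    p_obs = pmf[observed]
    return min(1.0, sum(p for p in pmf if p <= p_obs * (1.0 + 1e-9)))


def imbalance_pvalues(ref, alt, expected_ref_fraction=0.5, concentration=20.0):
    # Hypothetical helper: returns (binomial p-value, beta-binomial p-value).
    # A smaller `concentration` means stronger overdispersion (assumed, not fitted).
    n = ref + alt
    binom = [binom_pmf(k, n, expected_ref_fraction) for k in range(n + 1)]
    a = expected_ref_fraction * concentration
    b = (1.0 - expected_ref_fraction) * concentration
    betabinom = [betabinom_pmf(k, n, a, b) for k in range(n + 1)]
    return two_sided_pvalue(binom, ref), two_sided_pvalue(betabinom, ref)


if __name__ == "__main__":
    p_binom, p_bb = imbalance_pvalues(ref=30, alt=10)
    print(f"binomial p = {p_binom:.4g}, beta-binomial p = {p_bb:.4g}")
```

For the same counts, the beta-binomial null yields a larger p-value than the binomial one, which illustrates why ignoring overdispersion tends to inflate the number of variants called as allele-specific. In practice the dispersion would be estimated from the data rather than fixed by hand.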

Suggested Citation

  • Andrey Buyan & Georgy Meshcheryakov & Viacheslav Safronov & Sergey Abramov & Alexandr Boytsov & Vladimir Nozdrin & Eugene F. Baulin & Semyon Kolmykov & Jeff Vierstra & Fedor Kolpakov & Vsevolod J. Mak, 2025. "Statistical framework for calling allelic imbalance in high-throughput sequencing data," Nature Communications, Nature, vol. 16(1), pages 1-19, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-024-55513-2
    DOI: 10.1038/s41467-024-55513-2

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-024-55513-2
    File Function: Abstract
    Download Restriction: no
