
MethylBERT enables read-level DNA methylation pattern identification and tumour deconvolution using a Transformer-based model

Author

Listed:
  • Yunhee Jeong

    (German Cancer Research Center (DKFZ))

  • Clarissa Gerhäuser

    (German Cancer Research Center (DKFZ))

  • Guido Sauter

    (University Medical Center Hamburg-Eppendorf)

  • Thorsten Schlomm

    (Charité – Universitätsmedizin Berlin)

  • Karl Rohr

    (Heidelberg University)

  • Pavlo Lutsik

    (German Cancer Research Center (DKFZ); KU Leuven)

Abstract

DNA methylation (DNAm) is a key epigenetic mark that shows profound alterations in cancer. Read-level methylomes enable more in-depth analyses, due to their broad genomic coverage and preservation of rare cell-type signals, compared to summarized data such as 450K/EPIC microarrays. Here, we propose MethylBERT, a Transformer-based model for read-level methylation pattern classification. MethylBERT identifies tumour-derived sequence reads based on their methylation patterns and local genomic sequence, and estimates tumour cell fractions within bulk samples. In our evaluation, MethylBERT outperforms existing deconvolution methods and demonstrates high accuracy regardless of methylation pattern complexity, read length and read coverage. Moreover, we show its applicability to cell-type deconvolution as well as non-invasive early cancer diagnostics using liquid biopsy samples. MethylBERT represents a significant advancement in read-level methylome analysis and enables accurate tumour purity estimation. The broad applicability of MethylBERT will enhance studies on both tumour and non-cancerous bulk methylomes.

Suggested Citation

  • Yunhee Jeong & Clarissa Gerhäuser & Guido Sauter & Thorsten Schlomm & Karl Rohr & Pavlo Lutsik, 2025. "MethylBERT enables read-level DNA methylation pattern identification and tumour deconvolution using a Transformer-based model," Nature Communications, Nature, vol. 16(1), pages 1-14, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-55920-z
    DOI: 10.1038/s41467-025-55920-z

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-55920-z
    File Function: Abstract
    Download Restriction: no


    References listed on IDEAS

    1. Ryan Lister & Mattia Pelizzola & Robert H. Dowen & R. David Hawkins & Gary Hon & Julian Tonti-Filippini & Joseph R. Nery & Leonard Lee & Zhen Ye & Que-Minh Ngo & Lee Edsall & Jessica Antosiewicz-Bourg, 2009. "Human DNA methylomes at base resolution show widespread epigenomic differences," Nature, Nature, vol. 462(7271), pages 315-322, November.
    2. Netanel Loyfer & Judith Magenheim & Ayelet Peretz & Gordon Cann & Joerg Bredno & Agnes Klochendler & Ilana Fox-Fisher & Sapir Shabi-Porat & Merav Hecht & Tsuria Pelet & Joshua Moss & Zeina Drawshy & H, 2023. "A DNA methylation atlas of normal human cell types," Nature, Nature, vol. 613(7943), pages 355-364, January.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. André Bortolini Silveira & Alexandre Houy & Olivier Ganier & Begüm Özemek & Sandra Vanhuele & Anne Vincent-Salomon & Nathalie Cassoux & Pascale Mariani & Gaelle Pierron & Serge Leyvraz & Damian Rieke, 2024. "Base-excision repair pathway shapes 5-methylcytosine deamination signatures in pan-cancer genomes," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    2. Zengyu Shao & Jiuwei Lu & Nelli Khudaverdyan & Jikui Song, 2024. "Multi-layered heterochromatin interaction as a switch for DIM2-mediated DNA methylation," Nature Communications, Nature, vol. 15(1), pages 1-18, December.
    3. Xuelong Yao & Zongyang Lu & Zhanying Feng & Lei Gao & Xin Zhou & Min Li & Suijuan Zhong & Qian Wu & Zhenbo Liu & Haofeng Zhang & Zeyuan Liu & Lizhi Yi & Tao Zhou & Xudong Zhao & Jun Zhang & Yong Wang, 2022. "Comparison of chromatin accessibility landscapes during early development of prefrontal cortex between rhesus macaque and human," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    4. Kate E. Stanley & Tatjana Jatsenko & Stefania Tuveri & Dhanya Sudhakaran & Lore Lannoo & Kristel Calsteren & Marie Borre & Ilse Parijs & Leen Coillie & Kris Bogaert & Rodrigo Almeida Toledo & Liesbeth, 2024. "Cell type signatures in cell-free DNA fragmentation profiles reveal disease biology," Nature Communications, Nature, vol. 15(1), pages 1-13, December.
    5. Michaël Noë & Dimitrios Mathios & Akshaya V. Annapragada & Shashikant Koul & Zacharia H. Foda & Jamie E. Medina & Stephen Cristiano & Christopher Cherry & Daniel C. Bruhm & Noushin Niknafs & Vilmos Ad, 2024. "DNA methylation and gene expression as determinants of genome-wide cell-free DNA fragmentation," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    6. Rakesh Chettier & Lesa Nelson & James W Ogilvie & Hans M Albertsen & Kenneth Ward, 2015. "Haplotypes at LBX1 Have Distinct Inheritance Patterns with Opposite Effects in Adolescent Idiopathic Scoliosis," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-11, February.
    7. Xue Yue & Zhiyuan Xie & Moran Li & Kai Wang & Xiaojing Li & Xiaoqing Zhang & Jian Yan & Yimeng Yin, 2022. "Simultaneous profiling of histone modifications and DNA methylation via nanopore sequencing," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    8. Anyou Wang & Ying Du & Qianchuan He & Chunxiao Zhou, 2013. "A Quantitative System for Discriminating Induced Pluripotent Stem Cells, Embryonic Stem Cells and Somatic Cells," PLOS ONE, Public Library of Science, vol. 8(2), pages 1-10, February.
    9. Yu Xiaoqing & Sun Shuying, 2016. "Comparing five statistical methods of differential methylation identification using bisulfite sequencing data," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 15(2), pages 173-191, April.
    10. Jian Fang & Jianjun Jiang & Sarah M. Leichter & Jie Liu & Mahamaya Biswal & Nelli Khudaverdyan & Xuehua Zhong & Jikui Song, 2022. "Mechanistic basis for maintenance of CHG DNA methylation in plants," Nature Communications, Nature, vol. 13(1), pages 1-12, December.
    11. Allegra Angeloni & Skye Fissette & Deniz Kaya & Jillian M. Hammond & Hasindu Gamaarachchi & Ira W. Deveson & Robert J. Klose & Weiming Li & Xiaotian Zhang & Ozren Bogdanovic, 2024. "Extensive DNA methylome rearrangement during early lamprey embryogenesis," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    12. Jason A. Carter & Léonie Strömich & Matthew Peacey & Sarah R. Chapin & Lars Velten & Lars M. Steinmetz & Benedikt Brors & Sheena Pinto & Hannah V. Meyer, 2022. "Transcriptomic diversity in human medullary thymic epithelial cells," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    13. Mahnaz Hosseinpour & Xinqi Xi & Ling Liu & Luis Malaver-Ortega & Laura Perlaza-Jimenez & Jihoon E. Joo & Harrison M. York & Jonathan Beesley & C. Elizabeth Caldon & Pierre-Antoine Dugué & James G. Dow, 2024. "SAM-DNMT3A, a strategy for induction of genome-wide DNA methylation, identifies DNA methylation as a vulnerability in ER-positive breast cancers," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    14. Joshua Moss & Roni Ben-Ami & Ela Shai & Ofer Gal-Rosenberg & Yosef Kalish & Agnes Klochendler & Gordon Cann & Benjamin Glaser & Ariela Arad & Ruth Shemer & Yuval Dor, 2023. "Megakaryocyte- and erythroblast-specific cell-free DNA patterns in plasma and platelets reflect thrombopoiesis and erythropoiesis levels," Nature Communications, Nature, vol. 14(1), pages 1-10, December.
    15. Tianyu Zhu & Huige Tong & Zhaozhen Du & Stephan Beck & Andrew E. Teschendorff, 2024. "An improved epigenetic counter to track mitotic age in normal and precancerous tissues," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    16. Xusheng Zhang & Xintong Gao & Zhen Liu & Fei Shao & Dou Yu & Min Zhao & Xiwen Qin & Shuo Wang, 2024. "Microbiota regulates the TET1-mediated DNA hydroxymethylation program in innate lymphoid cell differentiation," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    17. Theodore Sakellaropoulos & Catherine Do & Guimei Jiang & Giulia Cova & Peter Meyn & Dacia Dimartino & Sitharam Ramaswami & Adriana Heguy & Aristotelis Tsirigos & Jane A. Skok, 2024. "MethNet: a robust approach to identify regulatory hubs and their distal targets from cancer data," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    18. Pui Pik Law & Liudmila A. Mikheeva & Francisco Rodriguez-Algarra & Fredrika Asenius & Maria Gregori & Robert A. E. Seaborne & Selin Yildizoglu & James R. C. Miller & Hemanth Tummala & Robin Mesnage & , 2024. "Ribosomal DNA copy number is associated with body mass in humans and other mammals," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    19. Lacey Michelle R. & Baribault Carl & Ehrlich Melanie, 2013. "Modeling, simulation and analysis of methylation profiles from reduced representation bisulfite sequencing experiments," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 12(6), pages 723-742, December.
    20. Ruth V. Nichols & Brendan L. O’Connell & Ryan M. Mulqueen & Jerushah Thomas & Ashley R. Woodfin & Sonia Acharya & Gail Mandel & Dmitry Pokholok & Frank J. Steemers & Andrew C. Adey, 2022. "High-throughput robust single-cell DNA methylation profiling with sciMETv2," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
