Printed from https://ideas.repec.org/a/nat/natcom/v16y2025i1d10.1038_s41467-025-56054-y.html

Causality-driven candidate identification for reliable DNA methylation biomarker discovery

Author

Listed:
  • Xinlu Tang (Shanghai Jiao Tong University)
  • Rui Guo (Shanghai Jiao Tong University)
  • Zhanfeng Mo (Nanyang Technological University)
  • Wenli Fu (Shanghai Jiao Tong University)
  • Xiaohua Qian (Shanghai Jiao Tong University)

Abstract

Despite vast data support in DNA methylation (DNAm) biomarker discovery to facilitate health-care research, this field faces huge resource barriers due to unreliable preliminary candidates and the consequent compensation through expensive experiments. The underlying challenges lie in confounding factors, especially measurement noise and individual characteristics. To achieve reliable identification of a candidate pool for DNAm biomarker discovery, we propose a Causality-driven Deep Regularization framework to reinforce correlations that are suggestive of causality with disease. It integrates causal thinking, deep learning, and biological priors to handle non-causal confounding factors through a contrastive scheme and a spatial-relation regularization, which reduce interference from individual characteristics and noise, respectively. The comprehensive reliability of the proposed method was verified by simulations and applications involving various human diseases, sample origins, and sequencing technologies, highlighting its universal biomedical significance. Overall, this study offers a causal-deep-learning-based perspective with a compatible tool to identify reliable DNAm biomarker candidates, promoting resource-efficient biomarker discovery.

Suggested Citation

  • Xinlu Tang & Rui Guo & Zhanfeng Mo & Wenli Fu & Xiaohua Qian, 2025. "Causality-driven candidate identification for reliable DNA methylation biomarker discovery," Nature Communications, Nature, vol. 16(1), pages 1-15, December.
  • Handle: RePEc:nat:natcom:v:16:y:2025:i:1:d:10.1038_s41467-025-56054-y
    DOI: 10.1038/s41467-025-56054-y

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-025-56054-y
    File Function: Abstract
    Download Restriction: no

    File URL: https://libkey.io/10.1038/s41467-025-56054-y?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Pei Wang & Shunjie Chen & Sijia Yang, 2022. "Recent Advances on Penalized Regression Models for Biological Data," Mathematics, MDPI, vol. 10(19), pages 1-24, October.
    2. Fuzhi Lu & Huayu Lu & Yao Gu & Pengyu Lin & Zhengyao Lu & Qiong Zhang & Hongyan Zhang & Fan Yang & Xiaoyi Dong & Shuangwen Yi & Deliang Chen & Francesco S. R. Pausata & Maya Ben-Yami & Jennifer V. Mec, 2025. "Tipping point-induced abrupt shifts in East Asian hydroclimate since the Last Glacial Maximum," Nature Communications, Nature, vol. 16(1), pages 1-21, December.
    3. Yang, Yiping & Luo, Chuanqin & Yang, Weiming, 2024. "Double penalized variable selection for high-dimensional partial linear mixed effects models," Journal of Multivariate Analysis, Elsevier, vol. 204(C).
    4. Jan Pablo Burgard & Joscha Krause & Dennis Kreber & Domingo Morales, 2021. "The generalized equivalence of regularization and min–max robustification in linear mixed models," Statistical Papers, Springer, vol. 62(6), pages 2857-2883, December.
    5. Mingrui Zhong & Zanhua Yin & Zhichao Wang, 2023. "Variable Selection for Sparse Logistic Regression with Grouped Variables," Mathematics, MDPI, vol. 11(24), pages 1-21, December.
    6. Choi, Insu & Kim, Woo Chang, 2023. "Estimating Historical Downside Risks of Global Financial Market Indices via Inflation Rate-Adjusted Dependence Graphs," Research in International Business and Finance, Elsevier, vol. 66(C).
    7. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    8. Oxana Babecka Kucharcukova & Jan Bruha, 2016. "Nowcasting the Czech Trade Balance," Working Papers 2016/11, Czech National Bank.
    9. Carstensen, Kai & Heinrich, Markus & Reif, Magnus & Wolters, Maik H., 2020. "Predicting ordinary and severe recessions with a three-state Markov-switching dynamic factor model," International Journal of Forecasting, Elsevier, vol. 36(3), pages 829-850.
    10. Hou-Tai Chang & Ping-Huai Wang & Wei-Fang Chen & Chen-Ju Lin, 2022. "Risk Assessment of Early Lung Cancer with LDCT and Health Examinations," IJERPH, MDPI, vol. 19(8), pages 1-12, April.
    11. Margherita Giuzio, 2017. "Genetic algorithm versus classical methods in sparse index tracking," Decisions in Economics and Finance, Springer;Associazione per la Matematica, vol. 40(1), pages 243-256, November.
    12. Nicolaj N. Mühlbach, 2020. "Tree-based Synthetic Control Methods: Consequences of moving the US Embassy," CREATES Research Papers 2020-04, Department of Economics and Business Economics, Aarhus University.
    13. Wang, Qiao & Zhou, Wei & Cheng, Yonggang & Ma, Gang & Chang, Xiaolin & Miao, Yu & Chen, E, 2018. "Regularized moving least-square method and regularized improved interpolating moving least-square method with nonsingular moment matrices," Applied Mathematics and Computation, Elsevier, vol. 325(C), pages 120-145.
    14. Dmitriy Drusvyatskiy & Adrian S. Lewis, 2018. "Error Bounds, Quadratic Growth, and Linear Convergence of Proximal Methods," Mathematics of Operations Research, INFORMS, vol. 43(3), pages 919-948, August.
    15. Mkhadri, Abdallah & Ouhourane, Mohamed, 2013. "An extended variable inclusion and shrinkage algorithm for correlated variables," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 631-644.
    16. Lucian Belascu & Alexandra Horobet & Georgiana Vrinceanu & Consuela Popescu, 2021. "Performance Dissimilarities in European Union Manufacturing: The Effect of Ownership and Technological Intensity," Sustainability, MDPI, vol. 13(18), pages 1-19, September.
    17. Candelon, B. & Hurlin, C. & Tokpavi, S., 2012. "Sampling error and double shrinkage estimation of minimum variance portfolios," Journal of Empirical Finance, Elsevier, vol. 19(4), pages 511-527.
    18. Susan Athey & Guido W. Imbens & Stefan Wager, 2018. "Approximate residual balancing: debiased inference of average treatment effects in high dimensions," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 80(4), pages 597-623, September.
    19. Andrea Carriero & Todd E. Clark & Massimiliano Marcellino, 2025. "Specification Choices in Quantile Regression for Empirical Macroeconomics," Journal of Applied Econometrics, John Wiley & Sons, Ltd., vol. 40(1), pages 57-73, January.
    20. Kim, Hyun Hak & Swanson, Norman R., 2018. "Mining big data using parsimonious factor, machine learning, variable selection and shrinkage methods," International Journal of Forecasting, Elsevier, vol. 34(2), pages 339-354.

