Author
Listed:
- Xi Fu
(Columbia University)
- Shentong Mo
(Mohamed bin Zayed University of Artificial Intelligence
Carnegie Mellon University)
- Alejandro Buendia
(Columbia University)
- Anouchka P. Laurent
(Columbia University)
- Anqi Shao
(Columbia University)
- Maria del Mar Alvarez-Torres
(Columbia University)
- Tianji Yu
(Columbia University)
- Jimin Tan
(New York University Grossman School of Medicine)
- Jiayu Su
(Columbia University)
- Romella Sagatelian
(Columbia University)
- Adolfo A. Ferrando
(Columbia University
Regeneron)
- Alberto Ciccia
(Columbia University)
- Yanyan Lan
(Tsinghua University)
- David M. Owens
(Columbia University)
- Teresa Palomero
(Columbia University)
- Eric P. Xing
(Mohamed bin Zayed University of Artificial Intelligence
Carnegie Mellon University)
- Raul Rabadan
(Columbia University)
Abstract
Transcriptional regulation, which involves a complex interplay between regulatory sequences and proteins, directs all biological processes. Computational models of transcription lack generalizability to accurately extrapolate to unseen cell types and conditions. Here we introduce GET (general expression transformer), an interpretable foundation model designed to uncover regulatory grammars across 213 human fetal and adult cell types [1,2]. Relying exclusively on chromatin accessibility data and sequence information, GET achieves experimental-level accuracy in predicting gene expression even in previously unseen cell types [3]. GET also shows remarkable adaptability across new sequencing platforms and assays, enabling regulatory inference across a broad range of cell types and conditions, and uncovers universal and cell-type-specific transcription factor interaction networks. We evaluated its performance in prediction of regulatory activity, inference of regulatory elements and regulators, and identification of physical interactions between transcription factors and found that it outperforms current models [4] in predicting lentivirus-based massively parallel reporter assay readout [5,6]. In fetal erythroblasts [7], we identified distal (greater than 1 Mbp) regulatory regions that were missed by previous models, and, in B cells, we identified a lymphocyte-specific transcription factor–transcription factor interaction that explains the functional significance of a leukaemia risk predisposing germline mutation [8–10]. In sum, we provide a generalizable and accurate model for transcription together with catalogues of gene regulation and transcription factor interactions, all with cell type specificity.
Suggested Citation
Xi Fu & Shentong Mo & Alejandro Buendia & Anouchka P. Laurent & Anqi Shao & Maria del Mar Alvarez-Torres & Tianji Yu & Jimin Tan & Jiayu Su & Romella Sagatelian & Adolfo A. Ferrando & Alberto Ciccia et al., 2025.
"A foundation model of transcription across human cell types,"
Nature, vol. 637(8047), pages 965-973, January.
Handle:
RePEc:nat:nature:v:637:y:2025:i:8047:d:10.1038_s41586-024-08391-z
DOI: 10.1038/s41586-024-08391-z