Printed from https://ideas.repec.org/a/nat/natcom/v14y2023i1d10.1038_s41467-023-43266-3.html

Deep learning of human polyadenylation sites at nucleotide resolution reveals molecular determinants of site usage and relevance in disease

Author

Listed:
  • Emily Kunce Stroup

    (Northwestern University)

  • Zhe Ji

    (Northwestern University)

Abstract

The genomic distribution of cleavage and polyadenylation (polyA) sites should be co-evolutionally optimized with the local gene structure. Otherwise, spurious polyadenylation can cause premature transcription termination and generate aberrant proteins. To obtain mechanistic insights into polyA site optimization across the human genome, we develop deep/machine learning models to identify genome-wide putative polyA sites at unprecedented nucleotide-level resolution and calculate their strength and usage in the genomic context. Our models quantitatively measure position-specific motif importance and their crosstalk in polyA site formation and cleavage heterogeneity. The intronic site expression is governed by the surrounding splicing landscape. The usage of alternative polyA sites in terminal exons is modulated by their relative locations and distance to downstream genes. Finally, we apply our models to reveal thousands of disease- and trait-associated genetic variants altering polyadenylation activity. Altogether, our models represent a valuable resource to dissect molecular mechanisms mediating genome-wide polyA site expression and characterize their functional roles in human diseases.

Suggested Citation

  • Emily Kunce Stroup & Zhe Ji, 2023. "Deep learning of human polyadenylation sites at nucleotide resolution reveals molecular determinants of site usage and relevance in disease," Nature Communications, Nature, vol. 14(1), pages 1-17, December.
  • Handle: RePEc:nat:natcom:v:14:y:2023:i:1:d:10.1038_s41467-023-43266-3
    DOI: 10.1038/s41467-023-43266-3

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-023-43266-3
    File Function: Abstract
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Matteo Di Scipio & Mohammad Khan & Shihong Mao & Michael Chong & Conor Judge & Nazia Pathan & Nicolas Perrot & Walter Nelson & Ricky Lali & Shuang Di & Robert Morton & Jeremy Petch & Guillaume Paré, 2023. "A versatile, fast and unbiased method for estimation of gene-by-environment interaction effects on biobank-scale datasets," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    2. Jacob Joseph & Chang Liu & Qin Hui & Krishna Aragam & Zeyuan Wang & Brian Charest & Jennifer E. Huffman & Jacob M. Keaton & Todd L. Edwards & Serkalem Demissie & Luc Djousse & Juan P. Casas & J. Micha, 2022. "Genetic architecture of heart failure with preserved versus reduced ejection fraction," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    3. Vincent Michaud & Eulalie Lasseaux & David J. Green & Dave T. Gerrard & Claudio Plaisant & Tomas Fitzgerald & Ewan Birney & Benoît Arveiler & Graeme C. Black & Panagiotis I. Sergouniotis, 2022. "The contribution of common regulatory and protein-coding TYR variants to the genetic architecture of albinism," Nature Communications, Nature, vol. 13(1), pages 1-8, December.
    4. Natalie DeForest & Yuqi Wang & Zhiyi Zhu & Jacqueline S. Dron & Ryan Koesterer & Pradeep Natarajan & Jason Flannick & Tiffany Amariuta & Gina M. Peloso & Amit R. Majithia, 2024. "Genome-wide discovery and integrative genomic characterization of insulin resistance loci using serum triglycerides to HDL-cholesterol ratio as a proxy," Nature Communications, Nature, vol. 15(1), pages 1-17, December.
    5. Dick Schijven & Sourena Soheili-Nezhad & Simon E. Fisher & Clyde Francks, 2024. "Exome-wide analysis implicates rare protein-altering variants in human handedness," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    6. Lili Liu & Atlas Khan & Elena Sanchez-Rodriguez & Francesca Zanoni & Yifu Li & Nicholas Steers & Olivia Balderes & Junying Zhang & Priya Krithivasan & Robert A. LeDesma & Clara Fischman & Scott J. Heb, 2022. "Genetic regulation of serum IgA levels and susceptibility to common immune, infectious, kidney, and cardio-metabolic traits," Nature Communications, Nature, vol. 13(1), pages 1-17, December.
    7. Shahram Bahrami & Kaja Nordengen & Jaroslav Rokicki & Alexey A. Shadrin & Zillur Rahman & Olav B. Smeland & Piotr P. Jaholkowski & Nadine Parker & Pravesh Parekh & Kevin S. O’Connell & Torbjørn Elvsås, 2024. "The genetic landscape of basal ganglia and implications for common brain disorders," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    8. Sylvia Hartmann & Summaira Yasmeen & Benjamin M. Jacobs & Spiros Denaxas & Munir Pirmohamed & Eric R. Gamazon & Mark J. Caulfield & Harry Hemingway & Maik Pietzner & Claudia Langenberg, 2023. "ADRA2A and IRX1 are putative risk genes for Raynaud’s phenomenon," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    9. Mit Shah & Marco H. A. Inácio & Chang Lu & Pierre-Raphaël Schiratti & Sean L. Zheng & Adam Clement & Antonio Marvao & Wenjia Bai & Andrew P. King & James S. Ware & Martin R. Wilkins & Johanna Mielke &, 2023. "Environmental and genetic predictors of human cardiovascular ageing," Nature Communications, Nature, vol. 14(1), pages 1-15, December.
    10. Mathias Seviiri & Matthew H. Law & Jue-Sheng Ong & Puya Gharahkhani & Pierre Fontanillas & Catherine M. Olsen & David C. Whiteman & Stuart MacGregor, 2022. "A multi-phenotype analysis reveals 19 susceptibility loci for basal cell carcinoma and 15 for squamous cell carcinoma," Nature Communications, Nature, vol. 13(1), pages 1-14, December.
    11. Amy L. Hughes & Aleksander T. Szczurek & Jessica R. Kelley & Anna Lastuvkova & Anne H. Turberfield & Emilia Dimitrova & Neil P. Blackledge & Robert J. Klose, 2023. "A CpG island-encoded mechanism protects genes from premature transcription termination," Nature Communications, Nature, vol. 14(1), pages 1-19, December.
    12. Zhaotong Lin & Wei Pan, 2024. "A robust cis-Mendelian randomization method with application to drug target discovery," Nature Communications, Nature, vol. 15(1), pages 1-14, December.
    13. Zhening Liu & Hangkai Huang & Jiarong Xie & Yingying Xu & Chengfu Xu, 2024. "Circulating fatty acids and risk of hepatocellular carcinoma and chronic liver disease mortality in the UK Biobank," Nature Communications, Nature, vol. 15(1), pages 1-10, December.
    14. Junqing Xie & Shuo Feng & Xintong Li & Ester Gea-Mallorquí & Albert Prats-Uribe & Dani Prieto-Alhambra, 2022. "Comparative effectiveness of the BNT162b2 and ChAdOx1 vaccines against Covid-19 in people over 50," Nature Communications, Nature, vol. 13(1), pages 1-8, December.
    15. Rongtao Jiang & Stephanie Noble & Matthew Rosenblatt & Wei Dai & Jean Ye & Shu Liu & Shile Qi & Vince D. Calhoun & Jing Sui & Dustin Scheinost, 2024. "The brain structure, inflammatory, and genetic mechanisms mediate the association between physical frailty and depression," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    16. Erik Schoenmakers & Federica Marelli & Helle F. Jørgensen & W. Edward Visser & Carla Moran & Stefan Groeneweg & Carolina Avalos & Sean J. Jurgens & Nichola Figg & Alison Finigan & Neha Wali & Maura Ag, 2023. "Selenoprotein deficiency disorder predisposes to aortic aneurysm formation," Nature Communications, Nature, vol. 14(1), pages 1-14, December.
    17. Timofey A. Karginov & Antoine Ménoret & Anthony T. Vella, 2022. "Optimal CD8+ T cell effector function requires costimulation-induced RNA-binding proteins that reprogram the transcript isoform landscape," Nature Communications, Nature, vol. 13(1), pages 1-13, December.
    18. Harry D Green & Alistair Jones & Jonathan P Evans & Andrew R Wood & Robin N Beaumont & Jessica Tyrrell & Timothy M Frayling & Christopher Smith & Michael N Weedon, 2021. "A genome-wide association study identifies 5 loci associated with frozen shoulder and implicates diabetes as a causal risk factor," PLOS Genetics, Public Library of Science, vol. 17(6), pages 1-13, June.
    19. Zhen Qiao & Julia Sidorenko & Joana A. Revez & Angli Xue & Xueling Lu & Katri Pärna & Harold Snieder & Peter M. Visscher & Naomi R. Wray & Loic Yengo, 2023. "Estimation and implications of the genetic architecture of fasting and non-fasting blood glucose," Nature Communications, Nature, vol. 14(1), pages 1-11, December.
    20. Xiaoyi Raymond Gao & Marion Chiariglione & Alexander J. Arch, 2022. "Whole-exome sequencing study identifies rare variants and genes associated with intraocular pressure and glaucoma," Nature Communications, Nature, vol. 13(1), pages 1-10, December.
