
Systematically Differentiating Functions for Alternatively Spliced Isoforms through Integrating RNA-seq Data

Author

Listed:
  • Ridvan Eksi
  • Hong-Dong Li
  • Rajasree Menon
  • Yuchen Wen
  • Gilbert S Omenn
  • Matthias Kretzler
  • Yuanfang Guan

Abstract

Integrating large-scale functional genomic data has significantly accelerated our understanding of gene functions. However, no algorithm has been developed to differentiate functions for isoforms of the same gene using high-throughput genomic data. This is because standard supervised learning requires ‘ground-truth’ functional annotations, which are lacking at the isoform level. To address this challenge, we developed a generic framework that interrogates public RNA-seq data at the transcript level to differentiate functions for alternatively spliced isoforms. For a specific function, our algorithm identifies the ‘responsible’ isoform(s) of a gene and generates classifying models at the isoform level instead of at the gene level. Through cross-validation, we demonstrated that our algorithm is effective in assigning functions to genes, especially the ones with multiple isoforms, and robust to gene expression levels and removal of homologous gene pairs. We identified genes in the mouse whose isoforms are predicted to have disparate functionalities and experimentally validated the ‘responsible’ isoforms using data from mammary tissue. With protein structure modeling and experimental evidence, we further validated the predicted isoform functional differences for the genes Cdkn2a and Anxa6. Our generic framework is the first to predict and differentiate functions for alternatively spliced isoforms, instead of genes, using genomic data. It is extendable to any base machine learner and other species with alternatively spliced isoforms, and shifts the current gene-centered function prediction to isoform-level predictions.

Author Summary

In mammalian genomes, a single gene can be alternatively spliced into multiple isoforms which greatly increase the functional diversity of the genome. In the human, more than 95% of multi-exon genes undergo alternative splicing. It is hard to computationally differentiate the functions for the splice isoforms of the same gene, because they are almost always annotated with the same functions and share similar sequences. In this paper, we developed a generic framework to identify the ‘responsible’ isoform(s) for each function that the gene carries out, and therefore predict functional assignment on the isoform level instead of on the gene level. Within this generic framework, we implemented and evaluated several related algorithms for isoform function prediction. We tested these algorithms through both computational evaluation and experimental validation of the predicted ‘responsible’ isoform(s) and the predicted disparate functions of the isoforms of Cdkn2a and of Anxa6. Our algorithm represents the first effort to predict and differentiate isoforms through large-scale genomic data integration.
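
The abstract does not spell out how the ‘responsible’ isoform is found, so the following is only an illustrative sketch, not the authors' implementation. It assumes a multiple-instance-style loop: each gene is a bag of isoform expression profiles, labels exist only at the gene level, and the highest-scoring isoform of each annotated gene is repeatedly re-selected as the positive example. The nearest-centroid scorer stands in for "any base machine learner"; all gene names, isoform names, and numbers are hypothetical toy values.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Vector = Sequence[float]
# gene name -> isoform name -> expression profile (toy feature vector)
Bags = Dict[str, Dict[str, Vector]]


def mean(vectors: List[Vector]) -> List[float]:
    """Column-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def fit_centroid_scorer(pos: List[Vector], neg: List[Vector]) -> Callable[[Vector], float]:
    """Toy base learner: linear score along the line joining the two centroids.

    The score is positive on the side of the positive centroid.
    """
    mp, mn = mean(pos), mean(neg)
    w = [p - q for p, q in zip(mp, mn)]
    mid = [(p + q) / 2 for p, q in zip(mp, mn)]
    b = -sum(wi * mi for wi, mi in zip(w, mid))
    return lambda x: sum(wi * xi for wi, xi in zip(w, x)) + b


def responsible_isoforms(
    pos_genes: Bags, neg_genes: Bags, max_iter: int = 20
) -> Tuple[Dict[str, str], Callable[[Vector], float]]:
    """Pick one 'responsible' isoform per annotated gene and return the final scorer."""
    # Every isoform of an unannotated gene is treated as a negative example.
    neg = [x for isoforms in neg_genes.values() for x in isoforms.values()]
    # Initial guess: every isoform of an annotated gene is positive.
    pos = [x for isoforms in pos_genes.values() for x in isoforms.values()]
    score = fit_centroid_scorer(pos, neg)
    witnesses: Dict[str, str] = {}
    for _ in range(max_iter):
        # For each annotated gene, keep only its highest-scoring isoform.
        new = {
            gene: max(sorted(isoforms), key=lambda name: score(isoforms[name]))
            for gene, isoforms in pos_genes.items()
        }
        if new == witnesses:  # selection is stable, stop
            break
        witnesses = new
        pos = [pos_genes[gene][name] for gene, name in witnesses.items()]
        score = fit_centroid_scorer(pos, neg)
    return witnesses, score


# Hypothetical toy data: two annotated genes, two unannotated genes.
pos_genes: Bags = {
    "GeneA": {"A.1": [5.0, 5.0], "A.2": [0.0, 1.0]},
    "GeneB": {"B.1": [6.0, 4.0], "B.2": [1.0, 0.0]},
}
neg_genes: Bags = {
    "GeneC": {"C.1": [0.0, 0.0], "C.2": [1.0, 1.0]},
    "GeneD": {"D.1": [0.5, 0.5]},
}

witnesses, score = responsible_isoforms(pos_genes, neg_genes)
print(witnesses)  # {'GeneA': 'A.1', 'GeneB': 'B.1'}
```

In this toy setting the final scorer also separates isoforms of the same gene: `score(pos_genes["GeneA"]["A.1"])` is positive while `score(pos_genes["GeneA"]["A.2"])` is negative, which is the kind of isoform-level distinction a gene-level model cannot express.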

Suggested Citation

  • Ridvan Eksi & Hong-Dong Li & Rajasree Menon & Yuchen Wen & Gilbert S Omenn & Matthias Kretzler & Yuanfang Guan, 2013. "Systematically Differentiating Functions for Alternatively Spliced Isoforms through Integrating RNA-seq Data," PLOS Computational Biology, Public Library of Science, vol. 9(11), pages 1-16, November.
  • Handle: RePEc:plo:pcbi00:1003314
    DOI: 10.1371/journal.pcbi.1003314

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003314
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1003314&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Yoseph Barash & John A. Calarco & Weijun Gao & Qun Pan & Xinchen Wang & Ofer Shai & Benjamin J. Blencowe & Brendan J. Frey, 2010. "Deciphering the splicing code," Nature, Nature, vol. 465(7294), pages 53-59, May.

    Citations


    Cited by:

    1. Asta Laiho & Laura L Elo, 2014. "A Note on an Exon-Based Strategy to Identify Differentially Expressed Genes in RNA-Seq Experiments," PLOS ONE, Public Library of Science, vol. 9(12), pages 1-12, December.
    2. Yijuan Zhang & Ding Li & Bingyun Sun, 2015. "Do Housekeeping Genes Exist?," PLOS ONE, Public Library of Science, vol. 10(5), pages 1-22, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Robert Stojnic & Audrey Qiuyan Fu & Boris Adryan, 2012. "A Graphical Modelling Approach to the Dissection of Highly Correlated Transcription Factor Binding Site Profiles," PLOS Computational Biology, Public Library of Science, vol. 8(11), pages 1-13, November.
    2. Ilias Georgakopoulos-Soares & Guillermo E. Parada & Hei Yuen Wong & Ragini Medhi & Giulia Furlan & Roberto Munita & Eric A. Miska & Chun Kit Kwok & Martin Hemberg, 2022. "Alternative splicing modulation by G-quadruplexes," Nature Communications, Nature, vol. 13(1), pages 1-16, December.
    3. Areum Han & Peter Stoilov & Anthony J Linares & Yu Zhou & Xiang-Dong Fu & Douglas L Black, 2014. "De Novo Prediction of PTBP1 Binding and Splicing Targets Reveals Unexpected Features of Its RNA Recognition and Function," PLOS Computational Biology, Public Library of Science, vol. 10(1), pages 1-18, January.
    4. Xiangbin Ruan & Kaining Hu & Xiaochang Zhang, 2023. "PIE-seq: identifying RNA-binding protein targets by dual RNA-deaminase editing and sequencing," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    5. Jorge Vaquero-Garcia & Joseph K. Aicher & San Jewell & Matthew R. Gazzara & Caleb M. Radens & Anupama Jha & Scott S. Norton & Nicholas F. Lahens & Gregory R. Grant & Yoseph Barash, 2023. "RNA splicing analysis using heterogeneous and large RNA-seq datasets," Nature Communications, Nature, vol. 14(1), pages 1-20, December.
    6. Yocelyn Recinos & Dmytro Ustianenko & Yow-Tyng Yeh & Xiaojian Wang & Martin Jacko & Lekha V. Yesantharao & Qiyang Wu & Chaolin Zhang, 2024. "CRISPR-dCas13d-based deep screening of proximal and distal splicing-regulatory elements," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
