Printed from https://ideas.repec.org/a/plo/pcbi00/1003545.html

Prediction and Prioritization of Rare Oncogenic Mutations in the Cancer Kinome Using Novel Features and Multiple Classifiers

Author

Listed:
  • ManChon U
  • Eric Talevich
  • Samiksha Katiyar
  • Khaled Rasheed
  • Natarajan Kannan

Abstract

Cancer is a genetic disease that develops through a series of somatic mutations, a subset of which drive cancer progression. Although cancer genome sequencing studies are beginning to reveal the mutational patterns of genes in various cancers, identifying the small subset of “causative” mutations from the large subset of “non-causative” mutations, which accumulate as a consequence of the disease, is a challenge. In this article, we present an effective machine learning approach for identifying cancer-associated mutations in human protein kinases, a class of signaling proteins known to be frequently mutated in human cancers. We evaluate the performance of 11 well-known supervised learners and show that a multiple-classifier approach, which combines the performances of individual learners, significantly improves the classification of known cancer-associated mutations. We introduce several novel features related specifically to structural and functional characteristics of protein kinases and find that the level of conservation of the mutated residue at specific evolutionary depths is an important predictor of oncogenic effect. We consolidate the novel features and the multiple-classifier approach to prioritize and experimentally test a set of rare unconfirmed mutations in the epidermal growth factor receptor tyrosine kinase (EGFR). Our studies identify T725M and L861R as rare cancer-associated mutations inasmuch as these mutations increase EGFR activity in the absence of the activating EGF ligand in cell-based assays.

Author Summary

Cancer progresses by accumulation of mutations in a subset of genes that confer growth advantage. The 518 protein kinase genes encoded in the human genome, collectively called the kinome, represent one of the largest families of oncogenes. Targeted sequencing studies of many different cancers have shown that the mutational landscape comprises both cancer-causing “driver” mutations and harmless “passenger” mutations. While the frequent recurrence of some driver mutations in human cancers helps distinguish them from the large number of passenger mutations, a significant challenge is to identify the rare “driver” mutations that are less frequently observed in patient samples and yet are causative. Here we combine computational and experimental approaches to identify rare cancer-associated mutations in epidermal growth factor receptor kinase (EGFR), a signaling protein frequently mutated in cancers. Specifically, we evaluate a novel multiple-classifier approach and features specific to the protein kinase superfamily in distinguishing known cancer-associated mutations from benign mutations. We then apply the multiple classifier to identify and test the functional impact of rare cancer-associated mutations in EGFR. We report, for the first time, that the EGFR mutations T725M and L861R, which are infrequently observed in cancers, constitutively activate EGFR in a manner analogous to the frequently observed driver mutations.
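
The abstract does not say how the individual learners are combined, so the following is only a minimal sketch of one common way to build a multiple classifier: averaging per-learner scores (soft voting), with a majority vote as an alternative, and then ranking candidate mutations by the combined score. The learner names, mutation identifiers, scores, and the 0.5 threshold are all hypothetical placeholders, not values or methods taken from the paper.

```python
from statistics import mean

# Hypothetical scores: each base learner's estimated probability that a
# mutation is cancer-associated. Illustrative numbers only, not paper results.
base_scores = {
    "learner_a": {"mut_1": 0.90, "mut_2": 0.40, "mut_3": 0.20},
    "learner_b": {"mut_1": 0.70, "mut_2": 0.60, "mut_3": 0.10},
    "learner_c": {"mut_1": 0.80, "mut_2": 0.30, "mut_3": 0.40},
}


def combine_scores(scores_by_learner):
    """Soft voting: average each mutation's score across all base learners."""
    learners = list(scores_by_learner.values())
    # Only keep mutations that every learner has scored.
    shared = set.intersection(*(set(scores) for scores in learners))
    return {m: mean(scores[m] for scores in learners) for m in shared}


def majority_vote(scores_by_learner, threshold=0.5):
    """Hard voting: positive when more than half of the learners score >= threshold."""
    learners = list(scores_by_learner.values())
    shared = set.intersection(*(set(scores) for scores in learners))
    return {
        m: sum(scores[m] >= threshold for scores in learners) > len(learners) / 2
        for m in shared
    }


def prioritize(combined):
    """Rank mutations from highest to lowest combined score (ties broken by name)."""
    return [m for m, _ in sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))]


combined = combine_scores(base_scores)
votes = majority_vote(base_scores)
ranking = prioritize(combined)
```

In this sketch, `ranking` orders the candidates for follow-up testing, which mirrors the prioritization step described above only in spirit; a real pipeline would train the base learners on labeled mutations and validate the combination scheme.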

Suggested Citation

  • ManChon U & Eric Talevich & Samiksha Katiyar & Khaled Rasheed & Natarajan Kannan, 2014. "Prediction and Prioritization of Rare Oncogenic Mutations in the Cancer Kinome Using Novel Features and Multiple Classifiers," PLOS Computational Biology, Public Library of Science, vol. 10(4), pages 1-12, April.
  • Handle: RePEc:plo:pcbi00:1003545
    DOI: 10.1371/journal.pcbi.1003545

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003545
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1003545&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hazem El-Osta & Gerald Falchook & Apostolia Tsimberidou & David Hong & Aung Naing & Kevin Kim & Sijin Wen & Filip Janku & Razelle Kurzrock, 2011. "BRAF Mutations in Advanced Cancers: Clinical Characteristics and Outcomes," PLOS ONE, Public Library of Science, vol. 6(10), pages 1-13, October.
    2. Gaurav Mendiratta & Eugene Ke & Meraj Aziz & David Liarakos & Melinda Tong & Edward C. Stites, 2021. "Cancer gene mutation frequencies for the U.S. population," Nature Communications, Nature, vol. 12(1), pages 1-11, December.
    3. Oriana D’Ecclesiis & Saverio Caini & Chiara Martinoli & Sara Raimondi & Camilla Gaiaschi & Giulio Tosti & Paola Queirolo & Camilla Veneri & Calogero Saieva & Sara Gandini & Susanna Chiocca, 2021. "Gender-Dependent Specificities in Cutaneous Melanoma Predisposition, Risk Factors, Somatic Mutations, Prognostic and Predictive Factors: A Systematic Review," IJERPH, MDPI, vol. 18(15), pages 1-17, July.
    4. Ivana Bozic & Chay Paterson & Bartlomiej Waclaw, 2019. "On measuring selection in cancer from subclonal mutation frequencies," PLOS Computational Biology, Public Library of Science, vol. 15(9), pages 1-15, September.


    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.