
Using the “Hidden” genome to improve classification of cancer types

Authors

  • Saptarshi Chakraborty
  • Colin B. Begg
  • Ronglai Shen

Abstract

It is increasingly common clinically for cancer specimens to be examined using techniques that identify somatic mutations. In principle, these mutational profiles can be used to diagnose the tissue of origin, a critical task for the 3% to 5% of tumors that have an unknown primary site. Diagnosis of primary site is also critical for screening tests that employ circulating DNA. However, most mutations observed in any new tumor are very rarely occurring mutations, and indeed the preponderance of these may never have been observed in any previous recorded tumor. To create a viable diagnostic tool we need to harness the information content in this “hidden genome” of variants for which no direct information is available. To accomplish this we propose a multilevel meta‐feature regression to extract the critical information from rare variants in the training data in a way that permits us to also extract diagnostic information from any previously unobserved variants in the new tumor sample. A scalable implementation of the model is obtained by combining a high‐dimensional feature screening approach with a group‐lasso penalized maximum likelihood approach based on an equivalent mixed‐effect representation of the multilevel model. We apply the method to the Cancer Genome Atlas whole‐exome sequencing data set including 3702 tumor samples across seven common cancer sites. Results show that our multilevel approach can harness substantial diagnostic information from the hidden genome.
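
The abstract mentions a group-lasso penalized maximum likelihood fit but gives no implementation detail. As a rough illustration only, the sketch below shows the generic building block behind many group-lasso solvers: block soft-thresholding, which shrinks a whole group of coefficients toward zero and drops the group entirely when its Euclidean norm falls below the threshold. This is not the authors' implementation. The function names, the `groups` partition, the `step` size, and the square-root-of-group-size weighting are all assumptions made for the example.

```python
import math


def group_soft_threshold(group, threshold):
    """Proximal operator of threshold * ||group||_2 (block soft-thresholding)."""
    norm = math.sqrt(sum(x * x for x in group))
    if norm <= threshold:
        # The whole group is removed from the model at once.
        return [0.0] * len(group)
    scale = 1.0 - threshold / norm
    return [scale * x for x in group]


def prox_group_lasso(coefs, groups, lam, step):
    """Apply block soft-thresholding to each group of coefficients.

    coefs  : list of floats (current coefficient vector)
    groups : list of index lists, a hypothetical partition of the coefficients
    lam    : penalty weight (assumed tuning parameter)
    step   : step size of the surrounding proximal-gradient loop (assumed)
    """
    out = list(coefs)
    for idx in groups:
        block = [coefs[i] for i in idx]
        # Weighting by sqrt(group size) is a common convention, assumed here.
        thr = step * lam * math.sqrt(len(idx))
        for i, value in zip(idx, group_soft_threshold(block, thr)):
            out[i] = value
    return out


if __name__ == "__main__":
    # Two groups: a strong pair of coefficients and a weak singleton.
    print(prox_group_lasso([3.0, 4.0, 0.1], [[0, 1], [2]], lam=1.0, step=1.0))
```

In this toy call the first group is shrunk but kept, while the weak singleton group is set exactly to zero, which is the group-level selection behavior that makes the penalty useful when features arrive in natural blocks.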

Suggested Citation

  • Saptarshi Chakraborty & Colin B. Begg & Ronglai Shen, 2021. "Using the “Hidden” genome to improve classification of cancer types," Biometrics, The International Biometric Society, vol. 77(4), pages 1445-1455, December.
  • Handle: RePEc:bla:biomet:v:77:y:2021:i:4:p:1445-1455
    DOI: 10.1111/biom.13367

    Download full text from publisher

    File URL: https://doi.org/10.1111/biom.13367
    Download Restriction: no

