
SAAS-CNV: A Joint Segmentation Approach on Aggregated and Allele Specific Signals for the Identification of Somatic Copy Number Alterations with Next-Generation Sequencing Data

Author

Listed:
  • Zhongyang Zhang
  • Ke Hao

Abstract

Cancer genomes exhibit profound somatic copy number alterations (SCNAs). Studying tumor SCNAs using massively parallel sequencing provides unprecedented resolution but also gives rise to new challenges in data analysis, complicated by tumor aneuploidy and heterogeneity as well as normal cell contamination. While the majority of read depth based methods utilize total sequencing depth alone for SCNA inference, the allele specific signals are undervalued. We proposed a joint segmentation and inference approach using both signals to meet some of the challenges. Our method consists of four major steps: 1) extracting read depth supporting reference and alternative alleles at each SNP/Indel locus and comparing the total read depth and alternative allele proportion between tumor and matched normal sample; 2) performing joint segmentation on the two signal dimensions; 3) correcting the copy number baseline from which the SCNA state is determined; 4) calling SCNA state for each segment based on both signal dimensions. The method is applicable to whole exome/genome sequencing (WES/WGS) as well as SNP array data in a tumor-control study. To test specificity, we applied the method to a dataset containing no SCNAs, created by pairing sequencing replicates of a single HapMap sample as normal/tumor pairs. We also applied it to a large-scale WGS dataset consisting of 88 liver tumors along with adjacent normal tissues. Compared with representative methods, our method demonstrated improved accuracy, scalability to large cancer studies, capability in handling both sequencing and SNP array data, and the potential to improve the estimation of tumor ploidy and purity.

Author Summary: Somatic copy number alterations (SCNAs) are essential in the oncogenesis and progression of a variety of cancers. Accurate identification and quantification of SCNAs are fundamental to the effort of cataloging the different variants in the cancer genome. This task has its own challenges due to the complex nature of the tumor SCNA profile. It is further complicated by the heterogeneity of the cells collected from a tumor tissue and by contamination from adjacent normal cells, so methods tailored to the detection of germline copy number variation (CNV) do not fit tumor SCNA detection well. Next-generation sequencing provides an opportunity to comprehensively characterize SCNAs at unprecedented resolution. While total read depth information is commonly used in SCNA detection methods, the allele-specific read depth is less often considered, leading to suboptimal solutions. By incorporating both pieces of information, we developed a segmentation-based pipeline to address the aforementioned issues in SCNA detection. This tool is applicable to both deep sequencing data and SNP array data, and it enables accurate and efficient characterization of genome-wide SCNA profiles to facilitate large-scale cancer studies.
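
As an illustration of the first of the four steps described in the abstract (comparing total read depth and alternative allele proportion between tumor and matched normal at each locus), a minimal Python sketch follows. It is not the authors' implementation: the function names `locus_signals` and `sample_signals`, the pseudocount, and the median centering used as a crude stand-in for library-size normalization are all assumptions made for this example, and the exact transformations used by SAAS-CNV are not stated in the abstract.

```python
import math
from statistics import median


def locus_signals(tumor_ref, tumor_alt, normal_ref, normal_alt, pseudo=0.5):
    """Return the two per-locus signals for one SNP/Indel site.

    Hypothetical helper: the pseudocount `pseudo` is an assumption that
    keeps the ratios defined when a count is zero.
    """
    tumor_depth = tumor_ref + tumor_alt
    normal_depth = normal_ref + normal_alt
    # Signal 1: log2 ratio of total read depth, tumor versus normal.
    log2_ratio = math.log2((tumor_depth + pseudo) / (normal_depth + pseudo))
    # Signal 2: shift in alternative allele proportion, tumor minus normal.
    tumor_baf = (tumor_alt + pseudo) / (tumor_depth + 2 * pseudo)
    normal_baf = (normal_alt + pseudo) / (normal_depth + 2 * pseudo)
    return log2_ratio, tumor_baf - normal_baf


def sample_signals(loci):
    """Compute both signals for every locus and median-center the depth ratio.

    Median centering is an assumed, simplified substitute for proper
    library-size normalization and baseline correction.
    """
    raw = [locus_signals(*locus) for locus in loci]
    center = median(ratio for ratio, _ in raw)
    return [(ratio - center, shift) for ratio, shift in raw]


if __name__ == "__main__":
    # Each tuple: (tumor_ref, tumor_alt, normal_ref, normal_alt), made-up counts.
    example = [(30, 30, 30, 30), (30, 30, 30, 30), (90, 30, 30, 30)]
    for ratio, shift in sample_signals(example):
        print(f"log2 ratio = {ratio:.3f}, allele proportion shift = {shift:.3f}")
```

The two values per locus form the two signal dimensions on which a joint segmentation step would then operate. A locus with doubled depth and a skewed allele proportion, like the third made-up example, stands out in both dimensions at once.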

Suggested Citation

  • Zhongyang Zhang & Ke Hao, 2015. "SAAS-CNV: A Joint Segmentation Approach on Aggregated and Allele Specific Signals for the Identification of Somatic Copy Number Alterations with Next-Generation Sequencing Data," PLOS Computational Biology, Public Library of Science, vol. 11(11), pages 1-27, November.
  • Handle: RePEc:plo:pcbi00:1004618
    DOI: 10.1371/journal.pcbi.1004618

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1004618
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1004618&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yana Melnykov & Marcus Perry, 2024. "On Robust Change Point Detection and Estimation in Multisubject Studies," Sankhya A: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 86(2), pages 827-879, August.
    2. Wenbiao Zhao & Lixing Zhu & Falong Tan, 2024. "Multiple change point detection for high-dimensional data," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 33(3), pages 809-846, September.
    3. Daniel Philps & Tillman Weyde & Artur d'Avila Garcez & Roy Batchelor, 2018. "Continual Learning Augmented Investment Decisions," Papers 1812.02340, arXiv.org, revised Jan 2019.
    4. Liu, Bin & Zhang, Xinsheng & Liu, Yufeng, 2022. "High dimensional change point inference: Recent developments and extensions," Journal of Multivariate Analysis, Elsevier, vol. 188(C).
    5. Bin Liu & Cheng Zhou & Xinsheng Zhang & Yufeng Liu, 2020. "A unified data‐adaptive framework for high dimensional change point detection," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 82(4), pages 933-963, September.
    6. Cai, Qingyun, 2018. "A scoring criterion for rejection of clustered p-values," Computational Statistics & Data Analysis, Elsevier, vol. 121(C), pages 180-189.
    7. Julius Juodakis & Stephen Marsland, 2023. "Epidemic changepoint detection in the presence of nuisance changes," Statistical Papers, Springer, vol. 64(1), pages 17-39, February.
    8. Simon A. C. Taylor & Rebecca Killick & Jonathan Burr & Louise Rogerson, 2021. "Assessing daily patterns using home activity sensors and within period changepoint detection," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 70(3), pages 579-595, June.
    9. Mengjia Yu & Xiaohui Chen, 2021. "Finite sample change point inference and identification for high‐dimensional mean vectors," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 83(2), pages 247-270, April.
    10. Bertille Follain & Tengyao Wang & Richard J. Samworth, 2022. "High‐dimensional changepoint estimation with heterogeneous missingness," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 84(3), pages 1023-1055, July.
    11. Zhou, Houlin & Zhu, Hanbing & Wang, Xuejun, 2024. "Change point detection via feedforward neural networks with theoretical guarantees," Computational Statistics & Data Analysis, Elsevier, vol. 193(C).
    12. Follain, Bertille & Wang, Tengyao & Samworth, Richard J., 2022. "High-dimensional changepoint estimation with heterogeneous missingness," LSE Research Online Documents on Economics 115014, London School of Economics and Political Science, LSE Library.
    13. Chen, Cathy Yi-hsuan & Okhrin, Yarema & Wang, Tengyao, 2022. "Monitoring network changes in social media," LSE Research Online Documents on Economics 113742, London School of Economics and Political Science, LSE Library.
    14. Hahn, Georg, 2022. "Online multivariate changepoint detection with type I error control and constant time/memory updates per series," Statistics & Probability Letters, Elsevier, vol. 181(C).
