IDEAS home Printed from https://ideas.repec.org/a/plo/pone00/0086309.html

Integrative Gene Network Construction to Analyze Cancer Recurrence Using Semi-Supervised Learning

Author

Listed:
  • Chihyun Park
  • Jaegyoon Ahn
  • Hyunjin Kim
  • Sanghyun Park

Abstract

Background: The prognosis of cancer recurrence is an important research area in bioinformatics, and it is challenging because sample sizes are small relative to the vast number of genes. There have been several attempts to predict cancer recurrence. Most studies employed a supervised approach, which uses only the few labeled samples available. Semi-supervised learning, which also exploits unlabeled samples, is a promising alternative. However, there have been few attempts based on manifold assumptions to reveal the detailed roles of identified cancer genes in recurrence.

Results: To predict cancer recurrence, we proposed a novel semi-supervised learning algorithm based on a graph regularization approach. We transformed the gene expression data into a graph structure for semi-supervised learning and integrated protein interaction data with the gene expression data to select functionally related gene pairs. We then predicted cancer recurrence by applying a regularization approach to the constructed graph, which contains both labeled and unlabeled nodes.

Conclusions: Across three different cancer datasets, the average improvement in accuracy over existing supervised and semi-supervised methods was 24.9%. We performed functional enrichment analysis on the gene networks used for learning and found that these networks are significantly associated with biological functions related to cancer recurrence. Our algorithm was developed in standard C++ with the STL library and is available for Linux and MS Windows. The executable program is freely available at: http://embio.yonsei.ac.kr/~Park/ssl.php.
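
As a rough illustration of what graph regularization over labeled and unlabeled nodes can look like, the sketch below uses a generic closed-form label propagation scheme (normalized affinity matrix S, scores F = (I - alpha*S)^-1 y). It is not the authors' C++ implementation and does not include their integration of protein interaction data. The Gaussian affinity, the parameters sigma and alpha, the function names, and the toy data are all assumptions made for the example; nodes are taken to be samples, with +1 / -1 for the two known outcomes and 0 for unlabeled samples.

```python
import numpy as np


def rbf_affinity(X, sigma=1.0):
    """Dense Gaussian affinity between rows of X, with a zero diagonal."""
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    return W


def propagate_labels(W, y, alpha=0.9):
    """Graph-regularized scores for every node.

    W     : symmetric affinity matrix (n x n), every node has degree > 0
    y     : +1 / -1 for labeled nodes, 0 for unlabeled nodes
    alpha : trade-off in (0, 1) between graph smoothness and the known labels
    """
    degree = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    # Symmetrically normalized affinity S = D^(-1/2) W D^(-1/2).
    S = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    n = W.shape[0]
    # Solve (I - alpha * S) F = y instead of forming the inverse explicitly.
    return np.linalg.solve(np.eye(n) - alpha * S, y.astype(float))


# Toy data (hypothetical): two well-separated groups of three samples each,
# with one labeled sample per group and the rest unlabeled.
X = np.array([
    [0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
    [5.0, 5.0], [5.2, 4.9], [4.9, 5.1],
])
y = np.array([1, 0, 0, -1, 0, 0])

scores = propagate_labels(rbf_affinity(X), y)
predicted = np.sign(scores).astype(int)
print(predicted)
```

The sign of each score gives the predicted class; unlabeled samples inherit the label of the group they are strongly connected to. In a real setting the affinity matrix would come from the constructed network rather than from raw Euclidean distances.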

Suggested Citation

  • Chihyun Park & Jaegyoon Ahn & Hyunjin Kim & Sanghyun Park, 2014. "Integrative Gene Network Construction to Analyze Cancer Recurrence Using Semi-Supervised Learning," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-9, January.
  • Handle: RePEc:plo:pone00:0086309
    DOI: 10.1371/journal.pone.0086309

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0086309
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0086309&type=printable
    Download Restriction: no

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hirakawa, Akihiro & Hamada, Chikuma & Yoshimura, Isao, 2011. "Sample size calculation for a regularized t-statistic in microarray experiments," Statistics & Probability Letters, Elsevier, vol. 81(7), pages 870-875, July.
    2. Qin, Huaizhen & Ouyang, Weiwei, 2015. "Statistical properties of gene–gene correlations in omics experiments," Statistics & Probability Letters, Elsevier, vol. 97(C), pages 206-211.
    3. Hongjuan Zhao & Börje Ljungberg & Kjell Grankvist & Torgny Rasmuson & Robert Tibshirani & James D Brooks, 2005. "Gene Expression Profiling Predicts Survival in Conventional Renal Cell Carcinoma," PLOS Medicine, Public Library of Science, vol. 3(1), pages 1-1, December.
