Printed from https://ideas.repec.org/a/plo/pone00/0007492.html

Accounting for Redundancy when Integrating Gene Interaction Databases

Authors
  • Antigoni Elefsinioti
  • Marit Ackermann
  • Andreas Beyer

Abstract

In recent years, gene interaction networks have increasingly been used to assess and interpret biological measurements. Knowing the interaction partners of an unknown protein helps scientists understand the complex relationships between gene products, reveal unknown biological functions and pathways, and gain a more detailed picture of an organism's complexity. Measuring all protein interactions under all relevant conditions is virtually impossible. Hence, computational methods that integrate different datasets to predict gene interactions are needed. However, when integrating different sources, one has to account for the fact that some of the information may be redundant, which can lead to an overestimation of the true likelihood of an interaction. Our method integrates information derived from three databases (Bioverse, HiMAP and STRING) to predict human gene interactions. A Bayesian approach was implemented to integrate the different data sources on a common quantitative scale. An important assumption of the Bayesian integration is independence of the input data (features). Our study shows that conditional dependency cannot be ignored when combining gene interaction databases that rely on partially overlapping input data. In addition, we show how the correlation structure between the databases can be detected, and we propose a linear model to correct for this bias. Benchmarking the results against two independent reference datasets shows that the integrated model outperforms the individual datasets. Our method provides an intuitive strategy for weighting the different features while accounting for their conditional dependencies.
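
The overestimation that the abstract warns about can be illustrated with a minimal Python sketch. It is not the authors' model: the prior odds, the likelihood ratios, the weights, and the function names `combine_log_odds` and `odds_to_probability` are all hypothetical and chosen only for illustration. The sketch combines evidence in log-odds space. With unit weights this is the naive Bayes rule, which assumes independent sources; with smaller weights it is one simple linear down-weighting of redundant sources.

```python
import math


def combine_log_odds(prior_odds, likelihood_ratios, weights=None):
    """Combine evidence from several sources in log-odds space.

    With weights=None every source gets weight 1, which is the naive Bayes
    rule that assumes the sources are independent given the true state.
    """
    if weights is None:
        weights = [1.0] * len(likelihood_ratios)
    if len(weights) != len(likelihood_ratios):
        raise ValueError("need one weight per likelihood ratio")
    log_odds = math.log(prior_odds)
    for lr, w in zip(likelihood_ratios, weights):
        log_odds += w * math.log(lr)
    return math.exp(log_odds)


def odds_to_probability(odds):
    """Convert odds in favor of an interaction into a probability."""
    return odds / (1.0 + odds)


# Hypothetical numbers, not taken from the paper.
PRIOR_ODDS = 0.01  # assumed prior odds that a random gene pair interacts
LR_A = 20.0        # assumed likelihood ratio reported by database A
LR_B = 20.0        # database B is assumed to repeat A's evidence exactly

# Naive integration counts the same evidence twice.
naive_odds = combine_log_odds(PRIOR_ODDS, [LR_A, LR_B])

# If B adds nothing new, the evidence should count only once.
single_odds = combine_log_odds(PRIOR_ODDS, [LR_A])

# Halving each weight is one illustrative way to undo the double counting.
weighted_odds = combine_log_odds(PRIOR_ODDS, [LR_A, LR_B], weights=[0.5, 0.5])

for label, odds in [("naive", naive_odds),
                    ("single source", single_odds),
                    ("down-weighted", weighted_odds)]:
    print(f"{label}: probability = {odds_to_probability(odds):.3f}")
```

In this toy case the naive combination reports a much higher interaction probability than the evidence supports, while the down-weighted combination matches the single-source result. In practice the sources overlap only partially, so the weights would have to be estimated from the observed correlation between the databases rather than fixed by hand.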

Suggested Citation

  • Antigoni Elefsinioti & Marit Ackermann & Andreas Beyer, 2009. "Accounting for Redundancy when Integrating Gene Interaction Databases," PLOS ONE, Public Library of Science, vol. 4(10), pages 1-9, October.
  • Handle: RePEc:plo:pone00:0007492
    DOI: 10.1371/journal.pone.0007492

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0007492
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0007492&type=printable
    Download Restriction: no

    File URL: https://libkey.io/10.1371/journal.pone.0007492?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Citations

    Cited by:

    1. Chuanhua Xing & David B Dunson, 2011. "Bayesian Inference for Genomic Data Integration Reduces Misclassification Rate in Predicting Protein-Protein Interactions," PLOS Computational Biology, Public Library of Science, vol. 7(7), pages 1-10, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Han Yan & Kavitha Venkatesan & John E Beaver & Niels Klitgord & Muhammed A Yildirim & Tong Hao & David E Hill & Michael E Cusick & Norbert Perrimon & Frederick P Roth & Marc Vidal, 2010. "A Genome-Wide Gene Function Prediction Resource for Drosophila melanogaster," PLOS ONE, Public Library of Science, vol. 5(8), pages 1-11, August.
    2. Christopher Y Park & Aaron K Wong & Casey S Greene & Jessica Rowland & Yuanfang Guan & Lars A Bongo & Rebecca D Burdine & Olga G Troyanskaya, 2013. "Functional Knowledge Transfer for High-accuracy Prediction of Under-studied Biological Processes," PLOS Computational Biology, Public Library of Science, vol. 9(3), pages 1-14, March.
    3. Heiko Müller & Francesco Mancuso, 2008. "Identification and Analysis of Co-Occurrence Networks with NetCutter," PLOS ONE, Public Library of Science, vol. 3(9), pages 1-16, September.
    4. Sara Mostafavi & Anna Goldenberg & Quaid Morris, 2012. "Labeling Nodes Using Three Degrees of Propagation," PLOS ONE, Public Library of Science, vol. 7(12), pages 1-10, December.
