
Augmenting Microarray Data with Literature-Based Knowledge to Enhance Gene Regulatory Network Inference

Author

Listed:
  • Guocai Chen
  • Michael J Cairelli
  • Halil Kilicoglu
  • Dongwook Shin
  • Thomas C Rindflesch

Abstract

Gene regulatory networks are a crucial aspect of systems biology in describing molecular mechanisms of the cell. Various computational models rely on random gene selection to infer such networks from microarray data. While incorporation of prior knowledge into data analysis has been deemed important, in practice, it has generally been limited to referencing genes in probe sets and using curated knowledge bases. We investigate the impact of augmenting microarray data with semantic relations automatically extracted from the literature, with the view that relations encoding gene/protein interactions eliminate the need for random selection of components in non-exhaustive approaches, producing a more accurate model of cellular behavior. A genetic algorithm is then used to optimize the strength of interactions using microarray data and an artificial neural network fitness function. The result is a directed and weighted network providing the individual contribution of each gene to its target. For testing, we used invasive ductal carcinoma of the breast to query the literature and a microarray set containing gene expression changes in these cells over several time points. Our model demonstrates significantly better fitness than the state-of-the-art model, which relies on an initial random selection of genes. Comparison to the component pathways of the KEGG Pathways in Cancer map reveals that the resulting networks contain both known and novel relationships. The p53 pathway results were manually validated in the literature. 60% of non-KEGG relationships were supported (74% for highly weighted interactions). The method was then applied to yeast data and our model again outperformed the comparison model. Our results demonstrate the advantage of combining gene interactions extracted from the literature in the form of semantic relations with microarray analysis in generating contribution-weighted gene regulatory networks.
This methodology can make a significant contribution to understanding the complex interactions involved in cellular behavior and molecular physiology.

Author Summary

We have developed a methodology that combines standard computational analysis of gene expression data with knowledge in the literature to identify pathways of gene and protein interactions. We extract the knowledge from PubMed citations using a tool (SemRep) that identifies specific relationships between genes or proteins. We string together networks of individual interactions that are found within citations that refer to the target pathways. Upon this skeleton of interactions, we calculate the weight of the interaction with the gene expression data captured over multiple time points using state-of-the-art analysis algorithms. Not surprisingly, this approach of combining prior knowledge into the analysis process significantly improves the performance of the analysis. This work is most significant as an example of how the wealth of textual data related to gene interactions can be incorporated into computational analysis, not solely to identify this type of pathway (a gene regulatory network) but for any type of similar biological problem.
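
The idea of fitting interaction weights onto a fixed, literature-derived skeleton can be illustrated with a small sketch. This is not the authors' implementation: the three-gene skeleton, the synthetic time series, the single tanh unit per target (standing in for the artificial neural network fitness function), and every genetic algorithm parameter below are assumptions chosen only to keep the example short and runnable.

```python
import math
import random

# Hypothetical skeleton of literature-derived relations: target -> regulators.
SKELETON = {"B": ["A"], "C": ["A", "B"]}
EDGES = [(reg, tgt) for tgt, regs in SKELETON.items() for reg in regs]


def step(weights, state):
    """Predict each target at t+1 from its regulators at t (one tanh unit per target)."""
    nxt = dict(state)
    for tgt, regs in SKELETON.items():
        nxt[tgt] = math.tanh(sum(weights[(r, tgt)] * state[r] for r in regs))
    return nxt


def mse(weights, series):
    """Mean squared one-step prediction error over the time series."""
    total, count = 0.0, 0
    for now, after in zip(series, series[1:]):
        pred = step(weights, now)
        for tgt in SKELETON:
            total += (pred[tgt] - after[tgt]) ** 2
            count += 1
    return total / count


def evolve(series, pop_size=30, generations=60, seed=0):
    """Genetic algorithm over edge weights; a lower mse means a fitter individual."""
    rng = random.Random(seed)
    # Start from the all-zero individual plus random weight vectors.
    pop = [{e: 0.0 for e in EDGES}]
    pop += [{e: rng.uniform(-2, 2) for e in EDGES} for _ in range(pop_size - 1)]
    for _ in range(generations):
        pop.sort(key=lambda w: mse(w, series))
        elite = pop[: pop_size // 5]  # the best fifth survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            mum, dad = rng.sample(elite, 2)
            # Uniform crossover followed by a small Gaussian mutation.
            child = {e: rng.choice((mum[e], dad[e])) + rng.gauss(0, 0.1) for e in EDGES}
            children.append(child)
        pop = elite + children
    best = min(pop, key=lambda w: mse(w, series))
    return best, mse(best, series)


def make_toy_series(true_weights, length=12):
    """Synthetic stand-in for microarray time points, driven by an oscillating gene A."""
    state = {"A": 1.0, "B": 0.0, "C": 0.0}
    series = [state]
    for t in range(1, length):
        state = step(true_weights, state)
        state["A"] = math.cos(0.7 * t)
        series.append(state)
    return series


if __name__ == "__main__":
    true_w = {("A", "B"): 0.8, ("A", "C"): -0.5, ("B", "C"): 1.2}
    best, err = evolve(make_toy_series(true_w))
    for (reg, tgt), w in sorted(best.items()):
        print(f"{reg} -> {tgt}: weight {w:+.2f}")
    print(f"mean squared error: {err:.5f}")
```

Because the search space is restricted to edges already present in the skeleton, no random selection of candidate regulators is needed; the algorithm only has to estimate a signed weight for each known relation, which yields a directed, weighted network of the kind the abstract describes.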

Suggested Citation

  • Guocai Chen & Michael J Cairelli & Halil Kilicoglu & Dongwook Shin & Thomas C Rindflesch, 2014. "Augmenting Microarray Data with Literature-Based Knowledge to Enhance Gene Regulatory Network Inference," PLOS Computational Biology, Public Library of Science, vol. 10(6), pages 1-16, June.
  • Handle: RePEc:plo:pcbi00:1003666
    DOI: 10.1371/journal.pcbi.1003666

    Download full text from publisher

    File URL: https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1003666
    Download Restriction: no

    File URL: https://journals.plos.org/ploscompbiol/article/file?id=10.1371/journal.pcbi.1003666&type=printable
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhou, Peipei & Cai, Shuiming & Liu, Zengrong & Chen, Luonan & Wang, Ruiqi, 2013. "Coupling switches and oscillators as a means to shape cellular signals in biomolecular systems," Chaos, Solitons & Fractals, Elsevier, vol. 50(C), pages 115-126.
    2. Nuzhat Haneef, 2013. "Empirical research consolidation: a generic overview and a classification scheme for methods," Quality & Quantity: International Journal of Methodology, Springer, vol. 47(1), pages 383-410, January.
    3. Rabajante, Jomar Fajardo & Talaue, Cherryl Ortega, 2015. "Equilibrium switching and mathematical properties of nonlinear interaction networks with concurrent antagonism and self-stimulation," Chaos, Solitons & Fractals, Elsevier, vol. 73(C), pages 166-182.
    4. Mark D Alter, 2013. "Studying Gene Expression System Regulation at the Program Level," PLOS ONE, Public Library of Science, vol. 8(4), pages 1-8, April.
    5. Song, Min & Heo, Go Eun & Ding, Ying, 2015. "SemPathFinder: Semantic path analysis for discovering publicly unknown knowledge," Journal of Informetrics, Elsevier, vol. 9(4), pages 686-703.
