Neural networks for open and closed Literature-based Discovery

Authors

  • Gamal Crichton
  • Simon Baker
  • Yufan Guo
  • Anna Korhonen

Abstract

Literature-based Discovery (LBD) aims to discover new knowledge automatically from large collections of literature. Scientific literature is growing at an exponential rate, making it difficult for researchers to stay current in their discipline and easy to miss knowledge necessary to advance their research. LBD can facilitate hypothesis testing and generation and thus accelerate scientific progress. Neural networks have demonstrated improved performance on LBD-related tasks but have yet to be applied to LBD itself. We propose four graph-based neural network methods to perform open and closed LBD. We compared our methods with those used by the state-of-the-art LION LBD system on the same evaluations, which replicate recently published findings in cancer biology. We also applied them to a time-sliced dataset of human-curated, peer-reviewed biological interactions. These evaluations and the metrics they employ represent performance on real-world knowledge advances and are thus robust indicators of approach efficacy. In the first set of experiments, our best methods performed 2-4 times better than the baselines in closed discovery and 2-3 times better in open discovery. In the second, our best methods performed almost 2 times better than the baselines in open discovery. These results strongly indicate that neural LBD is potentially a very effective approach for generating new scientific discoveries from existing literature. The code for our models and other information can be found at: https://github.com/cambridgeltl/nn_for_LBD.

Suggested Citation

  • Gamal Crichton & Simon Baker & Yufan Guo & Anna Korhonen, 2020. "Neural networks for open and closed Literature-based Discovery," PLOS ONE, Public Library of Science, vol. 15(5), pages 1-16, May.
  • Handle: RePEc:plo:pone00:0232891
    DOI: 10.1371/journal.pone.0232891

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0232891
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0232891&type=printable
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alan L. Porter & Alisa Kongthon & Jye-Chyi (JC) Lu, 2002. "Research profiling: Improving the literature review," Scientometrics, Springer;Akadémiai Kiadó, vol. 53(3), pages 351-370, March.
    2. Andrej Kastrin & Dimitar Hristovski, 2021. "Scientometric analysis and knowledge mapping of literature-based discovery (1986–2020)," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1415-1451, February.
    3. Jeong, Yoo Kyung & Xie, Qing & Yan, Erjia & Song, Min, 2020. "Examining drug and side effect relation using author–entity pair bipartite networks," Journal of Informetrics, Elsevier, vol. 14(1).
    4. Christian Sternitzke, 2009. "Patents and publications as sources of novel and inventive knowledge," Scientometrics, Springer;Akadémiai Kiadó, vol. 79(3), pages 551-561, June.
    5. Lv, Yanhua & Ding, Ying & Song, Min & Duan, Zhiguang, 2018. "Topology-driven trend analysis for drug discovery," Journal of Informetrics, Elsevier, vol. 12(3), pages 893-905.
    6. Johannes Stegmann & Guenter Grohmann, 2003. "Hypothesis generation guided by co-word clustering," Scientometrics, Springer;Akadémiai Kiadó, vol. 56(1), pages 111-135, January.
    7. Chen, Chaomei & Chen, Yue & Horowitz, Mark & Hou, Haiyan & Liu, Zeyuan & Pellegrino, Donald, 2009. "Towards an explanatory and computational theory of scientific discovery," Journal of Informetrics, Elsevier, vol. 3(3), pages 191-209.
    8. Chaomei Chen & Min Song, 2019. "Visualizing a field of research: A methodology of systematic scientometric reviews," PLOS ONE, Public Library of Science, vol. 14(10), pages 1-25, October.
    9. Emil Hudomalj & Gaj Vidmar, 2003. "OLAP and bibliographic databases," Scientometrics, Springer;Akadémiai Kiadó, vol. 58(3), pages 609-622, November.
    10. Guangyu Zou & Levent Yilmaz, 2011. "Dynamics of knowledge creation in global participatory science communities: open innovation communities from a network perspective," Computational and Mathematical Organization Theory, Springer, vol. 17(1), pages 35-58, March.
