
Mining Temporal Evolution of Knowledge Graphs and Genealogical Features for Literature-based Discovery Prediction

Author

Listed:
  • Choudhury, Nazim
  • Faisal, Fahim
  • Khushi, Matloob

Abstract

The literature-based discovery process identifies important but implicit relations among information embedded in the published literature. Existing techniques from Information Retrieval (IR) and Natural Language Processing (NLP) attempt to identify hidden or unpublished connections between information concepts within published literature. However, these techniques overlook the prediction of future and emerging relations among scientific knowledge components, such as the author-selected keywords encapsulated within the literature. A Keyword Co-occurrence Network (KCN), built upon author-selected keywords, is considered a knowledge graph that captures both these knowledge components and the knowledge structure of a scientific domain by examining the relationships between knowledge entities. Using data from two multidisciplinary research domains other than the biomedical domain, and capitalizing on bibliometrics, the dynamicity of temporal KCNs, and a recurrent neural network, this study develops novel features that support the prediction of future literature-based discoveries, that is, emerging connections (co-appearances in the same article) among keywords. Temporal importance extracted from both bipartite and unipartite networks, communities defined by genealogical relations, and the relative importance of temporal citation counts were used in the feature construction process. Both node- and edge-level features were input into a recurrent neural network to forecast the feature values and predict the future relations between different scientific concepts/topics represented by the author-selected keywords. High performance rates, compared against both a contemporary heterogeneous network-based method and the preferential attachment process, suggest that these features complement both the prediction of future literature-based discoveries and emerging trend analysis.
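
As a rough illustration of two terms used in the abstract, the sketch below builds a small keyword co-occurrence network from author-selected keywords and ranks the keyword pairs that have not yet co-appeared by a preferential attachment score (the product of the two keywords' degrees), the baseline the abstract mentions. The toy articles and the function names build_kcn and preferential_attachment are hypothetical; this is not the paper's feature set or its recurrent neural network.

```python
from collections import defaultdict
from itertools import combinations


def build_kcn(articles):
    """Build a keyword co-occurrence network from lists of author keywords.

    Returns (weights, neighbours): co-occurrence counts per keyword pair
    and the adjacency sets of the resulting undirected graph.
    """
    weights = defaultdict(int)
    neighbours = defaultdict(set)
    for keywords in articles:
        # Two keywords are linked when they appear in the same article.
        for a, b in combinations(sorted(set(keywords)), 2):
            weights[(a, b)] += 1
            neighbours[a].add(b)
            neighbours[b].add(a)
    return weights, neighbours


def preferential_attachment(neighbours):
    """Score every unconnected keyword pair by the product of degrees."""
    scores = {}
    for a, b in combinations(sorted(neighbours), 2):
        if b not in neighbours[a]:
            scores[(a, b)] = len(neighbours[a]) * len(neighbours[b])
    return scores


# Hypothetical toy corpus: one keyword list per article.
articles = [
    ["link prediction", "knowledge graph", "bibliometrics"],
    ["knowledge graph", "recurrent neural network"],
    ["bibliometrics", "citation analysis"],
]

weights, neighbours = build_kcn(articles)
scores = preferential_attachment(neighbours)

# Candidate future links, highest score first (ties broken alphabetically).
ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
for pair, score in ranked:
    print(pair, score)
```

A temporal variant would build one such network per publication year and track how each score changes over time, which is the kind of sequence a recurrent model could be fed.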

Suggested Citation

  • Choudhury, Nazim & Faisal, Fahim & Khushi, Matloob, 2020. "Mining Temporal Evolution of Knowledge Graphs and Genealogical Features for Literature-based Discovery Prediction," Journal of Informetrics, Elsevier, vol. 14(3).
  • Handle: RePEc:eee:infome:v:14:y:2020:i:3:s1751157719304468
    DOI: 10.1016/j.joi.2020.101057


    Citations

    Cited by:

    1. Qi Wang & Bentao Zou & Jialin Jin & Yuefen Wang, 2024. "Studying the linkage patterns and incremental evolution of domain knowledge structure: a perspective of structure deconstruction," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(7), pages 4249-4274, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Qikai Cheng & Jiamin Wang & Wei Lu & Yong Huang & Yi Bu, 2020. "Keyword-citation-keyword network: a new perspective of discipline knowledge structure analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(3), pages 1923-1943, September.
    2. Lilian Cervo Cabrera & Carlos Eduardo Caldarelli & Marcia Regina Gabardo Camara, 2020. "Mapping collaboration in international coffee certification research," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(3), pages 2597-2618, September.
    3. Ying Huang & Wolfgang Glänzel & Lin Zhang, 2021. "Tracing the development of mapping knowledge domains," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 6201-6224, July.
    4. Li, Heyang & Wu, Meijun & Wang, Yougui & Zeng, An, 2022. "Bibliographic coupling networks reveal the advantage of diversification in scientific projects," Journal of Informetrics, Elsevier, vol. 16(3).
    5. Perianes-Rodriguez, Antonio & Waltman, Ludo & van Eck, Nees Jan, 2016. "Constructing bibliometric networks: A comparison between full and fractional counting," Journal of Informetrics, Elsevier, vol. 10(4), pages 1178-1195.
    6. Jiang, Chenming & Bhat, Chandra R. & Lam, William H.K., 2020. "A bibliometric overview of Transportation Research Part B: Methodological in the past forty years (1979–2019)," Transportation Research Part B: Methodological, Elsevier, vol. 138(C), pages 268-291.
    7. Gorupec Natalia & Brehmer Nataliia & Tiberius Victor & Kraus Sascha, 2022. "Tackling uncertain future scenarios with real options: A review and research framework," The Irish Journal of Management, Sciendo, vol. 41(1), pages 69-88, July.
    8. Takano, Yasutomo & Kajikawa, Yuya, 2019. "Extracting commercialization opportunities of the Internet of Things: Measuring text similarity between papers and patents," Technological Forecasting and Social Change, Elsevier, vol. 138(C), pages 45-68.
    9. Juntao Zheng & Niancai Liu, 2015. "Mapping of important international academic awards," Scientometrics, Springer;Akadémiai Kiadó, vol. 104(3), pages 763-791, September.
    10. John Bryden & Eric Silverman & Simon T Powers, 2022. "Modelling transitions between egalitarian, dynamic leader and absolutist power structures," PLOS ONE, Public Library of Science, vol. 17(2), pages 1-13, February.
    11. Ma, Jing & Abrams, Natalie F. & Porter, Alan L. & Zhu, Donghua & Farrell, Dorothy, 2019. "Identifying translational indicators and technology opportunities for nanomedical research using tech mining: The case of gold nanostructures," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 767-775.
    12. Nazim Choudhury & Shahadat Uddin, 2016. "Time-aware link prediction to explore network effects on temporal knowledge evolution," Scientometrics, Springer;Akadémiai Kiadó, vol. 108(2), pages 745-776, August.
    13. Patrick Röhm, 2018. "Exploring the landscape of corporate venture capital: a systematic review of the entrepreneurial and finance literature," Management Review Quarterly, Springer, vol. 68(3), pages 279-319, August.
    14. Hric, Darko & Kaski, Kimmo & Kivelä, Mikko, 2018. "Stochastic block model reveals maps of citation patterns and their evolution in time," Journal of Informetrics, Elsevier, vol. 12(3), pages 757-783.
    15. Igors Skute, 2019. "Opening the black box of academic entrepreneurship: a bibliometric analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(1), pages 237-265, July.
    16. Igors Skute & Kasia Zalewska-Kurek & Isabella Hatak & Petra Weerd-Nederhof, 2019. "Mapping the field: a bibliometric analysis of the literature on university–industry collaborations," The Journal of Technology Transfer, Springer, vol. 44(3), pages 916-947, June.
    17. Kumar, Ajay & Singh, Shashank Sheshar & Singh, Kuldeep & Biswas, Bhaskar, 2020. "Link prediction techniques, applications, and performance: A survey," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 553(C).
    18. Luís Farinha & João Renato Sebastião & Carlos Sampaio & João Lopes, 2020. "Social innovation and social entrepreneurship: discovering origins, exploring current and future trends," International Review on Public and Nonprofit Marketing, Springer;International Association of Public and Non-Profit Marketing, vol. 17(1), pages 77-96, March.
    19. Secundo, Giustina & Ndou, Valentina & Vecchio, Pasquale Del & De Pascale, Gianluigi, 2020. "Sustainable development, intellectual capital and technology policies: A structured literature review and future research agenda," Technological Forecasting and Social Change, Elsevier, vol. 153(C).
    20. Nassiri, Isar & Masoudi-Nejad, Ali & Jalili, Mahdi & Moeini, Ali, 2013. "Normalized Similarity Index: An adjusted index to prioritize article citations," Journal of Informetrics, Elsevier, vol. 7(1), pages 91-98.
