Doc2vec-based link prediction approach using SAO structures: application to patent network

Authors

  • Byungun Yoon

    (Dongguk University)

  • Songhee Kim

    (Dongguk University)

  • Sunhye Kim

    (Dongguk University)

  • Hyeonju Seol

    (Chungnam National University)

Abstract

As the number of documents has exploded in the Internet era, many researchers have tried to understand the relationships between documents and to predict links between similar but unconnected documents. However, existing link prediction techniques that rely on the predefined links of documents might produce incorrect results because of the generic problems of citation analysis. Moreover, they may fail to reflect the important content of documents in the link prediction process. Thus, we propose a new link prediction approach that employs the Doc2vec algorithm, a document-embedding method, to predict potential links between documents by reflecting the functional context of technological words. First, we collected both the citation information and the documents of the patents of interest, and generated a patent network from the citation relationships between patents. Second, we identified the unconnected pairs of nodes and transformed the patent documents into document vectors based on the Doc2vec algorithm. In particular, since patent documents describe useful functions for solving technological problems, the proposed approach extracts subject-action-object (SAO) structures, which we used to generate the document vectors. We then calculated the similarity between the patents in each unconnected pair of the patent network and used that similarity to predict potential links. Third, we validated the results of the proposed approach by comparing them with those of the Adamic–Adar technique, one of the traditional link prediction techniques, and of word vector-based link prediction. We applied the Doc2vec-based link prediction approach to a real case, the unmanned aerial vehicle (UAV) technology field. We found that the proposed approach achieves better prediction performance than the Adamic–Adar technique and the word vector approach. Our results can help analysts accurately forecast future relationships between nodes in a network, and give R&D managers insightful information on the future direction of technological development by using a patent network.
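
The two scoring ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the patent IDs, the citation edges, and the three-dimensional vectors are made up, the vectors merely stand in for Doc2vec output trained on SAO text, and cosine similarity is an assumption, since the abstract says only "similarity". The Adamic–Adar score of a node pair is the sum of 1/log(degree) over the pair's common neighbours.

```python
import math
from itertools import combinations

# Toy undirected citation network (hypothetical patent IDs, not from the paper).
edges = [("P1", "P2"), ("P1", "P3"), ("P2", "P3"), ("P3", "P4"), ("P4", "P5")]

# Hypothetical vectors standing in for Doc2vec embeddings of SAO structures.
doc_vectors = {
    "P1": [0.9, 0.1, 0.0],
    "P2": [0.8, 0.2, 0.1],
    "P3": [0.5, 0.5, 0.2],
    "P4": [0.1, 0.9, 0.3],
    "P5": [0.9, 0.1, 0.05],
}


def build_adjacency(edge_list):
    """Map each node to the set of its neighbours."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def unconnected_pairs(adj):
    """All node pairs that share no edge: the candidate links."""
    return [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]


def adamic_adar(adj, u, v):
    """Sum of 1/log(degree) over common neighbours of u and v.

    A common neighbour has degree >= 2, so the logarithm is positive.
    """
    return sum(1.0 / math.log(len(adj[z])) for z in adj[u] & adj[v])


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


adj = build_adjacency(edges)
pairs = unconnected_pairs(adj)

# Structure-based baseline and content-based score for every candidate link.
aa_scores = {p: adamic_adar(adj, *p) for p in pairs}
cos_scores = {p: cosine(doc_vectors[p[0]], doc_vectors[p[1]]) for p in pairs}

# Candidate links ordered from most to least similar in content.
ranked_by_similarity = sorted(pairs, key=cos_scores.get, reverse=True)

for pair in ranked_by_similarity:
    print(pair, f"cosine={cos_scores[pair]:.3f}", f"adamic_adar={aa_scores[pair]:.3f}")
```

In this toy data the pair ranked first by content similarity, P1 and P5, has no common neighbour and therefore an Adamic–Adar score of zero. That is a property of the invented inputs, shown only to make the difference between the two kinds of score concrete; it says nothing about the paper's reported results.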

Suggested Citation

  • Byungun Yoon & Songhee Kim & Sunhye Kim & Hyeonju Seol, 2022. "Doc2vec-based link prediction approach using SAO structures: application to patent network," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(9), pages 5385-5414, September.
  • Handle: RePEc:spr:scient:v:127:y:2022:i:9:d:10.1007_s11192-021-04187-4
    DOI: 10.1007/s11192-021-04187-4

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-021-04187-4
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Citations

    Cited by:

    1. Shicheng Tan & Tao Zhang & Shu Zhao & Yanping Zhang, 2023. "Self-supervised scientific document recommendation based on contrastive learning," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(9), pages 5027-5049, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Liu, Zhenfeng & Feng, Jian & Uden, Lorna, 2023. "Technology opportunity analysis using hierarchical semantic networks and dual link prediction," Technovation, Elsevier, vol. 128(C).
    2. Huang, Lu & Chen, Xiang & Ni, Xingxing & Liu, Jiarun & Cao, Xiaoli & Wang, Changtian, 2021. "Tracking the dynamics of co-word networks for emerging topic identification," Technological Forecasting and Social Change, Elsevier, vol. 170(C).
    3. Lu Huang & Xiang Chen & Yi Zhang & Yihe Zhu & Suyi Li & Xingxing Ni, 2021. "Dynamic network analytics for recommending scientific collaborators," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(11), pages 8789-8814, November.
    4. Leto Peel & Tiago P. Peixoto & Manlio De Domenico, 2022. "Statistical inference links data and theory in network science," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
    5. Rafiee, Samira & Salavati, Chiman & Abdollahpouri, Alireza, 2020. "CNDP: Link prediction based on common neighbors degree penalization," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 539(C).
    6. Lee, Yan-Li & Zhou, Tao, 2021. "Collaborative filtering approach to link prediction," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 578(C).
    7. Shugang Li & Ziming Wang & Beiyan Zhang & Boyi Zhu & Zhifang Wen & Zhaoxu Yu, 2022. "The Research of “Products Rapidly Attracting Users” Based on the Fully Integrated Link Prediction Algorithm," Mathematics, MDPI, vol. 10(14), pages 1-19, July.
    8. Kai Yang & Yuan Liu & Zijuan Zhao & Xingxing Zhou & Peijin Ding, 2023. "Graph attention network via node similarity for link prediction," The European Physical Journal B: Condensed Matter and Complex Systems, Springer;EDP Sciences, vol. 96(3), pages 1-10, March.
    9. Xie, Qing & Zhang, Xinyuan & Song, Min, 2021. "A network embedding-based scholar assessment indicator considering four facets: Research topic, author credit allocation, field-normalized journal impact, and published time," Journal of Informetrics, Elsevier, vol. 15(4).
    10. Mungo, Luca & Lafond, François & Astudillo-Estévez, Pablo & Farmer, J. Doyne, 2023. "Reconstructing production networks using machine learning," Journal of Economic Dynamics and Control, Elsevier, vol. 148(C).
    11. Yu, Jiating & Wu, Ling-Yun, 2022. "Multiple Order Local Information model for link prediction in complex networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 600(C).
    12. Wang, Feifei & Dong, Jiaxin & Lu, Wanzhao & Xu, Shuo, 2023. "Collaboration prediction based on multilayer all-author tripartite citation networks: A case study of gene editing," Journal of Informetrics, Elsevier, vol. 17(1).
    13. Guan-Nan Wang & Hui Gao & Lian Chen & Dennis N A Mensah & Yan Fu, 2015. "Predicting Positive and Negative Relationships in Large Social Networks," PLOS ONE, Public Library of Science, vol. 10(6), pages 1-14, June.
    14. Guilherme T Valente & Marcio L Acencio & Cesar Martins & Ney Lemke, 2013. "The Development of a Universal In Silico Predictor of Protein-Protein Interactions," PLOS ONE, Public Library of Science, vol. 8(5), pages 1-11, May.
    15. Najari, Shaghayegh & Salehi, Mostafa & Ranjbar, Vahid & Jalili, Mahdi, 2019. "Link prediction in multiplex networks based on interlayer similarity," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 536(C).
    16. Zhang, Ting & Zhang, Kun & Li, Xun & Lv, Laishui & Sun, Qi, 2021. "Semi-supervised link prediction based on non-negative matrix factorization for temporal networks," Chaos, Solitons & Fractals, Elsevier, vol. 145(C).
    17. Zhou, Tao, 2023. "Discriminating abilities of threshold-free evaluation metrics in link prediction," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 615(C).
    18. Tad Dallas & Andrew W Park & John M Drake, 2017. "Predicting cryptic links in host-parasite networks," PLOS Computational Biology, Public Library of Science, vol. 13(5), pages 1-15, May.
    19. Ding, Rui & Ujang, Norsidah & Hamid, Hussain bin & Manan, Mohd Shahrudin Abd & He, Yuou & Li, Rong & Wu, Jianjun, 2018. "Detecting the urban traffic network structure dynamics through the growth and analysis of multi-layer networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 503(C), pages 800-817.
    20. Chunning Wang & Fengqin Tang & Xuejing Zhao, 2023. "LPGRI: A Global Relevance-Based Link Prediction Approach for Multiplex Networks," Mathematics, MDPI, vol. 11(14), pages 1-15, July.
