
A two-stage deep learning-based system for patent citation recommendation

Author

Listed:
  • Jaewoong Choi

    (Konkuk University)

  • Jiho Lee

    (Konkuk University)

  • Janghyeok Yoon

    (Konkuk University)

  • Sion Jang

    (Netmarble AI Center)

  • Jaeyoung Kim

    (VUNO INC)

  • Sungchul Choi

    (Pukyong National University)

Abstract

The growing number of patents forces patent applicants and examiners to spend more time and money searching for and citing prior patents. Deep learning has performed well in recommending movies, music, products, and paper citations. However, despite many attempts to apply deep learning models to the patent domain, patent citation recommendation has received little attention. Because a patent citation is determined by a complex technological context, not merely by finding semantically similar earlier documents, it is necessary to understand the context in which the citation occurs. We therefore propose a dataset named PatentNet that captures technological citation context using textual information, metadata, and examiner citation information for about 110,000 patents. We also propose a strong benchmark model that considers both the similarity of patent text and the technological citation context, using cooperative patent classification (CPC) codes. The model has a two-stage structure: it first selects candidates based on textual information and pre-trained CPC embedding values, then re-ranks those candidates using a deep learning model trained with examiner citation information. The proposed model achieved a mean reciprocal rank (MRR) of 0.2506 on the benchmarking dataset, outperforming existing methods. The results show that learning the descriptive citation context, rather than simple text similarity, has an important influence on citation recommendation. The proposed model and dataset can help researchers understand technological citation context and help patent examiners and applicants find prior patents to cite effectively.
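The two-stage structure and the MRR metric described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the weighted cosine score in stage 1, the lookup table standing in for the trained re-ranking model in stage 2, and all names and toy vectors (`select_candidates`, `rerank`, `text_weight`, `P1` to `P3`) are assumptions made for illustration only.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def select_candidates(query, corpus, k, text_weight=0.5):
    """Stage 1 (assumed form): mix text similarity and CPC-embedding similarity,
    then keep the k best-scoring patents."""
    scored = []
    for pid, patent in corpus.items():
        score = (text_weight * cosine(query["text"], patent["text"])
                 + (1.0 - text_weight) * cosine(query["cpc"], patent["cpc"]))
        scored.append((score, pid))
    scored.sort(key=lambda item: (-item[0], item[1]))  # high score first, id breaks ties
    return [pid for _, pid in scored[:k]]


def rerank(query, candidates, score_fn):
    """Stage 2: reorder the candidates with a scoring function that stands in
    for a model trained on examiner citations."""
    return sorted(candidates, key=lambda pid: -score_fn(query, pid))


def reciprocal_rank(ranked, relevant):
    """1 / rank of the first relevant item, or 0.0 if none is retrieved."""
    for rank, pid in enumerate(ranked, start=1):
        if pid in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(rankings, relevants):
    """Average reciprocal rank over all queries."""
    pairs = list(zip(rankings, relevants))
    return sum(reciprocal_rank(r, rel) for r, rel in pairs) / len(pairs)


# Toy data: 2-dimensional "text" and "CPC" embeddings (hypothetical values).
corpus = {
    "P1": {"text": [1.0, 0.0], "cpc": [1.0, 0.0]},
    "P2": {"text": [1.0, 0.0], "cpc": [0.0, 1.0]},
    "P3": {"text": [0.0, 1.0], "cpc": [1.0, 0.0]},
}
query = {"text": [1.0, 0.0], "cpc": [0.0, 1.0]}

# Hypothetical stage-2 scores, as if produced by a trained re-ranker.
model_scores = {"P1": 0.9, "P2": 0.4, "P3": 0.1}

candidates = select_candidates(query, corpus, k=2)
reranked = rerank(query, candidates, lambda q, pid: model_scores[pid])
mrr = mean_reciprocal_rank([reranked, reranked], [{"P2"}, {"P1"}])
print(candidates, reranked, mrr)
```

Splitting the work this way keeps the cheap similarity search over the whole corpus separate from the more expensive learned scorer, which only has to look at the k surviving candidates. MRR then rewards placing the first correct citation as high in the final list as possible.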

Suggested Citation

  • Jaewoong Choi & Jiho Lee & Janghyeok Yoon & Sion Jang & Jaeyoung Kim & Sungchul Choi, 2022. "A two-stage deep learning-based system for patent citation recommendation," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6615-6636, November.
  • Handle: RePEc:spr:scient:v:127:y:2022:i:11:d:10.1007_s11192-022-04301-0
    DOI: 10.1007/s11192-022-04301-0

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-022-04301-0
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    Citations

    Cited by:

    1. Peng Liu & Liang Gui & Huirong Wang & Muhammad Riaz, 2022. "A Two-Stage Deep-Learning Model for Link Prediction Based on Network Structure and Node Attributes," Sustainability, MDPI, vol. 14(23), pages 1-15, December.
    2. Yi Zhang & Chengzhi Zhang & Philipp Mayr & Arho Suominen, 2022. "An editorial of “AI + informetrics”: multi-disciplinary interactions in the era of big data," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6503-6507, November.
    3. Percia David, Dimitri & Maréchal, Loïc & Lacube, William & Gillard, Sébastien & Tsesmelis, Michael & Maillart, Thomas & Mermoud, Alain, 2023. "Measuring security development in information technologies: A scientometric framework using arXiv e-prints," Technological Forecasting and Social Change, Elsevier, vol. 188(C).
    4. Shicheng Tan & Tao Zhang & Shu Zhao & Yanping Zhang, 2023. "Self-supervised scientific document recommendation based on contrastive learning," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(9), pages 5027-5049, September.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hain, Daniel S. & Jurowetzki, Roman & Buchmann, Tobias & Wolf, Patrick, 2022. "A text-embedding-based approach to measuring patent-to-patent technological similarity," Technological Forecasting and Social Change, Elsevier, vol. 177(C).
    2. Choi, Jaewoong & Yoon, Janghyeok, 2022. "Measuring knowledge exploration distance at the patent level: Application of network embedding and citation analysis," Journal of Informetrics, Elsevier, vol. 16(2).
    3. Chen, Liang & Xu, Shuo & Zhu, Lijun & Zhang, Jing & Yang, Guancan & Xu, Haiyun, 2022. "A deep learning based method benefiting from characteristics of patents for semantic relation classification," Journal of Informetrics, Elsevier, vol. 16(3).
    4. Juite Wang, 0000. "Analyzing and Predicting R&D Collaboration Networks in the Metaverse Industry," Proceedings of Economics and Finance Conferences 14716418, International Institute of Social and Economic Sciences.
    5. Choi, Seokkyu & Lee, Hyeonju & Park, Eunjeong & Choi, Sungchul, 2022. "Deep learning for patent landscaping using transformer and graph embedding," Technological Forecasting and Social Change, Elsevier, vol. 175(C).
    6. Adam B. Jaffe & Gaétan de Rassenfosse, 2017. "Patent citation data in social science research: Overview and best practices," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 68(6), pages 1360-1374, June.
    7. Teng, Hao & Wang, Nan & Zhao, Hongyu & Hu, Yingtong & Jin, Haitao, 2024. "Enhancing semantic text similarity with functional semantic knowledge (FOP) in patents," Journal of Informetrics, Elsevier, vol. 18(1).
    8. Inchae Park & Yujin Jeong & Byungun Yoon, 2017. "Analyzing the value of technology based on the differences of patent citations between applicants and examiners," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 665-691, May.
    9. Bryan, Kevin A. & Ozcan, Yasin & Sampat, Bhaven, 2020. "In-text patent citations: A user's guide," Research Policy, Elsevier, vol. 49(4).
    10. Kim, Juram & Hong, Suckwon & Kang, Yubin & Lee, Changyong, 2023. "Domain-specific valuation of university technologies using bibliometrics, Jonckheere–Terpstra tests, and data envelopment analysis," Technovation, Elsevier, vol. 122(C).
    11. Arousha Haghighian Roudsari & Jafar Afshar & Wookey Lee & Suan Lee, 2022. "PatentNet: multi-label classification of patent documents using deep learning based language understanding," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(1), pages 207-231, January.
    12. Ahmad Barirani & Bruno Agard & Catherine Beaudry, 2013. "Discovering and assessing fields of expertise in nanomedicine: a patent co-citation network perspective," Scientometrics, Springer;Akadémiai Kiadó, vol. 94(3), pages 1111-1136, March.
    13. Xu, Shuo & Hao, Liyuan & Yang, Guancan & Lu, Kun & An, Xin, 2021. "A topic models based framework for detecting and forecasting emerging technologies," Technological Forecasting and Social Change, Elsevier, vol. 162(C).
    14. Shugang Li & Ziyi Li & Yixin Tang & Wenjing Zhao & Xiaoqi Kang & Lingling Zheng & Zhaoxu Yu, 2024. "Pioneering Technology Mining Research for New Technology Strategic Planning," Sustainability, MDPI, vol. 16(15), pages 1-26, August.
    15. Yuandi Wang & Xiongfeng Pan & Yantai Chen & Xin Gu, 2013. "Do references in transferred patent documents signal learning opportunities for the receiving firms?," Scientometrics, Springer;Akadémiai Kiadó, vol. 95(2), pages 731-752, May.
    16. Yuan Zhou & Fang Dong & Yufei Liu & Liang Ran, 2021. "A deep learning framework to early identify emerging technologies in large-scale outlier patents: an empirical study of CNC machine tool," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 969-994, February.
    17. Gamal Atallah & Gabriel Rodríguez, 2006. "Indirect patent citations," Scientometrics, Springer;Akadémiai Kiadó, vol. 67(3), pages 437-465, June.
    18. Mafini Dosso & Didier Lebert, 2019. "A geography of corporate knowledge flows across world regions: evidence from patent citations of top R&D-investing firms," JRC Working Papers on Corporate R&D and Innovation 2019-03, Joint Research Centre.
    19. Criscuolo, P. & Verspagen, B., 2005. "Does it matter where patent citations come from? Inventor versus examiner citations in European patents," Working Papers 05.06, Eindhoven Center for Innovation Studies.
    20. Martin Meyer, 2002. "Tracing knowledge flows in innovation systems," Scientometrics, Springer;Akadémiai Kiadó, vol. 54(2), pages 193-212, June.
