Printed from https://ideas.repec.org/a/spr/scient/v123y2020i1d10.1007_s11192-020-03351-6.html

Forecasting emerging technologies using data augmentation and deep learning

Author

Listed:
  • Yuan Zhou

    (Tsinghua University)

  • Fang Dong

    (Tsinghua University)

  • Yufei Liu

    (Tsinghua University; Chinese Academy of Engineering)

  • Zhaofu Li

    (Huazhong University of Science and Technology)

  • JunFei Du

    (Huazhong University of Science and Technology)

  • Li Zhang

    (Huazhong University of Science and Technology)

Abstract

Deep learning can be used to forecast emerging technologies from patent data. However, it requires a large amount of labeled patent data as a training set, which is difficult to obtain due to various constraints. This study proposes a novel approach that integrates data augmentation with deep learning to overcome the shortage of training samples when deep learning is applied to forecasting emerging technologies. First, a sample data set was constructed using Gartner’s hype cycle and multiple patent features. Second, a generative adversarial network was used to generate many synthetic samples (data augmentation) and thereby expand the sample data set. Finally, a deep neural network classifier was trained on the augmented data set to forecast emerging technologies; it could predict up to 77% of the emerging technologies in a given year with high precision. The approach was then used to forecast the emerging technologies in Gartner’s hype cycles for 2017 based on patent data from 2000 to 2016. Four of the six emerging technologies were forecast correctly, which supports the accuracy and precision of the proposed approach. This approach enables deep learning to forecast emerging technologies with limited training samples.
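
The two headline figures in the abstract (up to 77% of a year's emerging technologies predicted, and four of six forecast correctly for 2017) read most naturally as recall-style measures, although the abstract does not define them formally. The sketch below is a minimal illustration of how such figures could be computed from a set of forecasts and a reference list. The function name `recall_and_precision` and the `tech_*` labels are hypothetical placeholders, not taken from the paper, and the single false positive (`tech_x`) is invented purely so that precision differs from recall.

```python
def recall_and_precision(predicted, actual):
    """Return (recall, precision) of a set of forecasts against a reference set."""
    predicted, actual = set(predicted), set(actual)
    hits = predicted & actual  # technologies that were both forecast and listed
    recall = len(hits) / len(actual) if actual else 0.0
    precision = len(hits) / len(predicted) if predicted else 0.0
    return recall, precision


# Hypothetical placeholder labels: six reference technologies for the target year.
actual_2017 = {"tech_a", "tech_b", "tech_c", "tech_d", "tech_e", "tech_f"}

# Assumed forecast: four correct hits plus one invented false positive.
forecast_2017 = {"tech_a", "tech_b", "tech_c", "tech_d", "tech_x"}

recall, precision = recall_and_precision(forecast_2017, actual_2017)
print(f"recall={recall:.3f}, precision={precision:.3f}")
```

With these placeholder sets the script prints `recall=0.667, precision=0.800`; the recall value corresponds to the "four of six" result, while the precision value depends entirely on the assumed false positive.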

Suggested Citation

  • Yuan Zhou & Fang Dong & Yufei Liu & Zhaofu Li & JunFei Du & Li Zhang, 2020. "Forecasting emerging technologies using data augmentation and deep learning," Scientometrics, Springer;Akadémiai Kiadó, vol. 123(1), pages 1-29, April.
  • Handle: RePEc:spr:scient:v:123:y:2020:i:1:d:10.1007_s11192-020-03351-6
    DOI: 10.1007/s11192-020-03351-6

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-020-03351-6
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    References listed on IDEAS

    1. Joshua Lerner, 1994. "The Importance of Patent Scope: An Empirical Analysis," RAND Journal of Economics, The RAND Corporation, vol. 25(2), pages 319-333, Summer.
    2. Seung-Pyo Jun, 2012. "An empirical study of users’ hype cycle based on search traffic: the case study on hybrid cars," Scientometrics, Springer;Akadémiai Kiadó, vol. 91(1), pages 81-99, April.
    3. Yuan Zhou & Heng Lin & Yufei Liu & Wei Ding, 2019. "A novel method to identify emerging technologies using a semi-supervised topic clustering model: a case of 3D printing industry," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(1), pages 167-185, July.
    4. Kyebambe, Moses Ntanda & Cheng, Ge & Huang, Yunqing & He, Chunhui & Zhang, Zhenyu, 2017. "Forecasting emerging technologies: A supervised learning approach through patent analysis," Technological Forecasting and Social Change, Elsevier, vol. 125(C), pages 236-244.
    5. Lee, Changyong & Kwon, Ohjin & Kim, Myeongjung & Kwon, Daeil, 2018. "Early identification of emerging technologies: A machine learning approach using multiple patent indicators," Technological Forecasting and Social Change, Elsevier, vol. 127(C), pages 291-303.
    6. Saeed-Ul Hassan & Mubashir Imran & Sehrish Iqbal & Naif Radi Aljohani & Raheel Nawaz, 2018. "Deep context of citations using machine-learning models in scholarly full-text articles," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(3), pages 1645-1662, December.
    7. Anthony Breitzman & Patrick Thomas, 2015. "Inventor team size as a predictor of the future citation impact of patents," Scientometrics, Springer;Akadémiai Kiadó, vol. 103(2), pages 631-647, May.
    8. Bronwyn H. Hall & Christian Helmers & Mark Rogers & Vania Sena, 2013. "The importance (or not) of patents to UK firms," Oxford Economic Papers, Oxford University Press, vol. 65(3), pages 603-629, July.
    9. Hall, Bronwyn H. & Helmers, Christian, 2013. "Innovation and diffusion of clean/green technology: Can patent commons help?," Journal of Environmental Economics and Management, Elsevier, vol. 66(1), pages 33-51.
    10. Zhang, Yi & Lu, Jie & Liu, Feng & Liu, Qian & Porter, Alan & Chen, Hongshu & Zhang, Guangquan, 2018. "Does deep learning help topic extraction? A kernel k-means clustering method with word embedding," Journal of Informetrics, Elsevier, vol. 12(4), pages 1099-1117.
    11. Harhoff, Dietmar & Scherer, Frederic M. & Vopel, Katrin, 2003. "Citations, family size, opposition and the value of patent rights," Research Policy, Elsevier, vol. 32(8), pages 1343-1363, September.
    12. Kreuchauff, Florian & Korzinov, Vladimir, 2015. "A patent search strategy based on machine learning for the emerging field of service robotics," Working Paper Series in Economics 71, Karlsruhe Institute of Technology (KIT), Department of Economics and Management.
    13. Shaobo Li & Jie Hu & Yuxin Cui & Jianjun Hu, 2018. "DeepPatent: patent classification with convolutional neural networks and word embedding," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(2), pages 721-744, November.
    14. Pao-Long Chang & Chao-Chan Wu & Hoang-Jyh Leu, 2010. "Using patent analyses to monitor the technological trends in an emerging field of technology: a case of carbon nanotube field emission display," Scientometrics, Springer;Akadémiai Kiadó, vol. 82(1), pages 5-19, January.
    15. Connie K N Chang & Anthony Breitzman, 2009. "Using patents prospectively to identify emerging, high-impact technological clusters," Research Evaluation, Oxford University Press, vol. 18(5), pages 357-364, December.
    16. Lawrence D. Fu & Constantin F. Aliferis, 2010. "Using content-based and bibliometric features for machine learning models to predict citation counts in the biomedical literature," Scientometrics, Springer;Akadémiai Kiadó, vol. 85(1), pages 257-270, October.
    17. Breitzman, Anthony & Thomas, Patrick, 2015. "The Emerging Clusters Model: A tool for identifying emerging technologies across multiple patent systems," Research Policy, Elsevier, vol. 44(1), pages 195-205.
    18. Kong, Dejing & Zhou, Yuan & Liu, Yufei & Xue, Lan, 2017. "Using the data mining method to assess the innovation gap: A case of industrial robotics in a catching-up country," Technological Forecasting and Social Change, Elsevier, vol. 119(C), pages 80-97.
    19. Jean O. Lanjouw & Mark Schankerman, 2004. "Patent Quality and Research Productivity: Measuring Innovation with Multiple Indicators," Economic Journal, Royal Economic Society, vol. 114(495), pages 441-465, April.

    Citations


    Cited by:

    1. Lijie Feng & Kehui Liu & Jinfeng Wang & Kuo-Yi Lin & Ke Zhang & Luyao Zhang, 2022. "Identifying Promising Technologies of Electric Vehicles from the Perspective of Market and Technical Attributes," Energies, MDPI, vol. 15(20), pages 1-22, October.
    2. Yuan Zhou & Fang Dong & Yufei Liu & Liang Ran, 2021. "A deep learning framework to early identify emerging technologies in large-scale outlier patents: an empirical study of CNC machine tool," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 969-994, February.
    3. Myoungjae Choi & Ohjin Kwon & Dongkyu Won & Wooseok Jang, 2021. "Identifying the Policy Direction of National R&D Programs Based on Data Envelopment Analysis and Diversity Index Approach," Sustainability, MDPI, vol. 13(22), pages 1-17, November.
    4. Huailan Liu & Zhiwang Chen & Jie Tang & Yuan Zhou & Sheng Liu, 2020. "Mapping the technology evolution path: a novel model for dynamic topic detection and tracking," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 2043-2090, December.
    5. Dejing Kong & Jianzhong Yang & Lingfeng Li, 2020. "Early identification of technological convergence in numerical control machine tool: a deep learning approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 1983-2009, December.
    6. Zhai, Dongsheng & Zhai, Liang & Li, Mengyang & He, Xijun & Xu, Shuo & Wang, Feifei, 2022. "Patent representation learning with a novel design of patent ontology: Case study on PEM patents," Technological Forecasting and Social Change, Elsevier, vol. 183(C).
    7. Alvaro Figueira & Bruno Vaz, 2022. "Survey on Synthetic Data Generation, Evaluation Methods and GANs," Mathematics, MDPI, vol. 10(15), pages 1-41, August.
    8. Hain, Daniel S. & Jurowetzki, Roman & Buchmann, Tobias & Wolf, Patrick, 2022. "A text-embedding-based approach to measuring patent-to-patent technological similarity," Technological Forecasting and Social Change, Elsevier, vol. 177(C).
    9. Liang Chen & Shuo Xu & Lijun Zhu & Jing Zhang & Xiaoping Lei & Guancan Yang, 2020. "A deep learning based method for extracting semantic information from patent documents," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(1), pages 289-312, October.
    10. Zamani, Mehdi & Yalcin, Haydar & Naeini, Ali Bonyadi & Zeba, Gordana & Daim, Tugrul U, 2022. "Developing metrics for emerging technologies: identification and assessment," Technological Forecasting and Social Change, Elsevier, vol. 176(C).
    11. Christian Ulrich & Benjamin Frieske & Stephan A. Schmid & Horst E. Friedrich, 2022. "Monitoring and Forecasting of Key Functions and Technologies for Automated Driving," Forecasting, MDPI, vol. 4(2), pages 1-24, May.
    12. Zhenyu Yang & Wenyu Zhang & Zhimin Wang & Xiaoling Huang, 2024. "A deep learning-based method for predicting the emerging degree of research topics using emerging index," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(7), pages 4021-4042, July.
    13. Yingyi Zhang & Chengzhi Zhang, 2024. "Extracting problem and method sentence from scientific papers: a context-enhanced transformer using formulaic expression desensitization," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(6), pages 3433-3468, June.
    14. Guannan Xu & Weijie Hu & Yuanyuan Qiao & Yuan Zhou, 2020. "Mapping an innovation ecosystem using network clustering and community identification: a multi-layered framework," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(3), pages 2057-2081, September.
    15. June Young Lee & Sejung Ahn & Dohyun Kim, 2021. "Deep learning-based prediction of future growth potential of technologies," PLOS ONE, Public Library of Science, vol. 16(6), pages 1-16, June.
    16. Yunlei Lin & Yuan Zhou, 2023. "Identification of Hydrogen-Energy-Related Emerging Technologies Based on Text Mining," Sustainability, MDPI, vol. 16(1), pages 1-19, December.
    17. Puccetti, Giovanni & Giordano, Vito & Spada, Irene & Chiarello, Filippo & Fantoni, Gualtiero, 2023. "Technology identification from patent texts: A novel named entity recognition method," Technological Forecasting and Social Change, Elsevier, vol. 186(PB).
    18. Ryosuke L. Ohniwa & Kunio Takeyasu & Aiko Hibino, 2022. "Researcher dynamics in the generation of emerging topics in life sciences and medicine," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(2), pages 871-884, February.
    19. Li Yao & He Ni, 2023. "Prediction of patent grant and interpreting the key determinants: an application of interpretable machine learning approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(9), pages 4933-4969, September.
    20. Gozuacik, Necip & Sakar, C. Okan & Ozcan, Sercan, 2023. "Technological forecasting based on estimation of word embedding matrix using LSTM networks," Technological Forecasting and Social Change, Elsevier, vol. 191(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yuan Zhou & Fang Dong & Yufei Liu & Liang Ran, 2021. "A deep learning framework to early identify emerging technologies in large-scale outlier patents: an empirical study of CNC machine tool," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 969-994, February.
    2. Uijun Kwon & Youngjung Geum, 2020. "Identification of promising inventions considering the quality of knowledge accumulation: a machine learning approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 1877-1897, December.
    3. Youngjae Choi & Sanghyun Park & Sungjoo Lee, 2021. "Identifying emerging technologies to envision a future innovation ecosystem: A machine learning approach to patent data," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 5431-5476, July.
    4. Serkan Altuntas & Zulfiye Erdogan & Turkay Dereli, 2020. "A clustering-based approach for the evaluation of candidate emerging technologies," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(2), pages 1157-1177, August.
    5. Ahmad Barirani & Bruno Agard & Catherine Beaudry, 2013. "Discovering and assessing fields of expertise in nanomedicine: a patent co-citation network perspective," Scientometrics, Springer;Akadémiai Kiadó, vol. 94(3), pages 1111-1136, March.
    6. Antoine Dechezlepretre & Elias Einio & Ralf Martin & Kieu-Trang Nguyen & John Van Reenen, 2016. "Do tax incentives for research increase firm innovation? An RD design for R&D," GRI Working Papers 230, Grantham Research Institute on Climate Change and the Environment.
    7. Wooseok Jang & Yongtae Park & Hyeonju Seol, 2021. "Identifying emerging technologies using expert opinions on the future: A topic modeling and fuzzy clustering approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(8), pages 6505-6532, August.
    8. Chung, Park & Sohn, So Young, 2020. "Early detection of valuable patents using a deep learning model: Case of semiconductor industry," Technological Forecasting and Social Change, Elsevier, vol. 158(C).
    9. Choi, Jaewoong & Yoon, Janghyeok, 2022. "Measuring knowledge exploration distance at the patent level: Application of network embedding and citation analysis," Journal of Informetrics, Elsevier, vol. 16(2).
    10. Manuel Acosta & Daniel Coronado & Esther Ferrándiz & Manuel Jiménez, 2022. "Effects of knowledge spillovers between competitors on patent quality: what patent citations reveal about a global duopoly," The Journal of Technology Transfer, Springer, vol. 47(5), pages 1451-1487, October.
    11. Bernhard Ganglmair & Imke Reimers, 2019. "Visibility of Technology and Cumulative Innovation: Evidence from Trade Secrets Laws," CRC TR 224 Discussion Paper Series crctr224_2019_119v1, University of Bonn and University of Mannheim, Germany.
    12. Yun, Siyeong & Song, Kisik & Kim, Chulhyun & Lee, Sungjoo, 2021. "From stones to jewellery: Investigating technology opportunities from expired patents," Technovation, Elsevier, vol. 103(C).
    13. RAITERI Emilio, 2015. "A time to nourish? Evaluating the impact of innovative public procurement on technological generality through patent data," Cahiers du GREThA (2007-2019) 2015-05, Groupe de Recherche en Economie Théorique et Appliquée (GREThA).
    14. Niron Hashai & Sarit Markovich, 2017. "Market Entry by High Technology Startups: The Effect of Competition Level and Startup Innovativeness," Strategy Science, INFORMS, vol. 2(3), pages 141-160, September.
    15. Jyun-Cheng Wang & Cheng-hsin Chiang & Shu-Wei Lin, 2010. "Network structure of innovation: can brokerage or closure predict patent quality?," Scientometrics, Springer;Akadémiai Kiadó, vol. 84(3), pages 735-748, September.
    16. Song, Kisik & Kim, Kyuwoong & Lee, Sungjoo, 2018. "Identifying promising technologies using patents: A retrospective feature analysis and a prospective needs analysis on outlier patents," Technological Forecasting and Social Change, Elsevier, vol. 128(C), pages 118-132.
    17. Raiteri, Emilio, 2018. "A time to nourish? Evaluating the impact of public procurement on technological generality through patent data," Research Policy, Elsevier, vol. 47(5), pages 936-952.
    18. Nicolas van Zeebroeck & Bruno van Pottelsberghe de la Potterie, 2011. "Filing strategies and patent value," Economics of Innovation and New Technology, Taylor & Francis Journals, vol. 20(6), pages 539-561, February.
    19. Jungpyo Lee & So Young Sohn, 2017. "What makes the first forward citation of a patent occur earlier?," Scientometrics, Springer;Akadémiai Kiadó, vol. 113(1), pages 279-298, October.
    20. Nicolas van Zeebroeck, 2011. "The puzzle of patent value indicators," Economics of Innovation and New Technology, Taylor & Francis Journals, vol. 20(1), pages 33-62.
