Printed from https://ideas.repec.org/a/eee/infome/v16y2022i1s1751157721001061.html

Utilizing citation network structure to predict paper citation counts: A Deep learning approach

Author

Listed:
  • Zhao, Qihang
  • Feng, Xiaodong

Abstract

With the advancement of science and technology, the number of academic papers published each year has grown almost exponentially. This growth reflects the prosperity of science and technology, but it also gives rise to problems. Academic papers are the most direct embodiment of scholars' research results and reflect the level of the researchers who wrote them. They also serve as a standard for evaluating researchers and for decisions about them, such as promotion and the allocation of funds. Measuring the quality of an academic paper is therefore critical. The most common measure of a paper's quality is its citation count, an indicator that is widely used in the evaluation of scientific publications and that serves as the basis for many other indicators (such as the h-index). It is therefore very important to be able to predict the citation counts of academic papers accurately. To improve the effectiveness of citation count prediction, we approach the problem from the perspective of information cascade prediction and take advantage of deep learning techniques. We propose an end-to-end deep learning framework (DeepCCP) consisting of graph structure representation and recurrent neural network modules. DeepCCP takes the citation network formed in a paper's early stage directly as its input and outputs the paper's citation count after a period of time. It exploits only the structural and temporal information of the citation network and requires no additional information. In experiments on two real academic citation datasets, DeepCCP is shown to be superior to state-of-the-art methods in the accuracy of citation count prediction.
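
The abstract says the model's input is the citation network formed in a paper's early stage, using only structure and time. The following is a minimal sketch of how such an input sequence might be assembled, not the authors' preprocessing. The function name `early_cascade_snapshots`, the `(citing, cited, year)` edge format, and the choice of node and edge counts as per-year features are all assumptions made for illustration; the abstract does not specify any of them.

```python
from collections import defaultdict


def early_cascade_snapshots(edges, target, pub_year, window):
    """Return one (node_count, edge_count) pair per year of the window.

    Illustrative only; this is not DeepCCP's actual preprocessing.
    edges: iterable of (citing, cited, year) triples (hypothetical format).
    A paper joins the cascade in the year it first cites the target or
    any paper that is already in the cascade.
    """
    by_year = defaultdict(list)
    for citing, cited, year in edges:
        by_year[year].append((citing, cited))

    nodes = {target}
    kept_edges = set()
    snapshots = []
    for year in range(pub_year, pub_year + window):
        # Repeat until nothing new joins, so citation chains that form
        # within a single year are captured regardless of edge order.
        changed = True
        while changed:
            changed = False
            for citing, cited in by_year.get(year, []):
                if cited in nodes and (citing, cited) not in kept_edges:
                    kept_edges.add((citing, cited))
                    nodes.add(citing)
                    changed = True
        # Cumulative snapshot; the target itself is not counted as a citer.
        snapshots.append((len(nodes) - 1, len(kept_edges)))
    return snapshots


example_edges = [
    ("A", "P", 2000),  # A cites the target P
    ("B", "P", 2001),
    ("B", "A", 2001),  # B also cites A, which is already in the cascade
    ("C", "B", 2002),  # indirect citation of P through B
    ("D", "X", 2001),  # unrelated citation, ignored
    ("E", "P", 2005),  # outside the 3-year observation window
]
result = early_cascade_snapshots(example_edges, "P", 2000, 3)
print(result)  # [(1, 1), (2, 3), (3, 4)]
```

Each tuple is one time step. A recurrent model of the kind the abstract describes would consume such a sequence, presumably with a learned graph representation in place of these two hand-picked counts.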

Suggested Citation

  • Zhao, Qihang & Feng, Xiaodong, 2022. "Utilizing citation network structure to predict paper citation counts: A Deep learning approach," Journal of Informetrics, Elsevier, vol. 16(1).
  • Handle: RePEc:eee:infome:v:16:y:2022:i:1:s1751157721001061
    DOI: 10.1016/j.joi.2021.101235

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1751157721001061
    Download Restriction: Full text for ScienceDirect subscribers only



    Citations



    Cited by:

    1. Trappey, Amy J.C. & Wei, Ann Y.E. & Chen, Neil K.T. & Li, Kuo-An & Hung, L.P. & Trappey, Charles V., 2023. "Patent landscape and key technology interaction roadmap using graph convolutional network – Case of mobile communication technologies beyond 5G," Journal of Informetrics, Elsevier, vol. 17(1).
    2. Kumar, Dhananjay & Bhowmick, Plaban Kumar & Paik, Jiaul H, 2023. "Researcher influence prediction (ResIP) using academic genealogy network," Journal of Informetrics, Elsevier, vol. 17(2).
    3. Jianhua Hou & Xiucai Yang & Yang Zhang, 2023. "The effect of social media knowledge cascade: an analysis of scientific papers diffusion," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(9), pages 5169-5195, September.
    4. Chi, Yuxue & Tang, Xianyi & Liu, Yijun, 2022. "Exploring the “awakening effect” in knowledge diffusion: a case study of publications in the library and information science domain," Journal of Informetrics, Elsevier, vol. 16(4).
    5. Chen, Ying & Koch, Thorsten & Zakiyeva, Nazgul & Liu, Kailiang & Xu, Zhitong & Chen, Chun-houh & Nakano, Junji & Honda, Keisuke, 2023. "Article’s scientific prestige: Measuring the impact of individual articles in the web of science," Journal of Informetrics, Elsevier, vol. 17(1).
    6. Jiang, Hongxun & Fan, Shaokun & Zhang, Nan & Zhu, Bin, 2023. "Deep learning for predicting patent application outcome: The fusion of text and network embeddings," Journal of Informetrics, Elsevier, vol. 17(2).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Wanjun Xia & Tianrui Li & Chongshou Li, 2023. "A review of scientific impact prediction: tasks, features and methods," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(1), pages 543-585, January.
    2. Li, Xin & Ma, Xiaodi & Feng, Ye, 2024. "Early identification of breakthrough research from sleeping beauties using machine learning," Journal of Informetrics, Elsevier, vol. 18(2).
    3. Shengzhi Huang & Jiajia Qian & Yong Huang & Wei Lu & Yi Bu & Jinqing Yang & Qikai Cheng, 2022. "Disclosing the relationship between citation structure and future impact of a publication," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 73(7), pages 1025-1042, July.
    4. Ruan, Xuanmin & Zhu, Yuanyang & Li, Jiang & Cheng, Ying, 2020. "Predicting the citation counts of individual papers via a BP neural network," Journal of Informetrics, Elsevier, vol. 14(3).
    5. Bin Wang & Feng Wu & Lukui Shi, 2023. "AGSTA-NET: adaptive graph spatiotemporal attention network for citation count prediction," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(1), pages 511-541, January.
    6. Anqi Ma & Yu Liu & Xiujuan Xu & Tao Dong, 2021. "A deep-learning based citation count prediction model with paper metadata semantic features," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(8), pages 6803-6823, August.
    7. Martorell Cunil, Onofre & Otero González, Luis & Durán Santomil, Pablo & Mulet Forteza, Carlos, 2023. "How to accomplish a highly cited paper in the tourism, leisure and hospitality field," Journal of Business Research, Elsevier, vol. 157(C).
    8. Hu, Ya-Han & Tai, Chun-Tien & Liu, Kang Ernest & Cai, Cheng-Fang, 2020. "Identification of highly-cited papers using topic-model-based and bibliometric features: the consideration of keyword popularity," Journal of Informetrics, Elsevier, vol. 14(1).
    9. Kumar, Dhananjay & Bhowmick, Plaban Kumar & Paik, Jiaul H, 2023. "Researcher influence prediction (ResIP) using academic genealogy network," Journal of Informetrics, Elsevier, vol. 17(2).
    10. Zhang, Xinyuan & Xie, Qing & Song, Min, 2021. "Measuring the impact of novelty, bibliometric, and academic-network factors on citation count using a neural network," Journal of Informetrics, Elsevier, vol. 15(2).
    11. Wumei Du & Zheng Xie & Yiqin Lv, 2021. "Predicting publication productivity for authors: Shallow or deep architecture?," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 5855-5879, July.
    12. Yuhao Zhou & Ruijie Wang & An Zeng, 2022. "Predicting the impact and publication date of individual scientists’ future papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(4), pages 1867-1882, April.
    13. Kehan Wang & Wenxuan Shi & Junsong Bai & Xiaoping Zhao & Liying Zhang, 2021. "Prediction and application of article potential citations based on nonlinear citation-forecasting combined model," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(8), pages 6533-6550, August.
    14. Akella, Akhil Pandey & Alhoori, Hamed & Kondamudi, Pavan Ravikanth & Freeman, Cole & Zhou, Haiming, 2021. "Early indicators of scientific impact: Predicting citations with altmetrics," Journal of Informetrics, Elsevier, vol. 15(2).
    15. Mingyue Sun & Tingcan Ma & Lewei Zhou & Mingliang Yue, 2023. "Analysis of the relationships among paper citation and its influencing factors: a Bayesian network-based approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(5), pages 3017-3033, May.
    16. Adilson Vital & Diego R. Amancio, 2022. "A comparative analysis of local similarity metrics and machine learning approaches: application to link prediction in author citation networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(10), pages 6011-6028, October.
    17. Jorge A. V. Tohalino & Laura V. C. Quispe & Diego R. Amancio, 2021. "Analyzing the relationship between text features and grants productivity," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(5), pages 4255-4275, May.
    18. Stegehuis, Clara & Litvak, Nelly & Waltman, Ludo, 2015. "Predicting the long-term citation impact of recent publications," Journal of Informetrics, Elsevier, vol. 9(3), pages 642-657.
    19. Xie, Zheng, 2020. "Predicting publication productivity for researchers: A piecewise Poisson model," Journal of Informetrics, Elsevier, vol. 14(3).
    20. Yang, Jinqing & Liu, Zhifeng, 2022. "The effect of citation behaviour on knowledge diffusion and intellectual structure," Journal of Informetrics, Elsevier, vol. 16(1).
