
Neural machine translation for in‐text citation classification

Author

Listed:
  • Iqra Safder
  • Momin Ali
  • Naif Radi Aljohani
  • Raheel Nawaz
  • Saeed‐Ul Hassan

Abstract

The quality of scientific publications can be measured by quantitative indices such as the h‐index, Source Normalized Impact per Paper, or g‐index. However, these measures fail to explain the function of or reasons for citations, or the context in which a citing publication refers to a cited publication. We argue that citation context may be considered when calculating the impact of research work. However, mining citation context from unstructured full‐text publications is a challenging task. In this paper, we compiled a data set comprising 9,518 citation contexts. We developed a deep learning‐based architecture for citation context classification. Unlike feature‐based state‐of‐the‐art models, our proposed focal‐loss and class‐weight‐aware BiLSTM model with pretrained GloVe embedding vectors uses citation context as input and outperforms them in multiclass citation context classification tasks. Our model improves on the baseline state‐of‐the‐art by achieving an F1 score of 0.80 with an accuracy of 0.81 for citation context classification. We also examine the effects of different word embeddings on the performance of the classification model, comparing fastText, GloVe, and spaCy pretrained word embeddings.
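
The abstract names a focal-loss and class-weight-aware model but does not give the formulas. As a minimal sketch only, the snippet below uses the commonly cited focal loss form, FL(p_t) = -w_t (1 - p_t)^gamma log(p_t), with inverse-frequency class weights. The weighting scheme, the value gamma = 2.0, the function names, and the class labels "background" and "comparison" are all assumptions made for illustration and are not taken from the paper.

```python
import math
from collections import Counter


def class_weights(labels):
    """Inverse-frequency weights: n_samples / (n_classes * class_count).

    Assumed scheme for illustration; the paper may weight classes differently.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * counts[c]) for c in counts}


def focal_loss(probs, label, weights, gamma=2.0):
    """Class-weighted focal loss for a single example.

    probs   -- predicted class probabilities (dict: class -> probability)
    label   -- the true class
    weights -- per-class weights, e.g. from class_weights()
    gamma   -- focusing parameter; gamma = 0 reduces to weighted cross-entropy
    """
    p_t = max(probs[label], 1e-12)  # clamp to avoid log(0)
    return -weights[label] * (1.0 - p_t) ** gamma * math.log(p_t)


# Toy, made-up labels (not the paper's data): one common and one rare class.
labels = ["background"] * 8 + ["comparison"] * 2
w = class_weights(labels)

# Same prediction, scored against a common-class and a rare-class true label.
prediction = {"background": 0.9, "comparison": 0.1}
easy = focal_loss(prediction, "background", w)  # confident and correct
hard = focal_loss(prediction, "comparison", w)  # rare class, badly predicted
```

In this toy setup the rare, misclassified example contributes far more loss than the confident, correct one, which is the effect that focal loss and class weighting are meant to produce on imbalanced citation classes.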

Suggested Citation

  • Iqra Safder & Momin Ali & Naif Radi Aljohani & Raheel Nawaz & Saeed‐Ul Hassan, 2023. "Neural machine translation for in‐text citation classification," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 74(10), pages 1229-1240, October.
  • Handle: RePEc:bla:jinfst:v:74:y:2023:i:10:p:1229-1240
    DOI: 10.1002/asi.24817

    Download full text from publisher

    File URL: https://doi.org/10.1002/asi.24817
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Naif Radi Aljohani & Ayman Fayoumi & Saeed-Ul Hassan, 2021. "An in-text citation classification predictive model for a scholarly search system," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 5509-5529, July.
    2. Sehrish Iqbal & Saeed-Ul Hassan & Naif Radi Aljohani & Salem Alelyani & Raheel Nawaz & Lutz Bornmann, 2021. "A decade of in-text citation analysis based on natural language processing and machine learning techniques: an overview of empirical studies," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(8), pages 6551-6599, August.
    3. K. Brad Wray, 2020. "Paradigms in Structure: finally, a count," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(1), pages 823-828, October.
    4. Kofi A. A-O. Agyei-Henaku & Charlotte Badu-Prah & Francis Srofenyoh & Ferguson K. Gidiglo & Akua Agyeiwaa-Afrane & Justice G. Djokoto, 2024. "Citations of Publications on Foreign Direct Investments into Agribusiness: Nature, Variability and Drivers," SAGE Open, vol. 14(1), pages 21582440241, February.
    5. Xiaorui Jiang & Jingqiang Chen, 2023. "Contextualised segment-wise citation function classification," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(9), pages 5117-5158, September.
    6. Xin An & Xin Sun & Shuo Xu, 2022. "Important citations identification with semi-supervised classification model," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6533-6555, November.
    7. Setio Basuki & Masatoshi Tsuchiya, 2022. "SDCF: semi-automatically structured dataset of citation functions," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(8), pages 4569-4608, August.
    8. Jodi Schneider & Di Ye & Alison M. Hill & Ashley S. Whitehorn, 2020. "Continued post-retraction citation of a fraudulent clinical trial report, 11 years after it was retracted for falsifying data," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 2877-2913, December.
    9. Ivan Heibi & Silvio Peroni, 2021. "A qualitative and quantitative analysis of open citations to retracted articles: the Wakefield 1998 et al.'s case," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(10), pages 8433-8470, October.
    10. Iman Tahamtan & Lutz Bornmann, 2019. "What do citation counts measure? An updated review of studies on citations in scientific documents published between 2006 and 2018," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(3), pages 1635-1684, December.
    11. Li Zhang & Ming Liu & Bo Wang & Bo Lang & Peng Yang, 2021. "Discovering communities based on mention distance," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(3), pages 1945-1967, March.
    12. Chao Min & Qingyu Chen & Erjia Yan & Yi Bu & Jianjun Sun, 2021. "Citation cascade and the evolution of topic relevance," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 72(1), pages 110-127, January.
    13. Faiza Qayyum & Harun Jamil & Naeem Iqbal & DoHyeun Kim & Muhammad Tanvir Afzal, 2022. "Toward potential hybrid features evaluation using MLP-ANN binary classification model to tackle meaningful citations," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6471-6499, November.
    14. Tamara Krajna & Jelka Petrak, 2019. "Croatian Highly Cited Papers," Interdisciplinary Description of Complex Systems - scientific journal, Croatian Interdisciplinary Society, vol. 17(3-B), pages 684-696.
    15. Ying Guo & Xiantao Xiao, 2022. "Author-level altmetrics for the evaluation of Chinese scholars," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(2), pages 973-990, February.
    16. Yi Bu & Binglu Wang & Win-bin Huang & Shangkun Che & Yong Huang, 2018. "Using the appearance of citations in full text on author co-citation analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(1), pages 275-289, July.
    17. Dangzhi Zhao & Andreas Strotmann, 2020. "Telescopic and panoramic views of library and information science research 2011–2018: a comparison of four weighting schemes for author co-citation analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(1), pages 255-270, July.
    18. Yaxue Ma & Zhichao Ba & Yuxiang Zhao & Jin Mao & Gang Li, 2021. "Understanding and predicting the dissemination of scientific papers on social media: a two-step simultaneous equation modeling–artificial neural network approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(8), pages 7051-7085, August.
    19. Yuanyuan Liu & Qiang Wu & Shijie Wu & Yong Gao, 2021. "Weighted citation based on ranking-related contribution: a new index for evaluating article impact," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(10), pages 8653-8672, October.
    20. Mingyang Wang & Jiaqi Zhang & Shijia Jiao & Xiangrong Zhang & Na Zhu & Guangsheng Chen, 2020. "Important citation identification by exploiting the syntactic and contextual information of citations," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 2109-2129, December.
