
Identification of important citations by exploiting research articles’ metadata and cue-terms from content

Author

Listed:
  • Faiza Qayyum

    (Capital University of Science and Technology)

  • Muhammad Tanvir Afzal

    (Capital University of Science and Technology)

Abstract

Citations play a pivotal role in indicating various aspects of scientific literature. Quantitative citation analysis approaches have been used for decades to measure the impact factor of journals, to rank researchers or institutions, to discover evolving research topics, and so on. Researchers have questioned purely quantitative citation analysis and argued that not all citations are equally important; the reasons for citing must be considered when counting. In the recent past, researchers have focused on identifying important citation reasons by classifying them into important and non-important classes rather than classifying each reason individually. Most contemporary citation classification techniques either rely on the full content of articles or are dominated by content-based features. However, content is often not freely available, as many journal publishers do not provide open access to articles. This paper presents a binary citation classification scheme that is dominated by metadata-based parameters. The study demonstrates the significance of metadata and content-based parameters in varying scenarios. The experiments are performed on two annotated data sets, which are evaluated using SVM, KLR, and Random Forest machine learning classifiers. The results are compared with a contemporary study that performed similar classification using a rich list of content-based features. The comparison revealed that the proposed model attained an improved precision (0.68) while relying only on freely available metadata. We claim that the proposed approach can serve as the best alternative in scenarios where content is unavailable.
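
The abstract reports precision for a binary important/non-important citation classifier driven by metadata. As a rough illustration only, the sketch below shows how precision for the "important" class is computed, together with one hypothetical metadata feature (word overlap between the titles of the citing and cited papers). The function names, the choice of feature, and the toy labels are assumptions made here for illustration; they are not taken from the paper and do not reproduce its feature set or its results.

```python
from typing import Sequence


def title_overlap(citing_title: str, cited_title: str) -> float:
    """Hypothetical metadata feature: Jaccard overlap of lowercase title words."""
    a = set(citing_title.lower().split())
    b = set(cited_title.lower().split())
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def precision(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Precision for the positive ("important") class: TP / (TP + FP)."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fp == 0:
        return 0.0  # no positive predictions were made
    return tp / (tp + fp)


# Toy labels invented for this example (1 = important, 0 = non-important).
gold = [1, 0, 1, 1, 0, 0, 1, 0]
pred = [1, 1, 1, 0, 0, 1, 1, 0]
score = precision(gold, pred)

# Toy titles invented for this example.
overlap = title_overlap("Citation analysis of journals",
                        "Journals and citation counts")
```

Precision counts only the citations the classifier labels as important, so it suits settings where a false "important" label costs more than a missed one. A feature such as title overlap needs no access to the full text, which is the situation the abstract targets.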

Suggested Citation

  • Faiza Qayyum & Muhammad Tanvir Afzal, 2019. "Identification of important citations by exploiting research articles’ metadata and cue-terms from content," Scientometrics, Springer;Akadémiai Kiadó, vol. 118(1), pages 21-43, January.
  • Handle: RePEc:spr:scient:v:118:y:2019:i:1:d:10.1007_s11192-018-2961-x
    DOI: 10.1007/s11192-018-2961-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-018-2961-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Terrence A. Brooks, 1985. "Private acts and public objects: An investigation of citer motivations," Journal of the American Society for Information Science, Association for Information Science & Technology, vol. 36(4), pages 223-229, July.
    2. Michael H. MacRoberts & Barbara R. MacRoberts, 2018. "The mismeasure of science: Citation analysis," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 69(3), pages 474-482, March.
    3. Muhammad Raheel & Samreen Ayaz & Muhammad Tanvir Afzal, 2018. "Evaluation of h-index, its variants and extensions based on publication age & citation intensity in civil engineering," Scientometrics, Springer;Akadémiai Kiadó, vol. 114(3), pages 1107-1127, March.
    4. Susan Bonzi, 1982. "Characteristics of a Literature as Predictors of Relatedness Between Cited and Citing Works," Journal of the American Society for Information Science, Association for Information Science & Technology, vol. 33(4), pages 208-216, July.
    5. Rinze Benedictus & Frank Miedema & Mark W. J. Ferguson, 2016. "Fewer numbers, better science," Nature, Nature, vol. 538(7626), pages 453-455, October.
    6. Xiaodan Zhu & Peter Turney & Daniel Lemire & André Vellino, 2015. "Measuring academic influence: Not all citations are equal," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 66(2), pages 408-427, February.
    7. Jeong, Yoo Kyung & Song, Min & Ding, Ying, 2014. "Content-based author co-citation analysis," Journal of Informetrics, Elsevier, vol. 8(1), pages 197-211.
    8. Samreen Ayaz & Muhammad Tanvir Afzal, 2016. "Identification of conversion factor for completing-h index for the field of mathematics," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(3), pages 1511-1524, December.
    9. Donald O. Case & Georgeann M. Higgins, 2000. "How can we investigate citation behavior? A study of reasons for citing literature in communication," Journal of the American Society for Information Science, Association for Information Science & Technology, vol. 51(7), pages 635-645.
    10. Richard C. Anderson & Francis Narin & Paul McAllister, 1978. "Publication ratings versus peer ratings of universities," Journal of the American Society for Information Science, Association for Information Science & Technology, vol. 29(2), pages 91-103, March.
    11. Charles Oppenheim & Susan P. Renn, 1978. "Highly cited old papers and the reasons why they continue to be cited," Journal of the American Society for Information Science, Association for Information Science & Technology, vol. 29(5), pages 225-231, September.

    Citations

    Citations are extracted by the CitEc Project.


    Cited by:

    1. Xin An & Xin Sun & Shuo Xu, 2022. "Important citations identification with semi-supervised classification model," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6533-6555, November.
    2. Faiza Qayyum & Harun Jamil & Naeem Iqbal & DoHyeun Kim & Muhammad Tanvir Afzal, 2022. "Toward potential hybrid features evaluation using MLP-ANN binary classification model to tackle meaningful citations," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6471-6499, November.
    3. Xiaorui Jiang & Jingqiang Chen, 2023. "Contextualised segment-wise citation function classification," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(9), pages 5117-5158, September.
    4. Imran Ihsan & M. Abdul Qadir, 2021. "An NLP-based citation reason analysis using CCRO," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(6), pages 4769-4791, June.
    5. Mingyang Wang & Jiaqi Zhang & Shijia Jiao & Xiangrong Zhang & Na Zhu & Guangsheng Chen, 2020. "Important citation identification by exploiting the syntactic and contextual information of citations," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 2109-2129, December.
    6. Yu, Dejian & Yan, Zhaoping, 2023. "Main path analysis considering citation structure and content: Case studies in different domains," Journal of Informetrics, Elsevier, vol. 17(1).
    7. Heng Huang & Donghua Zhu & Xuefeng Wang, 2022. "Evaluating scientific impact of publications: combining citation polarity and purpose," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(9), pages 5257-5281, September.
    8. Naif Radi Aljohani & Ayman Fayoumi & Saeed-Ul Hassan, 2021. "An in-text citation classification predictive model for a scholarly search system," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 5509-5529, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Faiza Qayyum & Harun Jamil & Naeem Iqbal & DoHyeun Kim & Muhammad Tanvir Afzal, 2022. "Toward potential hybrid features evaluation using MLP-ANN binary classification model to tackle meaningful citations," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6471-6499, November.
    2. Dangzhi Zhao & Andreas Strotmann, 2020. "Deep and narrow impact: introducing location filtered citation counting," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(1), pages 503-517, January.
    3. Dangzhi Zhao & Andreas Strotmann, 2020. "Telescopic and panoramic views of library and information science research 2011–2018: a comparison of four weighting schemes for author co-citation analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(1), pages 255-270, July.
    4. Mingyang Wang & Jiaqi Zhang & Shijia Jiao & Xiangrong Zhang & Na Zhu & Guangsheng Chen, 2020. "Important citation identification by exploiting the syntactic and contextual information of citations," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 2109-2129, December.
    5. Dongqing Lyu & Xuanmin Ruan & Juan Xie & Ying Cheng, 2021. "The classification of citing motivations: a meta-synthesis," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(4), pages 3243-3264, April.
    6. Sehrish Iqbal & Saeed-Ul Hassan & Naif Radi Aljohani & Salem Alelyani & Raheel Nawaz & Lutz Bornmann, 2021. "A decade of in-text citation analysis based on natural language processing and machine learning techniques: an overview of empirical studies," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(8), pages 6551-6599, August.
    7. Yi Bu & Binglu Wang & Win-bin Huang & Shangkun Che & Yong Huang, 2018. "Using the appearance of citations in full text on author co-citation analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(1), pages 275-289, July.
    8. Naif Radi Aljohani & Ayman Fayoumi & Saeed-Ul Hassan, 2021. "An in-text citation classification predictive model for a scholarly search system," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 5509-5529, July.
    9. Tahamtan, Iman & Bornmann, Lutz, 2018. "Core elements in the process of citing publications: Conceptual overview of the literature," Journal of Informetrics, Elsevier, vol. 12(1), pages 203-216.
    10. Tanzila Ahmed & Ben Johnson & Charles Oppenheim & Catherine Peck, 2004. "Highly cited old papers and the reasons why they continue to be cited. Part II., The 1953 Watson and Crick article on the structure of DNA," Scientometrics, Springer;Akadémiai Kiadó, vol. 61(2), pages 147-156, October.
    11. Matthias Sebastian Rüdiger & David Antons & Torsten-Oliver Salge, 2021. "The explanatory power of citations: a new approach to unpacking impact in science," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(12), pages 9779-9809, December.
    12. Thelwall, Mike, 2016. "Are there too many uncited articles? Zero inflated variants of the discretised lognormal and hooked power law distributions," Journal of Informetrics, Elsevier, vol. 10(2), pages 622-633.
    13. Frederique Bordignon, 2020. "Self-correction of science: a comparative study of negative citations and post-publication peer review," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(2), pages 1225-1239, August.
    14. Teresa H. Jones & Claire Donovan & Steve Hanney, 2012. "Tracing the wider impacts of biomedical research: a literature search to develop a novel citation categorisation technique," Scientometrics, Springer;Akadémiai Kiadó, vol. 93(1), pages 125-134, October.
    15. Martorell Cunil, Onofre & Otero González, Luis & Durán Santomil, Pablo & Mulet Forteza, Carlos, 2023. "How to accomplish a highly cited paper in the tourism, leisure and hospitality field," Journal of Business Research, Elsevier, vol. 157(C).
    16. Madiha Ameer & Muhammad Tanvir Afzal, 2019. "Evaluation of h-index and its qualitative and quantitative variants in Neuroscience," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(2), pages 653-673, November.
    17. Hamid R. Jamali & Majid Nabavi & Saeid Asadi, 2018. "How video articles are cited, the case of JoVE: Journal of Visualized Experiments," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(3), pages 1821-1839, December.
    18. Muhammad Usman & Ghulam Mustafa & Muhammad Tanvir Afzal, 2021. "Ranking of author assessment parameters using Logistic Regression," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(1), pages 335-353, January.
    19. Chi-Shiou Lin, 2018. "An analysis of citation functions in the humanities and social sciences research from the perspective of problematic citation analysis assumptions," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(2), pages 797-813, August.
    20. CholMyong Pak & Guang Yu & Weibin Wang, 2018. "A study on the citation situation within the citing paper: citation distribution of references according to mention frequency," Scientometrics, Springer;Akadémiai Kiadó, vol. 114(3), pages 905-918, March.
