Printed from https://ideas.repec.org/a/spr/scient/v128y2023i5d10.1007_s11192-023-04681-x.html

Academic information retrieval using citation clusters: in-depth evaluation based on systematic reviews

Author

Listed:
  • Juan Pablo Bascur

    (Leiden University)

  • Suzan Verberne

    (Leiden University)

  • Nees Jan Eck

    (Leiden University)

  • Ludo Waltman

    (Leiden University)

Abstract

The field of science mapping has shown the power of citation-based clusters for literature analysis, yet this technique has barely been used for information retrieval tasks. This work evaluates the performance of citation-based clusters for information retrieval tasks. We simulated a search process with a tree hierarchy of clusters and a cluster selection algorithm. We evaluated the task of finding the relevant documents for 25 systematic reviews. Our evaluation considered several trade-offs between recall and precision for the cluster selection. We also replicated the Boolean queries self-reported by the systematic reviews to serve as a reference. We found that citation-based clusters’ search performance is highly variable and unpredictable, that the clusters work best for users that prefer recall over precision at a ratio between 2 and 8, and that the clusters are able to complement query-based search by finding additional relevant documents.
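The trade-off described in the abstract can be illustrated with a small sketch. It assumes that a preference for recall over precision is expressed through the beta parameter of the F-beta score, and that clusters are picked greedily from a tree; the abstract does not state either detail, so the `Cluster` class, the `select_clusters` function, and the toy tree below are hypothetical illustrations rather than the authors' method.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Node of a hypothetical cluster tree; `docs` holds every document under the node."""
    name: str
    docs: frozenset
    children: list = field(default_factory=list)


def f_beta(selected, relevant, beta):
    """F-beta of a selected document set; beta > 1 weights recall above precision."""
    hits = len(selected & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(selected)
    recall = hits / len(relevant)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def iter_clusters(root):
    """Yield every cluster in the tree, depth first."""
    stack = [root]
    while stack:
        node = stack.pop()
        yield node
        stack.extend(node.children)


def select_clusters(root, relevant, beta):
    """Greedily add the cluster that most improves F-beta until none helps (assumed strategy)."""
    candidates = list(iter_clusters(root))
    chosen, covered, best = [], frozenset(), 0.0
    while candidates:
        top = max(candidates, key=lambda c: f_beta(covered | c.docs, relevant, beta))
        score = f_beta(covered | top.docs, relevant, beta)
        if score <= best:
            break
        chosen.append(top.name)
        covered, best = covered | top.docs, score
        candidates.remove(top)
    return chosen, covered, best


# Toy hierarchy (made-up document ids): root -> A -> (A1, A2), root -> B.
a1 = Cluster("A1", frozenset({1, 2, 3}))
a2 = Cluster("A2", frozenset({4, 5, 6, 7, 8, 9}))
a = Cluster("A", a1.docs | a2.docs, [a1, a2])
b = Cluster("B", frozenset({10, 11, 12, 13}))
root = Cluster("root", a.docs | b.docs, [a, b])
relevant = frozenset({1, 2, 3, 4})  # documents included in a hypothetical review

for beta in (1.0, 4.0):
    names, covered, score = select_clusters(root, relevant, beta)
    print(beta, names, len(covered), round(score, 3))
```

On this toy tree, a balanced setting (beta = 1) picks only the small cluster A1, which is fully relevant but misses one document, while a recall-heavy setting (beta = 4) picks the broader cluster A, which covers all relevant documents at the cost of more irrelevant ones.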

Suggested Citation

  • Juan Pablo Bascur & Suzan Verberne & Nees Jan Eck & Ludo Waltman, 2023. "Academic information retrieval using citation clusters: in-depth evaluation based on systematic reviews," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(5), pages 2895-2921, May.
  • Handle: RePEc:spr:scient:v:128:y:2023:i:5:d:10.1007_s11192-023-04681-x
    DOI: 10.1007/s11192-023-04681-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-023-04681-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11192-023-04681-x?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    As the access to this document is restricted, you may want to search for a different version of it.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mehdi Amirkhani & Igor Martek & Mark B. Luther, 2021. "Mapping Research Trends in Residential Construction Retrofitting: A Scientometric Literature Review," Energies, MDPI, vol. 14(19), pages 1-18, September.
    2. Gaviria-Marin, Magaly & Merigó, José M. & Baier-Fuentes, Hugo, 2019. "Knowledge management: A global examination based on bibliometric analysis," Technological Forecasting and Social Change, Elsevier, vol. 140(C), pages 194-220.
    3. Margarida Rodrigues & Cidália Oliveira & Mário Franco & Ana Daniel, 2024. "A Bibliometric Study About the Rural Creative Class: Proposal of a Conceptual Framework and Future Agenda," Journal of the Knowledge Economy, Springer;Portland International Center for Management of Engineering and Technology (PICMET), vol. 15(3), pages 15278-15303, September.
    4. Michel Zitt, 2015. "Meso-level retrieval: IR-bibliometrics interplay and hybrid citation-words methods in scientific fields delineation," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(3), pages 2223-2245, March.
    5. Jensen, Scott & Liu, Xiaozhong & Yu, Yingying & Milojevic, Staša, 2016. "Generation of topic evolution trees from heterogeneous bibliographic networks," Journal of Informetrics, Elsevier, vol. 10(2), pages 606-621.
    6. Wenceslao Arroyo-Machado & Daniel Torres-Salinas & Nicolas Robinson-Garcia, 2021. "Identifying and characterizing social media communities: a socio-semantic network approach to altmetrics," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(11), pages 9267-9289, November.
    7. Bilal Manzoor & Idris Othman & Juan Carlos Pomares, 2021. "Digital Technologies in the Architecture, Engineering and Construction (AEC) Industry—A Bibliometric—Qualitative Literature Review of Research Activities," IJERPH, MDPI, vol. 18(11), pages 1-26, June.
    8. Boyack, Kevin W. & Klavans, Richard, 2014. "Including cited non-source items in a large-scale map of science: What difference does it make?," Journal of Informetrics, Elsevier, vol. 8(3), pages 569-580.
    9. Yue Guiling & Siti Aisyah Panatik & Mohammad Saipol Mohd Sukor & Noraini Rusbadrol & Li Cunlin, 2022. "Bibliometric Analysis of Global Research on Organizational Citizenship Behavior From 2000 to 2019," SAGE Open, vol. 12(1), pages 21582440221, February.
    10. Hugo Baier-Fuentes & José M. Merigó & José Ernesto Amorós & Magaly Gaviria-Marín, 2019. "International entrepreneurship: a bibliometric overview," International Entrepreneurship and Management Journal, Springer, vol. 15(2), pages 385-429, June.
    11. Shashi & Piera Centobelli & Roberto Cerchione & Amit Mittal, 2021. "Managing sustainability in luxury industry to pursue circular economy strategies," Business Strategy and the Environment, Wiley Blackwell, vol. 30(1), pages 432-462, January.
    12. Daniele Rotolo & Ismael Rafols & Michael Hopkins & Loet Leydesdorff, 2014. "Scientometric Mapping as a Strategic Intelligence Tool for the Governance of Emerging Technologies," SPRU Working Paper Series 2014-10, SPRU - Science Policy Research Unit, University of Sussex Business School.
    13. Philipp Mayr & Andrea Scharnhorst, 2015. "Scientometrics and information retrieval: weak-links revitalized," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(3), pages 2193-2199, March.
    14. Shuo Xu & Junwan Liu & Dongsheng Zhai & Xin An & Zheng Wang & Hongshen Pang, 2018. "Overlapping thematic structures extraction with mixed-membership stochastic blockmodel," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(1), pages 61-84, October.
    15. Haiko Lietz, 2020. "Drawing impossible boundaries: field delineation of Social Network Science," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 2841-2876, December.
    16. Serhat Burmaoglu & Ozcan Saritas, 2019. "An evolutionary analysis of the innovation policy domain: Is there a paradigm shift?," Scientometrics, Springer;Akadémiai Kiadó, vol. 118(3), pages 823-847, March.
    17. Gallego-Losada, María-Jesús & Montero-Navarro, Antonio & García-Abajo, Elisa & Gallego-Losada, Rocío, 2023. "Digital financial inclusion. Visualizing the academic literature," Research in International Business and Finance, Elsevier, vol. 64(C).
    18. Ignacio Rodríguez-Rodríguez & José-Víctor Rodríguez & Niloofar Shirvanizadeh & Andrés Ortiz & Domingo-Javier Pardo-Quiles, 2021. "Applications of Artificial Intelligence, Machine Learning, Big Data and the Internet of Things to the COVID-19 Pandemic: A Scientometric Review Using Text Mining," IJERPH, MDPI, vol. 18(16), pages 1-29, August.
    19. Ying Huang & Wolfgang Glänzel & Lin Zhang, 2021. "Tracing the development of mapping knowledge domains," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 6201-6224, July.
    20. Zhigao Liu & Yimei Yin & Weidong Liu & Michael Dunford, 2015. "Visualizing the intellectual structure and evolution of innovation systems research: a bibliometric analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 103(1), pages 135-158, April.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.