Printed from https://ideas.repec.org/a/eee/infome/v12y2018i1p133-152.html

Granularity of algorithmically constructed publication-level classifications of research publications: Identification of topics

Authors

  • Sjögårde, Peter
  • Ahlgren, Per

Abstract

The purpose of this study is to find a theoretically grounded, practically applicable and useful granularity level of an algorithmically constructed publication-level classification of research publications (ACPLC). The level addressed is the level of research topics. The methodology we propose uses synthesis papers and their reference articles to construct a baseline classification. A dataset of about 31 million publications, and their mutual citation relations, is used to obtain several ACPLCs of different granularity. Each ACPLC is compared to the baseline classification and the best performing ACPLC is identified. The results of two case studies show that the topics of the cases are closely associated with different classes of the identified ACPLC, and that these classes tend to treat only one topic. Further, the class size variation is moderate, and only a small proportion of the publications belong to very small classes. For these reasons, we conclude that the proposed methodology is suitable for determining the topic granularity level of an ACPLC and that the ACPLC identified by this methodology is useful for bibliometric analyses.
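
The selection step described in the abstract (compare each candidate classification to a baseline and keep the best performer) can be sketched as follows. This is an illustrative toy, not the authors' implementation: the abstract does not state which agreement measure is used, so the adjusted Rand index is assumed here as one standard way to compare two partitions, and the names `adjusted_rand_index`, `best_classification`, and the sample labels are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two partitions of the same items.

    Assumption: this measure stands in for whatever comparison the
    study actually uses; it is chosen only for illustration.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label lists must have equal length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: partitions agree trivially

    # Number of item pairs placed together in both, in A, and in B.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())

    expected = sum_a * sum_b / total_pairs  # agreement expected by chance
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: both partitions are trivial and identical
    return (sum_cells - expected) / (max_index - expected)


def best_classification(baseline, candidates):
    """Score every candidate against the baseline and return the best one."""
    scores = {name: adjusted_rand_index(baseline, labels)
              for name, labels in candidates.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy data: six publications, two baseline topics.
baseline = ["t1", "t1", "t1", "t2", "t2", "t2"]
candidates = {
    "coarse": [0, 0, 0, 0, 0, 0],   # everything in one class
    "medium": [0, 0, 0, 1, 1, 1],   # matches the baseline topics
    "fine":   [0, 1, 2, 3, 4, 5],   # every publication in its own class
}
best, scores = best_classification(baseline, candidates)
print(best, scores)
```

In this toy case the too-coarse and too-fine candidates both score at chance level, while the candidate whose classes coincide with the baseline topics scores highest, which is the intuition behind picking a granularity level by comparison with a baseline.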

Suggested Citation

  • Sjögårde, Peter & Ahlgren, Per, 2018. "Granularity of algorithmically constructed publication-level classifications of research publications: Identification of topics," Journal of Informetrics, Elsevier, vol. 12(1), pages 133-152.
  • Handle: RePEc:eee:infome:v:12:y:2018:i:1:p:133-152
    DOI: 10.1016/j.joi.2017.12.006

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S1751157717303371
    Download Restriction: Full text for ScienceDirect subscribers only



    Citations

    Citations are extracted by the CitEc Project.

    Cited by:

    1. Matthias Held & Grit Laudel & Jochen Gläser, 2021. "Challenges to the validity of topic reconstruction," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(5), pages 4511-4536, May.
    2. Paul Donner, 2021. "Validation of the Astro dataset clustering solutions with external data," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1619-1645, February.
    3. Juan Pablo Bascur & Suzan Verberne & Nees Jan van Eck & Ludo Waltman, 2023. "Academic information retrieval using citation clusters: in-depth evaluation based on systematic reviews," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(5), pages 2895-2921, May.
    4. Haunschild, Robin & Schier, Hermann & Marx, Werner & Bornmann, Lutz, 2018. "Algorithmically generated subject categories based on citation relations: An empirical micro study using papers on overall water splitting," Journal of Informetrics, Elsevier, vol. 12(2), pages 436-447.
    5. Gerson Pech & Catarina Delgado & Silvio Paolo Sorella, 2022. "Classifying papers into subfields using Abstracts, Titles, Keywords and KeyWords Plus through pattern detection and optimization procedures: An application in Physics," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 73(11), pages 1513-1528, November.
    6. Peter Sjögårde & Fereshteh Didegah, 2022. "The association between topic growth and citation impact of research publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(4), pages 1903-1921, April.
    7. Haiko Lietz, 2020. "Drawing impossible boundaries: field delineation of Social Network Science," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(3), pages 2841-2876, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jochen Gläser & Wolfgang Glänzel & Andrea Scharnhorst, 2017. "Same data—different results? Towards a comparative approach to the identification of thematic structures in science," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 981-998, May.
    2. Matthias Held & Grit Laudel & Jochen Gläser, 2021. "Challenges to the validity of topic reconstruction," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(5), pages 4511-4536, May.
    3. Rons, Nadine, 2018. "Bibliometric approximation of a scientific specialty by combining key sources, title words, authors and references," Journal of Informetrics, Elsevier, vol. 12(1), pages 113-132.
    4. Kevin W. Boyack, 2017. "Investigating the effect of global data on topic detection," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 999-1015, May.
    5. Sitaram Devarakonda & Dmitriy Korobskiy & Tandy Warnow & George Chacko, 2020. "Viewing computer science through citation analysis: Salton and Bergmark Redux," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(1), pages 271-287, October.
    6. Carlos Olmeda-Gómez & Carlos Romá-Mateo & Maria-Antonia Ovalle-Perandones, 2019. "Overview of trends in global epigenetic research (2009–2017)," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(3), pages 1545-1574, June.
    7. Li, Menghui & Yang, Liying & Zhang, Huina & Shen, Zhesi & Wu, Chensheng & Wu, Jinshan, 2017. "Do mathematicians, economists and biomedical scientists trace large topics more strongly than physicists?," Journal of Informetrics, Elsevier, vol. 11(2), pages 598-607.
    8. Michel Zitt, 2015. "Meso-level retrieval: IR-bibliometrics interplay and hybrid citation-words methods in scientific fields delineation," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(3), pages 2223-2245, March.
    9. Nees Jan van Eck & Ludo Waltman, 2017. "Citation-based clustering of publications using CitNetExplorer and VOSviewer," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1053-1070, May.
    10. Shu, Fei & Julien, Charles-Antoine & Zhang, Lin & Qiu, Junping & Zhang, Jing & Larivière, Vincent, 2019. "Comparing journal and paper level classifications of science," Journal of Informetrics, Elsevier, vol. 13(1), pages 202-225.
    11. Frank Havemann & Jochen Gläser & Michael Heinz, 2017. "Memetic search for overlapping topics based on a local evaluation of link communities," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 1089-1118, May.
    12. Shuo Xu & Liyuan Hao & Xin An & Hongshen Pang & Ting Li, 2020. "Review on emerging research topics with key-route main path analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(1), pages 607-624, January.
    13. Fang Han & Christopher L. Magee, 2018. "Testing the science/technology relationship by analysis of patent citations of scientific papers after decomposition of both science and technology," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(2), pages 767-796, August.
    14. Wang, Qi & Waltman, Ludo, 2016. "Large-scale analysis of the accuracy of the journal classification systems of Web of Science and Scopus," Journal of Informetrics, Elsevier, vol. 10(2), pages 347-364.
    15. Xu, Haiyun & Winnink, Jos & Yue, Zenghui & Zhang, Huiling & Pang, Hongshen, 2021. "Multidimensional Scientometric indicators for the detection of emerging research topics," Technological Forecasting and Social Change, Elsevier, vol. 163(C).
    16. Shuo Xu & Junwan Liu & Dongsheng Zhai & Xin An & Zheng Wang & Hongshen Pang, 2018. "Overlapping thematic structures extraction with mixed-membership stochastic blockmodel," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(1), pages 61-84, October.
    17. Yanto Chandra, 2018. "Mapping the evolution of entrepreneurship as a field of research (1990–2013): A scientometric analysis," PLOS ONE, Public Library of Science, vol. 13(1), pages 1-24, January.
    18. Yi-Ming Wei & Jin-Wei Wang & Tianqi Chen & Bi-Ying Yu & Hua Liao, 2018. "Frontiers of Low-Carbon Technologies: Results from Bibliographic Coupling with Sliding Window," CEEP-BIT Working Papers 116, Center for Energy and Environmental Policy Research (CEEP), Beijing Institute of Technology.
    19. Peter Sjögårde & Fereshteh Didegah, 2022. "The association between topic growth and citation impact of research publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(4), pages 1903-1921, April.
    20. Paul Donner, 2021. "Validation of the Astro dataset clustering solutions with external data," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1619-1645, February.
