
Optimal and hierarchical clustering of large-scale hybrid networks for scientific mapping

Author

Listed:
  • Xinhai Liu

    (The People’s Bank of China)

  • Wolfgang Glänzel

    (Katholieke Universiteit Leuven
    IRPS)

  • Bart Moor

    (Katholieke Universiteit Leuven)

Abstract

Previous studies have shown that hybrid clustering methods based on textual and citation information outperform clustering methods that use only one of these components. However, earlier methods focus on the vector space model. In this paper we apply a hybrid clustering method based on the graph model to map the Web of Science database in the mirror of the journals covered by the database. Compared with earlier hybrid clustering strategies, our method is very fast and even achieves better clustering accuracy. In addition, it detects the number of clusters automatically and provides a top-down hierarchical analysis, which suits practical applications. We quantitatively and qualitatively assess the added value of such an integrated analysis, and we investigate whether the clustering outcome provides an appropriate representation of the field structure by comparing it with text-only and citation-only clusterings and with another hybrid method based on a linear combination of distance matrices. Our dataset consists of about 8,000 journals published in the period 2002–2006. The cognitive analysis, including the ranked journals, term annotation and the visualization of cluster structure, demonstrates the efficiency of our strategy.
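
The abstract mentions a comparison method that forms a hybrid from a linear combination of distance matrices. The following is a minimal sketch of that general idea only, not the authors' implementation: the cosine distance, the weight name alpha, the function names and the toy journal vectors are all assumptions made for illustration.

```python
from math import sqrt


def cosine_distance(u, v):
    """Return 1 - cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0  # assumption: an empty profile is maximally distant
    return 1.0 - dot / (norm_u * norm_v)


def distance_matrix(vectors):
    """Pairwise cosine distances; the diagonal is fixed at 0."""
    n = len(vectors)
    return [
        [0.0 if i == j else cosine_distance(vectors[i], vectors[j])
         for j in range(n)]
        for i in range(n)
    ]


def combine(d_text, d_cite, alpha=0.5):
    """Weighted sum alpha * d_text + (1 - alpha) * d_cite, entry by entry."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return [
        [alpha * t + (1.0 - alpha) * c for t, c in zip(row_t, row_c)]
        for row_t, row_c in zip(d_text, d_cite)
    ]


# Hypothetical toy data: three journals, each described by a term-weight
# vector (text view) and a citation-count vector (citation view).
text_vectors = [[1, 0, 0], [1, 1, 0], [0, 0, 1]]
cite_vectors = [[2, 0], [0, 3], [0, 1]]

hybrid = combine(distance_matrix(text_vectors),
                 distance_matrix(cite_vectors),
                 alpha=0.5)
```

The combined matrix can then be passed to any distance-based clustering routine. Setting alpha to 1 or 0 recovers the text-only or citation-only case, which mirrors the single-view baselines the abstract compares against.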

Suggested Citation

  • Xinhai Liu & Wolfgang Glänzel & Bart Moor, 2012. "Optimal and hierarchical clustering of large-scale hybrid networks for scientific mapping," Scientometrics, Springer;Akadémiai Kiadó, vol. 91(2), pages 473-493, May.
  • Handle: RePEc:spr:scient:v:91:y:2012:i:2:d:10.1007_s11192-011-0600-x
    DOI: 10.1007/s11192-011-0600-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-011-0600-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11192-011-0600-x?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item



    Citations

    Citations are extracted by the CitEc Project.


    Cited by:

    1. Ronald N. Kostoff, 2014. "Literature-related discovery: common factors for Parkinson’s Disease and Crohn’s Disease," Scientometrics, Springer;Akadémiai Kiadó, vol. 100(3), pages 623-657, September.
    2. Yeow Chong Goh & Xin Qing Cai & Walter Theseira & Giovanni Ko & Khiam Aik Khor, 2020. "Evaluating human versus machine learning performance in classifying research abstracts," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(2), pages 1197-1212, November.
    3. Mariani, Marcello & Borghi, Matteo, 2019. "Industry 4.0: A bibliometric review of its managerial intellectual structure and potential evolution in the service industries," Technological Forecasting and Social Change, Elsevier, vol. 149(C).
    4. Ying Huang & Wolfgang Glänzel & Lin Zhang, 2021. "Tracing the development of mapping knowledge domains," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(7), pages 6201-6224, July.
    5. Lin Zhang & Beibei Sun & Fei Shu & Ying Huang, 2022. "Comparing paper level classifications across different methods and systems: an investigation of Nature publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(12), pages 7633-7651, December.
    6. Guo Chen & Jing Chen & Yu Shao & Lu Xiao, 2023. "Automatic noise reduction of domain-specific bibliographic datasets using positive-unlabeled learning," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(2), pages 1187-1204, February.
    7. Xiangfeng Meng & Xinhai Liu & YunHai Tong & Wolfgang Glänzel & Shaohua Tan, 2015. "Multi-view clustering with exemplars for scientific mapping," Scientometrics, Springer;Akadémiai Kiadó, vol. 105(3), pages 1527-1552, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Hric, Darko & Kaski, Kimmo & Kivelä, Mikko, 2018. "Stochastic block model reveals maps of citation patterns and their evolution in time," Journal of Informetrics, Elsevier, vol. 12(3), pages 757-783.
    2. Waltman, Ludo & van Eck, Nees Jan & Noyons, Ed C.M., 2010. "A unified approach to mapping and clustering of bibliometric networks," Journal of Informetrics, Elsevier, vol. 4(4), pages 629-635.
    3. Xiangfeng Meng & Xinhai Liu & YunHai Tong & Wolfgang Glänzel & Shaohua Tan, 2015. "Multi-view clustering with exemplars for scientific mapping," Scientometrics, Springer;Akadémiai Kiadó, vol. 105(3), pages 1527-1552, December.
    4. Daniel Torres-Salinas & Nicolás Robinson-García & Álvaro Cabezas-Clavijo & Evaristo Jiménez-Contreras, 2014. "Analyzing the citation characteristics of books: edited books, book series and publisher types in the book citation index," Scientometrics, Springer;Akadémiai Kiadó, vol. 98(3), pages 2113-2127, March.
    5. Yan, Erjia & Ding, Ying & Cronin, Blaise & Leydesdorff, Loet, 2013. "A bird's-eye view of scientific trading: Dependency relations among fields of science," Journal of Informetrics, Elsevier, vol. 7(2), pages 249-264.
    6. Ruimin Ma & Erjia Yan, 2016. "Uncovering inter-specialty knowledge communication using author citation networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(2), pages 839-854, November.
    7. Yu-Chun Chen & Hsiao-Yun Yeh & Jau-Ching Wu & Ingo Haschler & Tzeng-Ji Chen & Thomas Wetter, 2011. "Taiwan’s National Health Insurance Research Database: administrative health care database as study object in bibliometrics," Scientometrics, Springer;Akadémiai Kiadó, vol. 86(2), pages 365-380, February.
    8. Rongying Zhao & Bikun Chen, 2014. "Applying author co-citation analysis to user interaction analysis: a case study on instant messaging groups," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(2), pages 985-997, November.
    9. Yu-Wei Chang, 2019. "Are articles in library and information science (LIS) journals primarily contributed to by LIS authors?," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(1), pages 81-104, October.
    10. Sjögårde, Peter & Ahlgren, Per, 2018. "Granularity of algorithmically constructed publication-level classifications of research publications: Identification of topics," Journal of Informetrics, Elsevier, vol. 12(1), pages 133-152.
    11. Yuan Zhou & Heng Lin & Yufei Liu & Wei Ding, 2019. "A novel method to identify emerging technologies using a semi-supervised topic clustering model: a case of 3D printing industry," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(1), pages 167-185, July.
    12. Karmen Stopar & Damjana Drobne & Klemen Eler & Tomaz Bartol, 2016. "Citation analysis and mapping of nanoscience and nanotechnology: identifying the scope and interdisciplinarity of research," Scientometrics, Springer;Akadémiai Kiadó, vol. 106(2), pages 563-581, February.
    13. Hai-Yun Xu & Zeng-Hui Yue & Chao Wang & Kun Dong & Hong-Shen Pang & Zhengbiao Han, 2017. "Multi-source data fusion study in scientometrics," Scientometrics, Springer;Akadémiai Kiadó, vol. 111(2), pages 773-792, May.
    14. Jiancheng Guan & Wenjia Zhu, 2014. "How knowledge diffuses across countries: a case study in the field of management," Scientometrics, Springer;Akadémiai Kiadó, vol. 98(3), pages 2129-2144, March.
    15. Wu, Han-Ming & Tien, Yin-Jing & Chen, Chun-houh, 2010. "GAP: A graphical environment for matrix visualization and cluster analysis," Computational Statistics & Data Analysis, Elsevier, vol. 54(3), pages 767-778, March.
    16. José E. Chacón, 2021. "Explicit Agreement Extremes for a 2 × 2 Table with Given Marginals," Journal of Classification, Springer;The Classification Society, vol. 38(2), pages 257-263, July.
    17. Akinpelu, O.A. & Olaleye, O. & Fagbola, O., 2023. "The Soil Organic Matter Decomposers: A Bibliometric Analysis," International Journal of Agriculture and Environmental Research, Malwa International Journals Publication, vol. 9(4), August.
    18. Su, Hsin-Ning & Moaniba, Igam M., 2017. "Investigating the dynamics of interdisciplinary evolution in technology developments," Technological Forecasting and Social Change, Elsevier, vol. 122(C), pages 12-23.
    19. Roberto Rocci & Stefano Antonio Gattone & Roberto Di Mari, 2018. "A data driven equivariant approach to constrained Gaussian mixture modeling," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(2), pages 235-260, June.
    20. Redivo, Edoardo & Nguyen, Hien D. & Gupta, Mayetri, 2020. "Bayesian clustering of skewed and multimodal data using geometric skewed normal distributions," Computational Statistics & Data Analysis, Elsevier, vol. 152(C).
