Printed from https://ideas.repec.org/a/eee/infome/v10y2016i1p212-223.html

Selecting publication keywords for domain analysis in bibliometrics: A comparison of three methods

Authors

  • Chen, Guo
  • Xiao, Lu

Abstract

Publication keywords have been widely utilized to reveal the knowledge structure of research domains. An important but under-addressed problem is deciding which keywords should be retained as analysis objects after a large number of keywords have been gathered from domain publications. In this paper, we discuss the problems with the traditional term frequency (TF) method and introduce two alternative methods: TF-inverse document frequency (TF-IDF) and TF-Keyword Activity Index (TF-KAI). These two methods take keyword discrimination into account by considering each keyword's frequency both inside and outside the domain. To test their performance, the keywords they select in China's Digital Library domain are evaluated both qualitatively and quantitatively. The evaluation results show that the TF-KAI method performs best: it retains keywords that match expert selection much more closely and reveals the research specialization of the domain in more detail.
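
To make the contrast between the three selection methods concrete, the following is a minimal sketch and not the authors' implementation. The abstract does not give formulas, so two assumptions are made here: the IDF weight is taken as log(N / df) over the whole collection, and the Keyword Activity Index is taken as an activity-index-style ratio, namely the keyword's share of occurrences that fall inside the domain divided by the domain's share of all keyword occurrences. All function names and the toy counts are hypothetical.

```python
import math


def tf_scores(domain_counts):
    """Plain TF: rank keywords by how often they occur inside the domain."""
    return dict(domain_counts)


def tf_idf_scores(domain_counts, doc_freq, n_docs):
    """TF weighted by log(N / df).

    Assumption: df is the number of documents in the whole collection
    (inside and outside the domain) that carry the keyword.
    """
    return {
        kw: tf * math.log(n_docs / doc_freq[kw])
        for kw, tf in domain_counts.items()
    }


def tf_kai_scores(domain_counts, overall_counts):
    """TF weighted by an activity-index-style ratio (assumed form of KAI).

    KAI = (tf_in_domain / tf_overall) / (domain_total / overall_total)
    Values above 1 mean the keyword is over-represented in the domain.
    """
    domain_total = sum(domain_counts.values())
    overall_total = sum(overall_counts.values())
    baseline = domain_total / overall_total
    return {
        kw: tf * ((tf / overall_counts[kw]) / baseline)
        for kw, tf in domain_counts.items()
    }


def top_k(scores, k):
    """Return the k best-scoring keywords, ties broken alphabetically."""
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [kw for kw, _ in ranked[:k]]


# Hypothetical toy counts: occurrences inside the domain ...
domain_counts = {
    "digital library": 120,
    "information service": 90,
    "university library": 80,
    "metadata": 40,
}
# ... and in the whole collection (domain plus everything else).
overall_counts = {
    "digital library": 130,
    "information service": 900,
    "university library": 800,
    "metadata": 50,
    "other keywords": 8120,
}

tf_top = top_k(tf_scores(domain_counts), 2)
kai_top = top_k(tf_kai_scores(domain_counts, overall_counts), 2)
idf_top = top_k(tf_idf_scores(domain_counts, overall_counts, 10000), 1)
print("TF:", tf_top)
print("TF-KAI:", kai_top)
print("TF-IDF:", idf_top)
```

In this toy data, plain TF keeps "information service" because it is frequent, even though most of its occurrences lie outside the domain, while the KAI weighting promotes "metadata", which is rarer but concentrated in the domain. For brevity the sketch reuses the overall counts as document frequencies, which assumes each keyword appears at most once per document.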

Suggested Citation

  • Chen, Guo & Xiao, Lu, 2016. "Selecting publication keywords for domain analysis in bibliometrics: A comparison of three methods," Journal of Informetrics, Elsevier, vol. 10(1), pages 212-223.
  • Handle: RePEc:eee:infome:v:10:y:2016:i:1:p:212-223
    DOI: 10.1016/j.joi.2016.01.006

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S175115771600002X
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.joi.2016.01.006?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Guo Chen & Lu Xiao & Chang-ping Hu & Xue-qin Zhao, 2015. "Identifying the research focus of Library and Information Science institutions in China with institution-specific keywords," Scientometrics, Springer;Akadémiai Kiadó, vol. 103(2), pages 707-724, May.
    2. Yosuke Miyata & Emi Ishita & Fang Yang & Michimasa Yamamoto & Azusa Iwase & Keiko Kurata, 2020. "Knowledge structure transition in library and information science: topic modeling and visualization," Scientometrics, Springer;Akadémiai Kiadó, vol. 125(1), pages 665-687, October.
    3. Guan, Jiancheng & Yan, Yan & Zhang, Jing Jing, 2017. "The impact of collaboration and knowledge networks on citations," Journal of Informetrics, Elsevier, vol. 11(2), pages 407-422.
    4. Abhijit Thakuria & Dipen Deka, 2024. "A decadal study on identifying latent topics and research trends in open access LIS journals using topic modeling approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 129(7), pages 3841-3869, July.
    5. Pin Li & Guoli Yang & Chuanqi Wang, 2019. "Visual topical analysis of library and information science," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(3), pages 1753-1791, December.
    6. Qian-Jin Zong & Hong-Zhou Shen & Qin-Jian Yuan & Xiao-Wei Hu & Zhi-Ping Hou & Shun-Guo Deng, 2013. "Doctoral dissertations of Library and Information Science in China: A co-word analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 94(2), pages 781-799, February.
    7. Kai Hu & Huayi Wu & Kunlun Qi & Jingmin Yu & Siluo Yang & Tianxing Yu & Jie Zheng & Bo Liu, 2018. "A domain keyword analysis approach extending Term Frequency-Keyword Active Index with Google Word2Vec model," Scientometrics, Springer;Akadémiai Kiadó, vol. 114(3), pages 1031-1068, March.
    8. Yi Bu & Binglu Wang & Win-bin Huang & Shangkun Che & Yong Huang, 2018. "Using the appearance of citations in full text on author co-citation analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(1), pages 275-289, July.
    9. Erjia Yan, 2014. "Topic-based Pagerank: toward a topic-level scientific evaluation," Scientometrics, Springer;Akadémiai Kiadó, vol. 100(2), pages 407-437, August.
    10. Yu-Wei Chang, 2018. "Examining interdisciplinarity of library and information science (LIS) based on LIS articles contributed by non-LIS authors," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(3), pages 1589-1613, September.
    11. Ping Liu & Qiong Wu & Xiangming Mu & Kaipeng Yu & Yiting Guo, 2015. "Detecting the intellectual structure of library and information science based on formal concept analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 104(3), pages 737-762, September.
    12. Carlos G. Figuerola & Francisco Javier García Marco & María Pinto, 2017. "Mapping the evolution of library and information science (1978–2014) using topic modeling on LISA," Scientometrics, Springer;Akadémiai Kiadó, vol. 112(3), pages 1507-1535, September.
    13. Chaoqun Ni & Cassidy R. Sugimoto & Blaise Cronin, 2013. "Visualizing and comparing four facets of scholarly communication: producers, artifacts, concepts, and gatekeepers," Scientometrics, Springer;Akadémiai Kiadó, vol. 94(3), pages 1161-1173, March.
    14. Bo Wang & Shengbo Liu & Kun Ding & Zeyuan Liu & Jing Xu, 2014. "Identifying technological topics and institution-topic distribution probability for patent competitive intelligence analysis: a case study in LTE technology," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(1), pages 685-704, October.
    15. Yan, Erjia, 2014. "Research dynamics: Measuring the continuity and popularity of research topics," Journal of Informetrics, Elsevier, vol. 8(1), pages 98-110.
    16. Hao Wang & Sanhong Deng & Xinning Su, 2016. "A study on construction and analysis of discipline knowledge structure of Chinese LIS based on CSSCI," Scientometrics, Springer;Akadémiai Kiadó, vol. 109(3), pages 1725-1759, December.
    17. Yuen-Hsien Tseng & Ming-Yueh Tsay, 2013. "Journal clustering of library and information science for subfield delineation using the bibliometric analysis toolkit: CATAR," Scientometrics, Springer;Akadémiai Kiadó, vol. 95(2), pages 503-528, May.
    18. Beibei Hu & Xianlei Dong & Chenwei Zhang & Timothy D. Bowman & Ying Ding & Staša Milojević & Chaoqun Ni & Erjia Yan & Vincent Larivière, 2015. "A lead-lag analysis of the topic evolution patterns for preprints and publications," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 66(12), pages 2643-2656, December.
    19. Jianhua Hou & Xiucai Yang & Chaomei Chen, 2018. "Emerging trends and new developments in information science: a document co-citation analysis (2009–2016)," Scientometrics, Springer;Akadémiai Kiadó, vol. 115(2), pages 869-892, May.
    20. Sarah Tiba & Frank J. van Rijnsoever & Marko P. Hekkert, 2019. "Firms with benefits: A systematic review of responsible entrepreneurship and corporate social responsibility literature," Corporate Social Responsibility and Environmental Management, John Wiley & Sons, vol. 26(2), pages 265-284, March.
