IDEAS home Printed from https://ideas.repec.org/a/eee/infome/v17y2023i2s175115772300024x.html

Predicting the evolution of scientific communities by interpretable machine learning approaches

Author

Listed:
  • Tian, Yunpei
  • Li, Gang
  • Mao, Jin

Abstract

Scientific communities serve as a fundamental structure of academic activity, and their evolutionary behavior reveals the development of science. To track the evolution of scientific communities and probe the mechanisms behind it, we formulate the task of event-based Group Evolution Prediction and apply interpretable machine learning approaches to it. Seven evolution events are defined for prediction based on the evolution chains of scientific communities detected in the collaboration network. Using a detailed feature set that includes topological, external, core node, and temporal attributes, Extreme Gradient Boosting and Random Forest are adopted as the prediction models. Experiments on a Library and Information Science dataset show that Random Forest performs best, with F1 scores greater than 0.60 for five of the events. The Shapley Additive exPlanations (SHAP) measure is applied to interpret the best model, i.e., to quantify the contributions of features. Connectivity within a community is observed to have the most crucial influence, while community size, research topic consistency, research topic diversity, average node age, and the ratio of intermediary nodes also play vital roles. The proposed methodology offers a way to uncover the underlying mechanisms of the evolution of scientific communities, and the findings could help scholars and policymakers monitor scientific communities and take proactive actions.
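
The abstract describes using Shapley-based attributions to quantify how much each feature contributes to a prediction. As a rough illustration of the underlying idea only, the sketch below computes exact Shapley values by enumerating feature coalitions. It is not the authors' code and does not use the SHAP library; the function names, the toy value function, and all numeric scores are hypothetical, and the feature names are merely borrowed from the abstract.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values: each feature's weighted average marginal
    contribution over every coalition of the remaining features."""
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k in the Shapley formula.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for coalition in combinations(others, k):
                s = frozenset(coalition)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_value(coalition):
    """Hypothetical model score when only the features in `coalition`
    are available. All numbers are invented for illustration."""
    score = 0.0
    if "connectivity" in coalition:
        score += 0.30
    if "size" in coalition:
        score += 0.10
    if "connectivity" in coalition and "size" in coalition:
        score += 0.06  # assumed interaction, split equally by Shapley
    if "avg_node_age" in coalition:
        score += 0.04
    return score


features = ["connectivity", "size", "avg_node_age"]
phi = shapley_values(features, toy_value)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.2f}")
```

In this toy setting the attributions sum to the difference between the score with all features and the score with none, which is the property that makes Shapley values usable as feature contributions. Exact enumeration grows exponentially with the number of features, so practical tools rely on model-specific or approximate algorithms instead.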

Suggested Citation

  • Tian, Yunpei & Li, Gang & Mao, Jin, 2023. "Predicting the evolution of scientific communities by interpretable machine learning approaches," Journal of Informetrics, Elsevier, vol. 17(2).
  • Handle: RePEc:eee:infome:v:17:y:2023:i:2:s175115772300024x
    DOI: 10.1016/j.joi.2023.101399

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S175115772300024X
    Download Restriction: Full text for ScienceDirect subscribers only



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ma, Guoshuai & Yuhua, Qian & Zhang, Yayu & Yan, Hongren & Cheng, Honghong & Hu, Zhiguo, 2022. "The recognition of kernel research team," Journal of Informetrics, Elsevier, vol. 16(4).
    2. Ping Liu & Qiong Wu & Xiangming Mu & Kaipeng Yu & Yiting Guo, 2015. "Detecting the intellectual structure of library and information science based on formal concept analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 104(3), pages 737-762, September.
    3. Karimi, Fatemeh & Lotfi, Shahriar & Izadkhah, Habib, 2021. "Community-guided link prediction in multiplex networks," Journal of Informetrics, Elsevier, vol. 15(4).
    4. Dongqing Lyu & Kaile Gong & Xuanmin Ruan & Ying Cheng & Jiang Li, 2021. "Does research collaboration influence the “disruption” of articles? Evidence from neurosciences," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(1), pages 287-303, January.
    5. Fontana, Magda & Iori, Martina & Leone Sciabolazza, Valerio & Souza, Daniel, 2022. "The interdisciplinarity dilemma: Public versus private interests," Research Policy, Elsevier, vol. 51(7).
    6. Smith, Thomas Bryan & Vacca, Raffaele & Krenz, Till & McCarty, Christopher, 2021. "Great minds think alike, or do they often differ? Research topic overlap and the formation of scientific teams," Journal of Informetrics, Elsevier, vol. 15(1).
    7. Liu, Meijun & Jaiswal, Ajay & Bu, Yi & Min, Chao & Yang, Sijie & Liu, Zhibo & Acuña, Daniel & Ding, Ying, 2022. "Team formation and team impact: The balance between team freshness and repeat collaboration," Journal of Informetrics, Elsevier, vol. 16(4).
    8. de Frutos-Belizón, Jesús & García-Carbonell, Natalia & Ruíz-Martínez, Marta & Sánchez-Gardey, Gonzalo, 2023. "Disentangling international research collaboration in the Spanish academic context: Is there a desirable researcher human capital profile?," Research Policy, Elsevier, vol. 52(6).
    9. Tracy Klarenbeek & Nelius Boshoff, 2018. "Measuring multidisciplinary health research at South African universities: a comparative analysis based on co-authorships and journal subject categories," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(3), pages 1461-1485, September.
    10. Anna Małgorzata Kamińska & Łukasz Opaliński & Łukasz Wyciślik, 2022. "The Landscapes of Sustainability in the Library and Information Science: Collaboration Insights," Sustainability, MDPI, vol. 14(24), pages 1-23, December.
    11. Lu Liu & Benjamin F. Jones & Brian Uzzi & Dashun Wang, 2023. "Data, measurement and empirical methods in the science of science," Nature Human Behaviour, Nature, vol. 7(7), pages 1046-1058, July.
    12. Pin Li & Guoli Yang & Chuanqi Wang, 2019. "Visual topical analysis of library and information science," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(3), pages 1753-1791, December.
    13. Jian Xu & Yi Bu & Ying Ding & Sinan Yang & Hongli Zhang & Chen Yu & Lin Sun, 2018. "Understanding the formation of interdisciplinary research from the perspective of keyword evolution: a case study on joint attention," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(2), pages 973-995, November.
    14. Marian-Gabriel Hâncean & Matjaž Perc & Jürgen Lerner, 2021. "The coauthorship networks of the most productive European researchers," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(1), pages 201-224, January.
    15. Shiji Chen & Clément Arsenault & Yves Gingras & Vincent Larivière, 2015. "Exploring the interdisciplinary evolution of a discipline: the case of Biochemistry and Molecular Biology," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(2), pages 1307-1323, February.
    16. Hoekman, Jarno & Rake, Bastian, 2024. "Geography of authorship: How geography shapes authorship attribution in big team science," Research Policy, Elsevier, vol. 53(2).
    17. María Pinto & Rosaura Fernández-Pascual & David Caballero-Mariscal & Dora Sales, 2020. "Information literacy trends in higher education (2006–2019): visualizing the emerging field of mobile information literacy," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(2), pages 1479-1510, August.
    18. Yu, Shuo & Alqahtani, Fayez & Tolba, Amr & Lee, Ivan & Jia, Tao & Xia, Feng, 2022. "Collaborative Team Recognition: A Core Plus Extension Structure," Journal of Informetrics, Elsevier, vol. 16(4).
    19. Önder, Ali Sina & Schweitzer, Sascha & Yilmazkuday, Hakan, 2021. "Specialization, field distance, and quality in economists’ collaborations," Journal of Informetrics, Elsevier, vol. 15(4).
    20. Hans Pohl, 2021. "Internationalisation, innovation, and academic–corporate co-publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1329-1358, February.
