Recommending research collaborations using link prediction and random forest classifiers

Author

Listed:
  • Raf Guns

    (University of Antwerp)

  • Ronald Rousseau

    (University of Antwerp; KU Leuven)

Abstract

We introduce a method to predict or recommend high-potential future (i.e., not yet realized) collaborations. The proposed method is based on a combination of link prediction and machine learning techniques. First, a weighted co-authorship network is constructed. We calculate scores for each node pair according to different measures called predictors. The resulting scores can be interpreted as indicative of the likelihood of future linkage for the given node pair. To determine the relative merit of each predictor, we train a random forest classifier on older data. The same classifier can then generate predictions for newer data. The top predictions are treated as recommendations for future collaboration. We apply the technique to research collaborations between cities in Africa, the Middle East and South-Asia, focusing on the topics of malaria and tuberculosis. Results show that the method yields accurate recommendations. Moreover, the method can be used to determine the relative strengths of each predictor.
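As a minimal sketch of the pipeline the abstract describes, the Python below builds a weighted co-authorship network, scores every unlinked node pair with a few predictors, and labels each pair by whether the link appears in a later period. The abstract does not say which predictors are used, so the four here (common neighbours, Jaccard, Adamic/Adar, and a weighted common-neighbours variant) are assumptions chosen for illustration, and the node names and toy paper lists are hypothetical.

```python
import math
from itertools import combinations


def build_network(papers):
    """Weighted co-authorship network: edge weight = number of joint papers."""
    graph = {}
    for nodes in papers:
        for node in nodes:
            graph.setdefault(node, {})
        for a, b in combinations(sorted(set(nodes)), 2):
            graph[a][b] = graph[a].get(b, 0) + 1
            graph[b][a] = graph[b].get(a, 0) + 1
    return graph


def shared(graph, a, b):
    """Neighbours that a and b have in common."""
    return graph[a].keys() & graph[b].keys()


def common_neighbours(graph, a, b):
    return len(shared(graph, a, b))


def jaccard(graph, a, b):
    union = graph[a].keys() | graph[b].keys()
    return len(shared(graph, a, b)) / len(union) if union else 0.0


def adamic_adar(graph, a, b):
    # A shared neighbour is linked to both a and b, so its degree is >= 2
    # and the logarithm is strictly positive.
    return sum(1.0 / math.log(len(graph[z])) for z in shared(graph, a, b))


def weighted_common_neighbours(graph, a, b):
    # Sums the link weights on both legs of each a-z-b path.
    return sum(graph[a][z] + graph[b][z] for z in shared(graph, a, b))


# Assumed predictor set; not taken from the paper.
PREDICTORS = (common_neighbours, jaccard, adamic_adar, weighted_common_neighbours)


def training_rows(old_graph, new_graph):
    """Features come from the older network; the label is 1 if the pair
    is linked in the newer network, else 0."""
    rows = []
    for a, b in combinations(sorted(old_graph), 2):
        if b in old_graph[a]:
            continue  # only not-yet-realized collaborations are candidates
        features = [predict(old_graph, a, b) for predict in PREDICTORS]
        label = int(b in new_graph.get(a, {}))
        rows.append(((a, b), features, label))
    return rows


# Hypothetical toy data: each tuple lists the nodes on one paper.
old_papers = [("A", "B"), ("A", "B"), ("B", "C"), ("A", "D"), ("C", "D"), ("D", "E")]
new_papers = old_papers + [("A", "C")]

old_graph = build_network(old_papers)
new_graph = build_network(new_papers)
rows = training_rows(old_graph, new_graph)

for pair, features, label in rows:
    print(pair, [round(value, 3) for value in features], label)
```

The feature lists and labels in `rows` could then be passed to any random forest implementation (for example, scikit-learn's `RandomForestClassifier`) to train on the older period. The fitted classifier would score candidate pairs from the newer network, and its feature importances would give one way to compare the predictors.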

Suggested Citation

  • Raf Guns & Ronald Rousseau, 2014. "Recommending research collaborations using link prediction and random forest classifiers," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(2), pages 1461-1473, November.
  • Handle: RePEc:spr:scient:v:101:y:2014:i:2:d:10.1007_s11192-013-1228-9
    DOI: 10.1007/s11192-013-1228-9

    References listed on IDEAS

    1. Naoki Shibata & Yuya Kajikawa & Ichiro Sakata, 2012. "Link prediction in citation networks," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 63(1), pages 78-85, January.
    2. Leo Egghe & Ronald Rousseau, 2003. "A measure for the cohesion of weighted networks," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 54(3), pages 193-202, February.
    3. Nees Jan Eck & Ludo Waltman, 2010. "Software survey: VOSviewer, a computer program for bibliometric mapping," Scientometrics, Springer;Akadémiai Kiadó, vol. 84(2), pages 523-538, August.
    4. Nelius Boshoff, 2010. "South–South research collaboration of countries in the Southern African Development Community (SADC)," Scientometrics, Springer;Akadémiai Kiadó, vol. 84(2), pages 481-503, August.
    5. Torben Schubert & Radhamany Sooryamoorthy, 2010. "Can the centre–periphery model explain patterns of international scientific collaboration among threshold and industrialised countries? The case of South Africa and Germany," Scientometrics, Springer;Akadémiai Kiadó, vol. 83(1), pages 181-203, April.
    6. Frenken, Koen & Hardeman, Sjoerd & Hoekman, Jarno, 2009. "Spatial scientometrics: Towards a cumulative research program," Journal of Informetrics, Elsevier, vol. 3(3), pages 222-232.
    7. Leo Katz, 1953. "A new status index derived from sociometric analysis," Psychometrika, Springer;The Psychometric Society, vol. 18(1), pages 39-43, March.

    Citations

    Cited by:

    1. Ali Daud & Min Song & Malik Khizar Hayat & Tehmina Amjad & Rabeeh Ayaz Abbasi & Hassan Dawood & Anwar Ghani, 2020. "Finding rising stars in bibliometric networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(1), pages 633-661, July.
    2. Behrouzi, Saman & Shafaeipour Sarmoor, Zahra & Hajsadeghi, Khosrow & Kavousi, Kaveh, 2020. "Predicting scientific research trends based on link prediction in keyword networks," Journal of Informetrics, Elsevier, vol. 14(4).
    3. Huang, Lu & Chen, Xiang & Ni, Xingxing & Liu, Jiarun & Cao, Xiaoli & Wang, Changtian, 2021. "Tracking the dynamics of co-word networks for emerging topic identification," Technological Forecasting and Social Change, Elsevier, vol. 170(C).
    4. Eustache Mêgnigbêto, 2018. "Correlation Between Transmission Power and Some Indicators Used to Measure the Knowledge-Based Economy: Case of Six OECD Countries," Journal of the Knowledge Economy, Springer;Portland International Center for Management of Engineering and Technology (PICMET), vol. 9(4), pages 1168-1183, December.
    5. Liat Ayalon & Inbal Yahav, 2019. "Location, location, location: Close ties among older continuing care retirement community residents," PLOS ONE, Public Library of Science, vol. 14(11), pages 1-17, November.
    6. Ali Daud & Muhammad Ahmad & M. S. I. Malik & Dunren Che, 2015. "Using machine learning techniques for rising star prediction in co-author network," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(2), pages 1687-1711, February.
    7. Mêgnigbêto, Eustache, 2018. "Modelling the Triple Helix of university-industry-government relationships with game theory: Core, Shapley value and nucleolus as indicators of synergy within an innovation system," Journal of Informetrics, Elsevier, vol. 12(4), pages 1118-1132.
    8. Bütün, Ertan & Kaya, Mehmet, 2019. "A pattern based supervised link prediction in directed complex networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 525(C), pages 1136-1145.
    9. Jinseok Kim & Jana Diesner, 2019. "Formational bounds of link prediction in collaboration networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(2), pages 687-706, May.
    10. Ting Xiong & Liang Zhou & Ying Zhao & Xiaojuan Zhang, 2022. "Mining semantic information of co-word network to improve link prediction performance," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(6), pages 2981-3004, June.
    11. Ma, Jing & Abrams, Natalie F. & Porter, Alan L. & Zhu, Donghua & Farrell, Dorothy, 2019. "Identifying translational indicators and technology opportunities for nanomedical research using tech mining: The case of gold nanostructures," Technological Forecasting and Social Change, Elsevier, vol. 146(C), pages 767-775.
    12. Xiaowen Xi & Jiaqi Wei & Ying Guo & Weiyu Duan, 2022. "Academic collaborations: a recommender framework spanning research interests and network topology," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6787-6808, November.
    13. Yan Qi & Xin Zhang & Zhengyin Hu & Bin Xiang & Ran Zhang & Shu Fang, 2022. "Choosing the right collaboration partner for innovation: a framework based on topic analysis and link prediction," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(9), pages 5519-5550, September.
    14. Jing Ma & Yaohui Pan & Chih-Yi Su, 2022. "Organization-oriented technology opportunities analysis based on predicting patent networks: a case of Alzheimer’s disease," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(9), pages 5497-5517, September.
    15. Tranos, Emmanouil & Incera, Andre Carrascal & Willis, George, 2022. "Using the web to predict regional trade flows: data extraction, modelling, and validation," OSF Preprints 9bu5z, Center for Open Science.
    16. Song Wang & Jiexin Wang & Chenqi Wei & Xueli Wang & Fei Fan, 2021. "Collaborative innovation efficiency: From within cities to between cities—Empirical analysis based on innovative cities in China," Growth and Change, Wiley Blackwell, vol. 52(3), pages 1330-1360, September.
    17. Guns, Raf & Wang, Lili, 2017. "Detecting the emergence of new scientific collaboration links in Africa: A comparison of expected and realized collaboration intensities," Journal of Informetrics, Elsevier, vol. 11(3), pages 892-903.
    18. Lu Huang & Xiang Chen & Yi Zhang & Yihe Zhu & Suyi Li & Xingxing Ni, 2021. "Dynamic network analytics for recommending scientific collaborators," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(11), pages 8789-8814, November.
    19. Orzechowski, Kamil P. & Mrowinski, Maciej J. & Fronczak, Agata & Fronczak, Piotr, 2023. "Asymmetry of social interactions and its role in link predictability: The case of coauthorship networks," Journal of Informetrics, Elsevier, vol. 17(2).
    20. Rodica Ioana Lung & Noémi Gaskó & Mihai Alexandru Suciu, 2018. "A hypergraph model for representing scientific output," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(3), pages 1361-1379, December.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Nazim Choudhury & Shahadat Uddin, 2016. "Time-aware link prediction to explore network effects on temporal knowledge evolution," Scientometrics, Springer;Akadémiai Kiadó, vol. 108(2), pages 745-776, August.
    2. Yan, Erjia & Guns, Raf, 2014. "Predicting and recommending collaborations: An author-, institution-, and country-level analysis," Journal of Informetrics, Elsevier, vol. 8(2), pages 295-309.
    3. Guns, Raf & Wang, Lili, 2017. "Detecting the emergence of new scientific collaboration links in Africa: A comparison of expected and realized collaboration intensities," Journal of Informetrics, Elsevier, vol. 11(3), pages 892-903.
    4. Luis Araya-Castillo & Felipe Hernández-Perlines & Hugo Moraga & Antonio Ariza-Montes, 2021. "Scientometric Analysis of Research on Socioemotional Wealth," Sustainability, MDPI, vol. 13(7), pages 1-26, March.
    5. Ángel Acevedo-Duque & Alejandro Vega-Muñoz & Guido Salazar-Sepúlveda, 2020. "Analysis of Hospitality, Leisure, and Tourism Studies in Chile," Sustainability, MDPI, vol. 12(18), pages 1-20, September.
    6. Yichi Zhang & Zhiliang Dong & Sen Liu & Peixiang Jiang & Cuizhi Zhang & Chao Ding, 2021. "Forecast of International Trade of Lithium Carbonate Products in Importing Countries and Small-Scale Exporting Countries," Sustainability, MDPI, vol. 13(3), pages 1-23, January.
    7. Sasaki, Hajime & Sakata, Ichiro, 2021. "Identifying potential technological spin-offs using hierarchical information in international patent classification," Technovation, Elsevier, vol. 100(C).
    8. Adilson Vital & Diego R. Amancio, 2022. "A comparative analysis of local similarity metrics and machine learning approaches: application to link prediction in author citation networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(10), pages 6011-6028, October.
    9. Bornmann, Lutz & Waltman, Ludo, 2011. "The detection of “hot regions” in the geography of science—A visualization approach by using density maps," Journal of Informetrics, Elsevier, vol. 5(4), pages 547-553.
    10. Wang, Feifei & Dong, Jiaxin & Lu, Wanzhao & Xu, Shuo, 2023. "Collaboration prediction based on multilayer all-author tripartite citation networks: A case study of gene editing," Journal of Informetrics, Elsevier, vol. 17(1).
    11. Dosso, Mafini & Cassi, Lorenzo & Mescheba, Wilfriedo, 2023. "Towards regional scientific integration in Africa? Evidence from co-publications," Research Policy, Elsevier, vol. 52(1).
    12. Nelson Casimiro Zavale & Patrício Vitorino Langa, 2018. "University-industry linkages’ literature on Sub-Saharan Africa: systematic literature review and bibliometric account," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(1), pages 1-49, July.
    13. Copiello, Sergio, 2019. "Peer and neighborhood effects: Citation analysis using a spatial autoregressive model and pseudo-spatial data," Journal of Informetrics, Elsevier, vol. 13(1), pages 238-254.
    14. Zhi Li & Qinke Peng & Che Liu, 2016. "Two citation-based indicators to measure latent referential value of papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 108(3), pages 1299-1313, September.
    15. Jing Ma & Yaohui Pan & Chih-Yi Su, 2022. "Organization-oriented technology opportunities analysis based on predicting patent networks: a case of Alzheimer’s disease," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(9), pages 5497-5517, September.
    16. Elizabeth S. Vieira, 2022. "International research collaboration in Africa: a bibliometric and thematic analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(5), pages 2747-2772, May.
    17. Lutz Bornmann & Loet Leydesdorff, 2011. "Which cities produce more excellent papers than can be expected? A new mapping approach, using Google Maps, based on statistical significance testing," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 62(10), pages 1954-1962, October.
    18. Giorgia Bondanini & Gabriele Giorgi & Antonio Ariza-Montes & Alejandro Vega-Muñoz & Paola Andreucci-Annunziata, 2020. "Technostress Dark Side of Technology in the Workplace: A Scientometric Analysis," IJERPH, MDPI, vol. 17(21), pages 1-23, October.
    19. Tofighy, Sajjad & Charkari, Nasrollah Moghadam & Ghaderi, Foad, 2022. "Link prediction in multiplex networks using intralayer probabilistic distance and interlayer co-evolving factors," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 606(C).
    20. Mario Morales-Parragué & Luis Araya-Castillo & Fidel Molina-Luque & Hugo Moraga-Flores, 2022. "Scientometric Analysis of Research on Corporate Social Responsibility," Sustainability, MDPI, vol. 14(4), pages 1-22, February.
