
Formational bounds of link prediction in collaboration networks

Author

Listed:
  • Jinseok Kim

    (University of Michigan)

  • Jana Diesner

    (University of Illinois at Urbana-Champaign)

Abstract

Link prediction in collaboration networks is often solved by identifying structural properties of existing nodes that are disconnected at one point in time and that share a link later on. The maximum possible recall rate, or upper bound, of this approach’s success is capped by the proportion of links that are formed among existing nodes embedded in these properties. Consequently, sustained links as well as links that involve one or two new network participants are typically not predicted. The purpose of this study is to highlight formational constraints that need to be considered to increase the practical value of link prediction methods targeted at collaboration networks. In this study, we identify the distribution of basic link formation types based on four large-scale, over-time collaboration networks, showing that, roughly speaking, 25% of links represent continued collaborations, 25% of links are new collaborations between existing authors, and 50% are formed between an existing author and a new network member. This implies that for collaboration networks, increasing the accuracy of computational link prediction solutions may not be a reasonable goal when the ratio of collaboration links that are eligible for the classic link prediction process is low.
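
The link formation types named in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `classify_links` and `recall_upper_bound`, the category labels, and the toy edge lists are all hypothetical, and the sketch assumes undirected links, a single training period followed by a single test period, and a recall denominator made up of all distinct test-period links.

```python
from collections import Counter


def classify_links(train_edges, test_edges):
    """Count test-period links by formation type relative to the training period.

    Assumption: links are undirected pairs of distinct node labels.
    """
    train_links = {frozenset(e) for e in train_edges}
    train_nodes = {node for link in train_links for node in link}
    counts = Counter()
    for link in {frozenset(e) for e in test_edges}:
        if len(link) != 2:
            continue  # skip self-loops, which have no formation type here
        known = sum(1 for node in link if node in train_nodes)
        if link in train_links:
            counts["continued"] += 1             # sustained collaboration
        elif known == 2:
            counts["new_between_existing"] += 1  # eligible for classic prediction
        elif known == 1:
            counts["existing_with_new"] += 1     # one new network participant
        else:
            counts["both_new"] += 1              # two new network participants
    return counts


def recall_upper_bound(counts):
    """Share of test links that classic link prediction could recover at best."""
    total = sum(counts.values())
    return counts["new_between_existing"] / total if total else 0.0


# Toy data (hypothetical): authors a, b, c exist in the training period.
train = [("a", "b"), ("b", "c")]
test = [("a", "b"), ("a", "c"), ("c", "d"), ("e", "f")]
counts = classify_links(train, test)
print(recall_upper_bound(counts))  # 0.25
```

In this toy example only the link between `a` and `c` connects two existing but previously disconnected authors, so a predictor restricted to such pairs could recover at most one of the four test links.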

Suggested Citation

  • Jinseok Kim & Jana Diesner, 2019. "Formational bounds of link prediction in collaboration networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 119(2), pages 687-706, May.
  • Handle: RePEc:spr:scient:v:119:y:2019:i:2:d:10.1007_s11192-019-03055-6
    DOI: 10.1007/s11192-019-03055-6

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-019-03055-6
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s11192-019-03055-6?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    References listed on IDEAS

    1. David Liben‐Nowell & Jon Kleinberg, 2007. "The link‐prediction problem for social networks," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 58(7), pages 1019-1031, May.
    2. Kim, Jinseok & Diesner, Jana, 2015. "The effect of data pre-processing on understanding the evolution of collaboration networks," Journal of Informetrics, Elsevier, vol. 9(1), pages 226-236.
    3. Tibor Braun & Wolfgang Glänzel & András Schubert, 2001. "Publication and cooperation patterns of the authors of neuroscience journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 51(3), pages 499-510, July.
    4. Guillaume Cabanac & Gilles Hubert & Béatrice Milard, 2015. "Academic careers in Computer Science: continuance and transience of lifetime co-authorships," Scientometrics, Springer;Akadémiai Kiadó, vol. 102(1), pages 135-150, January.
    5. Staša Milojević, 2010. "Modes of collaboration in modern science: Beyond power laws and preferential attachment," Journal of the American Society for Information Science and Technology, Association for Information Science & Technology, vol. 61(7), pages 1410-1423, July.
    6. Jinseok Kim & Liang Tao & Seok-Hyoung Lee & Jana Diesner, 2016. "Evolution and structure of scientific co-publishing network in Korea between 1948–2011," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(1), pages 27-41, April.
    7. Barabási, A.L & Jeong, H & Néda, Z & Ravasz, E & Schubert, A & Vicsek, T, 2002. "Evolution of the social network of scientific collaborations," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 311(3), pages 590-614.
    8. Brent D Fegley & Vetle I Torvik, 2013. "Has Large-Scale Named-Entity Network Analysis Been Resting on a Flawed Assumption?," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-16, July.
    9. Jinseok Kim, 2018. "Evaluating author name disambiguation for digital libraries: a case of DBLP," Scientometrics, Springer;Akadémiai Kiadó, vol. 116(3), pages 1867-1886, September.
    10. Jinseok Kim & Jana Diesner, 2016. "Distortive effects of initial-based name disambiguation on measurements of large-scale coauthorship networks," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 67(6), pages 1446-1461, June.
    11. Lü, Linyuan & Zhou, Tao, 2011. "Link prediction in complex networks: A survey," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 390(6), pages 1150-1170.
    12. Wagner, Caroline S. & Leydesdorff, Loet, 2005. "Network structure, self-organization, and the growth of international collaboration in science," Research Policy, Elsevier, vol. 34(10), pages 1608-1618, December.
    13. Raf Guns & Ronald Rousseau, 2014. "Recommending research collaborations using link prediction and random forest classifiers," Scientometrics, Springer;Akadémiai Kiadó, vol. 101(2), pages 1461-1473, November.
    14. Yan, Erjia & Guns, Raf, 2014. "Predicting and recommending collaborations: An author-, institution-, and country-level analysis," Journal of Informetrics, Elsevier, vol. 8(2), pages 295-309.

    Citations

    Citations are extracted by the CitEc Project, subscribe to its RSS feed for this item.

    Cited by:

    1. Yueran Duan & Qing Guan, 2021. "Predicting potential knowledge convergence of solar energy: bibliometric analysis based on link prediction model," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(5), pages 3749-3773, May.
    2. Patrick Doreian & Andrej Mrvar, 2022. "Public issues, policy proposals, social movements, and the interests of the Koch Brothers network of allies," Quality & Quantity: International Journal of Methodology, Springer, vol. 56(1), pages 305-332, February.
    3. Lu Huang & Xiang Chen & Yi Zhang & Yihe Zhu & Suyi Li & Xingxing Ni, 2021. "Dynamic network analytics for recommending scientific collaborators," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(11), pages 8789-8814, November.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zheng Xie, 2019. "A cooperative game model for the multimodality of coauthorship networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(1), pages 503-519, October.
    2. J. Sylvan Katz & Guillermo Armando Ronda-Pupo, 2019. "Cooperation, scale-invariance and complex innovation systems: a generalization," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(2), pages 1045-1065, November.
    3. Jinseok Kim, 2019. "A fast and integrative algorithm for clustering performance evaluation in author name disambiguation," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(2), pages 661-681, August.
    4. Alberto Pepe & Marko A. Rodriguez, 2010. "Collaboration in sensor network research: an in-depth longitudinal analysis of assortative mixing patterns," Scientometrics, Springer;Akadémiai Kiadó, vol. 84(3), pages 687-701, September.
    5. Jinseok Kim & Jenna Kim, 2018. "The impact of imbalanced training data on machine learning for author name disambiguation," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(1), pages 511-526, October.
    6. Huang, Lu & Chen, Xiang & Ni, Xingxing & Liu, Jiarun & Cao, Xiaoli & Wang, Changtian, 2021. "Tracking the dynamics of co-word networks for emerging topic identification," Technological Forecasting and Social Change, Elsevier, vol. 170(C).
    7. Zheng Xie & Zonglin Xie & Miao Li & Jianping Li & Dongyun Yi, 2017. "Modeling the coevolution between citations and coauthorship of scientific papers," Scientometrics, Springer;Akadémiai Kiadó, vol. 112(1), pages 483-507, July.
    8. Lu Huang & Xiang Chen & Yi Zhang & Yihe Zhu & Suyi Li & Xingxing Ni, 2021. "Dynamic network analytics for recommending scientific collaborators," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(11), pages 8789-8814, November.
    9. Avishag Gordon, 2007. "Transient and continuant authors in a research field: The case of terrorism," Scientometrics, Springer;Akadémiai Kiadó, vol. 72(2), pages 213-224, August.
    10. Kim, Jinseok & Diesner, Jana, 2015. "The effect of data pre-processing on understanding the evolution of collaboration networks," Journal of Informetrics, Elsevier, vol. 9(1), pages 226-236.
    11. Brent D Fegley & Vetle I Torvik, 2013. "Has Large-Scale Named-Entity Network Analysis Been Resting on a Flawed Assumption?," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-16, July.
    12. Xiaowen Xi & Jiaqi Wei & Ying Guo & Weiyu Duan, 2022. "Academic collaborations: a recommender framework spanning research interests and network topology," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6787-6808, November.
    13. Chao Lu & Yingyi Zhang & Yong‐Yeol Ahn & Ying Ding & Chenwei Zhang & Dandan Ma, 2020. "Co‐contributorship network and division of labor in individual scientific collaborations," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 71(10), pages 1162-1178, October.
    14. Jinseok Kim & Liang Tao & Seok-Hyoung Lee & Jana Diesner, 2016. "Evolution and structure of scientific co-publishing network in Korea between 1948–2011," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(1), pages 27-41, April.
    15. Yan Qi & Xin Zhang & Zhengyin Hu & Bin Xiang & Ran Zhang & Shu Fang, 2022. "Choosing the right collaboration partner for innovation: a framework based on topic analysis and link prediction," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(9), pages 5519-5550, September.
    16. Yu, Jiating & Wu, Ling-Yun, 2022. "Multiple Order Local Information model for link prediction in complex networks," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 600(C).
    17. Jinseok Kim & Jenna Kim & Jason Owen‐Smith, 2021. "Ethnicity‐based name partitioning for author name disambiguation using supervised machine learning," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 72(8), pages 979-994, August.
    18. Jeppe Nicolaisen & Tove Faber Frandsen, 2022. "Epistemic community formation: a bibliometric study of recurring authors in medical journals," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(7), pages 4167-4189, July.
    19. Inoue, Masaaki & Pham, Thong & Shimodaira, Hidetoshi, 2020. "Joint estimation of non-parametric transitivity and preferential attachment functions in scientific co-authorship networks," Journal of Informetrics, Elsevier, vol. 14(3).
    20. Liu, Junwan & Guo, Xiaofei & Xu, Shuo & Song, Yinglu & Ding, Kaiyue, 2023. "A new interpretation of scientific collaboration patterns from the perspective of symbiosis: An investigation for long-term collaboration in publications," Journal of Informetrics, Elsevier, vol. 17(1).
