
Predicting the future impact of Computer Science researchers: Is there a gender bias?

Author

Listed:
  • Matthias Kuppler

    (University of Siegen)

Abstract

The advent of large-scale bibliographic databases and powerful prediction algorithms has led to calls for data-driven approaches that target scarce funds at researchers with high predicted future scientific impact. The potential side effects and fairness implications of such approaches are unknown, however. Using a large-scale bibliographic data set of N = 111,156 Computer Science researchers active from 1993 to 2016, I build and evaluate a realistic scientific impact prediction model. Given the persistent under-representation of women in Computer Science, the model is audited for disparate impact based on gender. Random forests and Gradient Boosting Machines are used to predict researchers’ h-index in 2010 from their bibliographic profiles in 2005. Based on model predictions, it is determined whether the researcher will become a high-performer with an h-index in the top 25% of the discipline-specific h-index distribution. The models predict the future h-index with an accuracy of $R^2 = 0.875$ and correctly classify 91.0% of researchers as high-performers and low-performers. Overall accuracy does not vary strongly across researcher gender. Nevertheless, there are indications of disparate impact against women. The models under-estimate the true h-index of female researchers more strongly than the h-index of male researchers. Further, women are 8.6% less likely to be predicted to become high-performers than men. In practice, hiring, tenure, and funding decisions that are based on model predictions risk perpetuating the under-representation of women in Computer Science.
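
The audit described in the abstract can be illustrated with a minimal sketch. It is not the author's code: the function names, the use of the inclusive 75th percentile as the top-25% cutoff, and the twelve researchers in the toy data are all assumptions made for illustration. The sketch computes the h-index from a citation list, flags predicted high-performers, and compares the flag rate and the mean prediction error across two groups.

```python
from statistics import mean, quantiles


def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites < rank:
            break
        h = rank
    return h


def top_quartile_flags(values):
    """Flag values at or above the 75th percentile (assumed cutoff rule)."""
    cutoff = quantiles(values, n=4, method="inclusive")[2]
    return [v >= cutoff for v in values]


def positive_rate_by_group(flags, groups):
    """Share of flagged researchers within each group label."""
    return {
        g: mean(1.0 if f else 0.0 for f, gg in zip(flags, groups) if gg == g)
        for g in sorted(set(groups))
    }


def mean_error_by_group(predicted, actual, groups):
    """Mean of (predicted - actual) per group; negative = under-estimation."""
    return {
        g: mean(p - a for p, a, gg in zip(predicted, actual, groups) if gg == g)
        for g in sorted(set(groups))
    }


# Hypothetical toy data, invented for this example (not from the study).
predicted = [14, 11, 9, 8, 6, 5, 5, 4, 3, 3, 2, 1]
actual = [14, 12, 11, 8, 7, 6, 5, 5, 3, 4, 2, 2]
groups = ["m", "m", "f", "m", "f", "f", "m", "f", "m", "f", "m", "f"]

flags = top_quartile_flags(predicted)
rates = positive_rate_by_group(flags, groups)
errors = mean_error_by_group(predicted, actual, groups)

print("predicted high-performer rate by group:", rates)
print("mean prediction error by group:", errors)
print("rate difference (f - m):", rates["f"] - rates["m"])
```

The abstract does not say whether the reported 8.6% gap is an absolute difference in rates or a relative one, so the sketch prints the per-group rates and leaves that choice to the reader.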

Suggested Citation

  • Matthias Kuppler, 2022. "Predicting the future impact of Computer Science researchers: Is there a gender bias?," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6695-6732, November.
  • Handle: RePEc:spr:scient:v:127:y:2022:i:11:d:10.1007_s11192-022-04337-2
    DOI: 10.1007/s11192-022-04337-2

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11192-022-04337-2
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    References listed on IDEAS

    1. Mohsen Jadidi & Fariba Karimi & Haiko Lietz & Claudia Wagner, 2018. "Gender Disparities In Science? Dropout, Productivity, Collaborations And Success Of Male And Female Computer Scientists," Advances in Complex Systems (ACS), World Scientific Publishing Co. Pte. Ltd., vol. 21(03n04), pages 1-23, May.
    2. Camil Demetrescu & Irene Finocchi & Andrea Ribichini & Marco Schaerf, 2020. "On bibliometrics in academic promotions: a case study in computer science and engineering in Italy," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(3), pages 2207-2228, September.
    3. Beaudry, Catherine & Larivière, Vincent, 2016. "Which gender gap? Factors affecting researchers’ scientific impact in science and medicine," Research Policy, Elsevier, vol. 45(9), pages 1790-1817.
    4. Jevin D West & Jennifer Jacquet & Molly M King & Shelley J Correll & Carl T Bergstrom, 2013. "The Role of Gender in Scholarly Authorship," PLOS ONE, Public Library of Science, vol. 8(7), pages 1-6, July.
    5. Dimitris Bertsimas & Erik Brynjolfsson & Shachar Reichman & John Silberholz, 2015. "OR Forum—Tenure Analytics: Models for Predicting Research Impact," Operations Research, INFORMS, vol. 63(6), pages 1246-1261, December.
    6. Bedoor K. AlShebli & Talal Rahwan & Wei Lee Woon, 2018. "The preeminence of ethnic diversity in scientific collaboration," Nature Communications, Nature, vol. 9(1), pages 1-10, December.
    7. Christine Wennerås & Agnes Wold, 1997. "Nepotism and sexism in peer-review," Nature, Nature, vol. 387(6631), pages 341-343, May.
    8. Vincent Larivière & Chaoqun Ni & Yves Gingras & Blaise Cronin & Cassidy R. Sugimoto, 2013. "Bibliometrics: Global gender disparities in science," Nature, Nature, vol. 504(7479), pages 211-213, December.
    9. Seeber, Marco & Cattaneo, Mattia & Meoli, Michele & Malighetti, Paolo, 2019. "Self-citations as strategic response to the use of metrics for career decisions," Research Policy, Elsevier, vol. 48(2), pages 478-491.
    10. Heather Sarsons, 2017. "Recognition for Group Work: Gender Differences in Academia," American Economic Review, American Economic Association, vol. 107(5), pages 141-145, May.
    11. Panagopoulos, George & Tsatsaronis, George & Varlamis, Iraklis, 2017. "Detecting rising stars in dynamic collaborative networks," Journal of Informetrics, Elsevier, vol. 11(1), pages 198-222.
    12. Luke Holman & Devi Stuart-Fox & Cindy E Hauser, 2018. "The gender gap in science: How long until women are equally represented?," PLOS Biology, Public Library of Science, vol. 16(4), pages 1-20, April.
    13. Francine D. Blau & Janet M. Currie & Rachel T. A. Croson & Donna K. Ginther, 2010. "Can Mentoring Help Female Assistant Professors? Interim Results from a Randomized Trial," American Economic Review, American Economic Association, vol. 100(2), pages 348-352, May.
    14. Abramo, Giovanni & Cicero, Tindaro & D’Angelo, Ciriaco Andrea, 2015. "Should the research performance of scientists be distinguished by gender?," Journal of Informetrics, Elsevier, vol. 9(1), pages 25-38.
    15. Diana Hicks & Paul Wouters & Ludo Waltman & Sarah de Rijcke & Ismael Rafols, 2015. "Bibliometrics: The Leiden Manifesto for research metrics," Nature, Nature, vol. 520(7548), pages 429-431, April.
    16. Xiaodan Zhu & Peter Turney & Daniel Lemire & André Vellino, 2015. "Measuring academic influence: Not all citations are equal," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 66(2), pages 408-427, February.
    17. Abramo, Giovanni & D’Angelo, Ciriaco Andrea & Murgia, Gianluca, 2013. "Gender differences in research collaboration," Journal of Informetrics, Elsevier, vol. 7(4), pages 811-822.
    18. Matthew RE Symonds & Neil J Gemmell & Tamsin L Braisher & Kylie L Gorringe & Mark A Elgar, 2006. "Gender Differences in Publication Output: Towards an Unbiased Metric of Research Performance," PLOS ONE, Public Library of Science, vol. 1(1), pages 1-5, December.
    19. Daniel E. Acuna & Stefano Allesina & Konrad P. Kording, 2012. "Predicting scientific success," Nature, Nature, vol. 489(7415), pages 201-202, September.
    20. Yubing Nie & Yifan Zhu & Qika Lin & Sifan Zhang & Pengfei Shi & Zhendong Niu, 2019. "Academic rising star prediction via scholar’s evaluation model and machine learning techniques," Scientometrics, Springer;Akadémiai Kiadó, vol. 120(2), pages 461-476, August.
    21. Pleun Arensbergen & Inge van der Weijden & Peter Besselaar, 2012. "Gender differences in scientific productivity: a persisting phenomenon?," Scientometrics, Springer;Akadémiai Kiadó, vol. 93(3), pages 857-868, December.
    22. Ali Daud & Min Song & Malik Khizar Hayat & Tehmina Amjad & Rabeeh Ayaz Abbasi & Hassan Dawood & Anwar Ghani, 2020. "Finding rising stars in bibliometric networks," Scientometrics, Springer;Akadémiai Kiadó, vol. 124(1), pages 633-661, July.
    23. Amin Mazloumian, 2012. "Predicting Scholars' Scientific Impact," PLOS ONE, Public Library of Science, vol. 7(11), pages 1-5, November.
    24. Alonso, S. & Cabrerizo, F.J. & Herrera-Viedma, E. & Herrera, F., 2009. "h-Index: A review focused in its variants, computation and standardization for different scientific fields," Journal of Informetrics, Elsevier, vol. 3(4), pages 273-289.
    25. Zhiya Zuo & Kang Zhao, 2021. "Understanding and predicting future research impact at different career stages—A social network perspective," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 72(4), pages 454-472, April.
    26. Samreen Ayaz & Nayyer Masood & Muhammad Arshad Islam, 2018. "Predicting scientific impact based on h-index," Scientometrics, Springer;Akadémiai Kiadó, vol. 114(3), pages 993-1010, March.

    Citations

    Cited by:

    1. Yi Zhang & Chengzhi Zhang & Philipp Mayr & Arho Suominen, 2022. "An editorial of “AI + informetrics”: multi-disciplinary interactions in the era of big data," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6503-6507, November.
    2. Fabio Zagonari & Paolo Foschi, 2024. "Coping with the Inequity and Inefficiency of the H-Index: A Cross-Disciplinary Empirical Analysis," Publications, MDPI, vol. 12(2), pages 1-30, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Lu Liu & Benjamin F. Jones & Brian Uzzi & Dashun Wang, 2023. "Data, measurement and empirical methods in the science of science," Nature Human Behaviour, Nature, vol. 7(7), pages 1046-1058, July.
    2. Josh Yamamoto & Eitan Frachtenberg, 2022. "Gender Differences in Collaboration Patterns in Computer Science," Publications, MDPI, vol. 10(1), pages 1-21, February.
    3. Fengyuan Liu & Petter Holme & Matteo Chiesa & Bedoor AlShebli & Talal Rahwan, 2023. "Gender inequality and self-publication are common among academic editors," Nature Human Behaviour, Nature, vol. 7(3), pages 353-364, March.
    4. Wanjun Xia & Tianrui Li & Chongshou Li, 2023. "A review of scientific impact prediction: tasks, features and methods," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(1), pages 543-585, January.
    5. Hajibabaei, Anahita & Schiffauerova, Andrea & Ebadi, Ashkan, 2022. "Gender-specific patterns in the artificial intelligence scientific ecosystem," Journal of Informetrics, Elsevier, vol. 16(2).
    6. Jyoti Paswan & Vivek Kumar Singh, 2020. "Gender and research publishing analyzed through the lenses of discipline, institution types, impact and international collaboration: a case study from India," Scientometrics, Springer;Akadémiai Kiadó, vol. 123(1), pages 497-515, April.
    7. Abramo, Giovanni & Aksnes, Dag W. & D’Angelo, Ciriaco Andrea, 2021. "Gender differences in research performance within and between countries: Italy vs Norway," Journal of Informetrics, Elsevier, vol. 15(2).
    8. Lisa Geraci & Steve Balsis & Alexander J. Busch, 2015. "Gender and the h index in psychology," Scientometrics, Springer;Akadémiai Kiadó, vol. 105(3), pages 2023-2034, December.
    9. Nakajima, Kazuki & Liu, Ruodan & Shudo, Kazuyuki & Masuda, Naoki, 2023. "Quantifying gender imbalance in East Asian academia: Research career and citation practice," Journal of Informetrics, Elsevier, vol. 17(4).
    10. Roberta Ruggieri & Fabrizio Pecoraro & Daniela Luzi, 2021. "An intersectional approach to analyse gender productivity and open access: a bibliometric analysis of the Italian National Research Council," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(2), pages 1647-1673, February.
    11. Marek Kwiek & Wojciech Roszka, 2022. "Are female scientists less inclined to publish alone? The gender solo research gap," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(4), pages 1697-1735, April.
    12. Gen-Chang Hsu & Wei-Jiun Lin & Syuan-Jyun Sun, 2023. "Temporal trends in academic performance and career duration of principal investigators in ecology and evolutionary biology in Taiwan," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(6), pages 3437-3451, June.
    13. Lin Zhang & Yuanyuan Shang & Ying Huang & Gunnar Sivertsen, 2022. "Gender differences among active reviewers: an investigation based on publons," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(1), pages 145-179, January.
    14. Rodrigo Dorantes-Gilardi & Aurora A. Ramírez-Álvarez & Diana Terrazas-Santamaría, 2023. "Is there a differentiated gender effect of collaboration with super-cited authors? Evidence from junior researchers in economics," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(4), pages 2317-2336, April.
    15. Frandsen, Tove Faber & Jacobsen, Rasmus Højbjerg & Wallin, Johan A. & Brixen, Kim & Ousager, Jakob, 2015. "Gender differences in scientific performance: A bibliometric matching analysis of Danish health sciences Graduates," Journal of Informetrics, Elsevier, vol. 9(4), pages 1007-1017.
    16. Zhang, Lin & Shang, Yuanyuan & Huang, Ying & Sivertsen, Gunnar, 2021. "Gender differences among active reviewers: an investigation based on Publons," SocArXiv 4z6w8, Center for Open Science.
    17. Hamid R. Jamali & Alireza Abbasi, 2023. "Gender gaps in Australian research publishing, citation and co-authorship," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(5), pages 2879-2893, May.
    18. Anahita Hajibabaei & Andrea Schiffauerova & Ashkan Ebadi, 2023. "Women and key positions in scientific collaboration networks: analyzing central scientists’ profiles in the artificial intelligence ecosystem through a gender lens," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(2), pages 1219-1240, February.
    19. Liu, Meijun & Zhang, Ning & Hu, Xiao & Jaiswal, Ajay & Xu, Jian & Chen, Hong & Ding, Ying & Bu, Yi, 2022. "Further divided gender gaps in research productivity and collaboration during the COVID-19 pandemic: Evidence from coronavirus-related literature," Journal of Informetrics, Elsevier, vol. 16(2).
    20. Abdelghani Maddi & Yves Gingras, 2021. "Gender Diversity In Research Teams And Citation Impact In Economics And Management," Journal of Economic Surveys, Wiley Blackwell, vol. 35(5), pages 1381-1404, December.
