
Avoiding bias when inferring race using name-based approaches

Author

Listed:
  • Diego Kozlowski
  • Dakota S Murray
  • Alexis Bell
  • Will Hulsey
  • Vincent Larivière
  • Thema Monroe-White
  • Cassidy R Sugimoto

Abstract

Racial disparity in academia is a widely acknowledged problem. The quantitative understanding of racial-based systemic inequalities is an important step towards a more equitable research system. However, because of the lack of robust information on authors’ race, few large-scale analyses have been performed on this topic. Algorithmic approaches offer one solution, using known information about authors, such as their names, to infer their perceived race. As with any other algorithm, the process of racial inference can generate biases if it is not carefully considered. The goal of this article is to assess the extent to which algorithmic bias is introduced using different approaches for name-based racial inference. We use information from the U.S. Census and mortgage applications to infer the race of U.S. affiliated authors in the Web of Science. We estimate the effects of using given and family names, thresholds or continuous distributions, and imputation. Our results demonstrate that the validity of name-based inference varies by race/ethnicity and that threshold approaches underestimate Black authors and overestimate White authors. We conclude with recommendations to avoid potential biases. This article lays the foundation for more systematic and less-biased investigations into racial disparities in science.
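The contrast the abstract draws between threshold approaches and continuous distributions can be sketched on a toy example. This is only a minimal illustration under stated assumptions, not the authors' implementation. The name keys, the probability values, the group labels, and the helper functions `fractional_counts`, `threshold_counts`, and `shares` are all hypothetical and invented here; none of them come from the Census data, the mortgage data, or the article.

```python
from collections import defaultdict

# Hypothetical P(group | name) values, invented purely for illustration.
# They are NOT taken from the U.S. Census, mortgage applications, or the article.
NAME_PROBS = {
    "name_a": {"White": 0.95, "Black": 0.03, "Other": 0.02},
    "name_b": {"White": 0.60, "Black": 0.35, "Other": 0.05},
    "name_c": {"White": 0.50, "Black": 0.45, "Other": 0.05},
    "name_d": {"White": 0.10, "Black": 0.05, "Other": 0.85},
}

# One hypothetical author per name.
AUTHORS = ["name_a", "name_b", "name_c", "name_d"]


def fractional_counts(names):
    """Continuous approach: each author adds their full probability vector."""
    totals = defaultdict(float)
    for name in names:
        for group, prob in NAME_PROBS[name].items():
            totals[group] += prob
    return dict(totals)


def threshold_counts(names, threshold):
    """Threshold approach: assign the most likely group only if its
    probability reaches the threshold; otherwise drop the author."""
    totals = defaultdict(float)
    for name in names:
        group, prob = max(NAME_PROBS[name].items(), key=lambda kv: kv[1])
        if prob >= threshold:
            totals[group] += 1.0
    return dict(totals)


def shares(counts):
    """Normalize counts so that the group shares sum to 1."""
    total = sum(counts.values())
    return {group: count / total for group, count in counts.items()}


if __name__ == "__main__":
    print("fractional:", shares(fractional_counts(AUTHORS)))
    print("threshold :", shares(threshold_counts(AUTHORS, threshold=0.6)))
```

In this constructed example no name is predominantly associated with the Black group, so the threshold rule assigns no author to it, while the fractional rule still carries that probability mass into the aggregate. The numbers were chosen to show the mechanism and are not evidence of its size in real data.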

Suggested Citation

  • Diego Kozlowski & Dakota S Murray & Alexis Bell & Will Hulsey & Vincent Larivière & Thema Monroe-White & Cassidy R Sugimoto, 2022. "Avoiding bias when inferring race using name-based approaches," PLOS ONE, Public Library of Science, vol. 17(3), pages 1-16, March.
  • Handle: RePEc:plo:pone00:0264270
    DOI: 10.1371/journal.pone.0264270

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0264270
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0264270&type=printable
    Download Restriction: no


    References listed on IDEAS

    1. Vincent Larivière & Chaoqun Ni & Yves Gingras & Blaise Cronin & Cassidy R. Sugimoto, 2013. "Bibliometrics: Global gender disparities in science," Nature, Nature, vol. 504(7479), pages 211-213, December.
    2. Baum, Matthew A. & Dietrich, Bryce J. & Goldstein, Rebecca & Sen, Maya, 2019. "Estimating the Effect of Asking About Citizenship on the US Census: Results from a Randomized Controlled Trial," Working Paper Series rwp19-015, Harvard University, John F. Kennedy School of Government.
    3. Roland G. Fryer & Steven D. Levitt, 2004. "The Causes and Consequences of Distinctively Black Names," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 119(3), pages 767-805.
    4. Richard B. Freeman & Wei Huang, 2014. "Collaborating With People Like Me: Ethnic co-authorship within the US," NBER Working Papers 19905, National Bureau of Economic Research, Inc.
    5. Jinseok Kim & Jenna Kim & Jason Owen‐Smith, 2021. "Ethnicity‐based name partitioning for author name disambiguation using supervised machine learning," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 72(8), pages 979-994, August.
    6. Gerald Marschke & Allison Nunez & Bruce A. Weinberg & Huifeng Yu, 2018. "Last Place? The Intersection of Ethnicity, Gender, and Race in Biomedical Authorship," AEA Papers and Proceedings, American Economic Association, vol. 108, pages 222-227, May.
    7. Lisa Cook, 2014. "Violence and economic activity: evidence from African American patents, 1870–1940," Journal of Economic Growth, Springer, vol. 19(2), pages 221-257, June.
    8. John Brandt & Kathleen Buckingham & Cody Buntain & Will Anderson & Sabin Ray & John-Rob Pool & Natasha Ferrari, 2020. "Identifying social media user demographics and topic diversity with computational social science: a case study of a major international policy forum," Journal of Computational Social Science, Springer, vol. 3(1), pages 167-188, April.
    9. Allison L. Hopkins & James W. Jawitz & Christopher McCarty & Alex Goldman & Nandita B. Basu, 2013. "Disparities in publication patterns by gender, race and ethnicity based on a survey of a random sample of authors," Scientometrics, Springer;Akadémiai Kiadó, vol. 96(2), pages 515-534, August.

    Citations

    Cited by:

    1. Nakajima, Kazuki & Liu, Ruodan & Shudo, Kazuyuki & Masuda, Naoki, 2023. "Quantifying gender imbalance in East Asian academia: Research career and citation practice," Journal of Informetrics, Elsevier, vol. 17(4).
    2. Diego Kozlowski & Thema Monroe‐White & Vincent Larivière & Cassidy R. Sugimoto, 2024. "The Howard‐Harvard effect: Institutional reproduction of intersectional inequalities," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 75(8), pages 869-882, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Zhang, Ning & He, Guangye & Shi, Dongbo & Zhao, Zhenyue & Li, Jiang, 2022. "Does a gender-neutral name associate with the research impact of a scientist?," Journal of Informetrics, Elsevier, vol. 16(1).
    2. Mike Thelwall & Tamara Nevill, 2019. "No evidence of citation bias as a determinant of STEM gender disparities in US biochemistry, genetics and molecular biology research," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(3), pages 1793-1801, December.
    3. Vasarhelyi, Orsolya & Brooke, Siân, 2022. "Computing Gender," SocArXiv admcs, Center for Open Science.
    4. Luke Holman & Claire Morandin, 2019. "Researchers collaborate with same-gendered colleagues more often than expected across the life sciences," PLOS ONE, Public Library of Science, vol. 14(4), pages 1-19, April.
    5. Camilo Garcia-Jimeno & Sahar Parsa, 2024. "Cultural Change Through Writing Style: Gendered Pronoun Use in the Economics Profession," Working Paper Series WP 2024-23, Federal Reserve Bank of Chicago.
    6. Aleksandra Cislak & Magdalena Formanowicz & Tamar Saguy, 2018. "Bias against research on gender bias," Scientometrics, Springer;Akadémiai Kiadó, vol. 115(1), pages 189-200, April.
    7. González-Álvarez, Julio & Cervera-Crespo, Teresa, 2017. "Research production in high-impact journals of contemporary neuroscience: A gender analysis," Journal of Informetrics, Elsevier, vol. 11(1), pages 232-243.
    8. Johannes Buggle & Thierry Mayer & Seyhun Orcan Sakalli & Mathias Thoenig, 2023. "The Refugee’s Dilemma: Evidence from Jewish Migration out of Nazi Germany," The Quarterly Journal of Economics, President and Fellows of Harvard College, vol. 138(2), pages 1273-1345.
    9. Mujcic, Redzo & Frijters, Paul, 2013. "Still Not Allowed on the Bus: It Matters If You're Black or White!," IZA Discussion Papers 7300, Institute of Labor Economics (IZA).
    10. Hoekman, Jarno & Rake, Bastian, 2024. "Geography of authorship: How geography shapes authorship attribution in big team science," Research Policy, Elsevier, vol. 53(2).
    11. Kai On Wong & Osmar R Zaïane & Faith G Davis & Yutaka Yasui, 2020. "A machine learning approach to predict ethnicity using personal name and census location in Canada," PLOS ONE, Public Library of Science, vol. 15(11), pages 1-16, November.
    12. Wu, Jiang & Ou, Guiyan & Liu, Xiaohui & Dong, Ke, 2022. "How does academic education background affect top researchers’ performance? Evidence from the field of artificial intelligence," Journal of Informetrics, Elsevier, vol. 16(2).
    13. Nicolás Ajzenman & Bruno Ferman & Pedro C. Sant’Anna, 2023. "Discrimination in the Formation of Academic Networks: A Field Experiment on #EconTwitter," Working Papers 235, Red Nacional de Investigadores en Economía (RedNIE).
    14. Chaojiang Wu & Erjia Yan & Yongjun Zhu & Kai Li, 2021. "Gender imbalance in the productivity of funded projects: A study of the outputs of National Institutes of Health R01 grants," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 72(11), pages 1386-1399, November.
    15. Yann Algan & Clément Malgouyres & Thierry Mayer & Mathias Thoenig, 2022. "The Economic Incentives of Cultural Transmission: Spatial Evidence from Naming Patterns Across France," The Economic Journal, Royal Economic Society, vol. 132(642), pages 437-470.
    16. Julian Kolev & Yuly Fuentes-Medel & Fiona Murray, 2019. "Is Blinded Review Enough? How Gendered Outcomes Arise Even Under Anonymous Evaluation," NBER Working Papers 25759, National Bureau of Economic Research, Inc.
    17. Muriuki, James & Hudson, Darren & Fuad, Syed & March, Raymond J. & Lacombe, Donald J., 2023. "Spillover effect of violent conflicts on food insecurity in sub-Saharan Africa," Food Policy, Elsevier, vol. 115(C).
    18. Sorana-Alexandra Constantinescu & Maria-Henriete Pozsar, 2022. "Was This Supposed to Be on the Test? Academic Leadership, Gender and the COVID-19 Pandemic in Denmark, Hungary, Romania, and United Kingdom," Publications, MDPI, vol. 10(2), pages 1-13, April.
    19. George J. Borjas & Kirk B. Doran, 2015. "How High-Skill Immigration Affects Science: Evidence from the Collapse of the USSR," Innovation Policy and the Economy, University of Chicago Press, vol. 15(1), pages 1-25.
    20. Bindler, Anna Louisa & Hjalmarsson, Randi & Machin, Stephen Jonathan & Rubio, Melissa, 2023. "Murphy's Law or luck of the Irish? Disparate treatment of the Irish in 19th century courts," LSE Research Online Documents on Economics 121339, London School of Economics and Political Science, LSE Library.
