IDEAS home Printed from https://ideas.repec.org/p/zbw/zewdip/21021.html

Disambiguation by namesake risk assessment

Author

Listed:
  • Doherr, Thorsten

Abstract

Most bibliometric databases provide only names as handles to individual careers, which raises the problem of namesakes. We introduce a universal method to assess the risk of linking documents of different individuals who share the same name, with the goal of collecting the documents into personalized clusters. A theoretical setup for the probability of drawing a namesake, which depends on the number of namesakes in the population and the size of the observed unit, removes the need for training datasets and thereby avoids the namesake bias caused by the inherent underestimation of namesakes in training and benchmark data. A Poisson model based on a master sample of unambiguously identified individuals estimates the main component, the number of namesakes for any given name. To implement the algorithm, we reduce the complexity in the data by resolving similarity in properties. At the core of the implementation is a mechanism that returns the unit size of the intersected mutual properties linking two documents. Because this mechanism is computationally demanding, we also discuss ways to optimize the procedure.
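
As an illustration only, the two ingredients named in the abstract can be sketched in Python. The sketch assumes a simple hypergeometric draw for the namesake risk and a hypothetical property index for the unit size. Neither the formulas nor the names (namesake_risk, mutual_unit_size, index) are taken from the paper, whose actual model may differ.

```python
from math import comb
from typing import Dict, Optional, Set


def namesake_risk(population: int, namesakes: int, unit_size: int) -> float:
    """Probability that a unit of `unit_size` people, drawn without
    replacement from `population`, contains at least one of `namesakes`
    other bearers of the name.

    ASSUMPTION: a plain hypergeometric draw, not the paper's own model.
    """
    if not (0 <= namesakes <= population and 0 <= unit_size <= population):
        raise ValueError("namesakes and unit_size must lie in [0, population]")
    # P(no namesake in the unit) = C(N - k, n) / C(N, n)
    no_namesake = comb(population - namesakes, unit_size) / comb(population, unit_size)
    return 1.0 - no_namesake


def mutual_unit_size(
    props_a: Set[str], props_b: Set[str], index: Dict[str, Set[int]]
) -> Optional[int]:
    """Number of people who carry every property shared by two documents.

    `index` is a HYPOTHETICAL lookup from a property value (e.g. a city or
    a technology class) to the ids of all people known to have it.
    Returns None when the documents share no property at all.
    """
    shared = props_a & props_b
    if not shared:
        return None
    # Intersect the member sets of all shared properties.
    members = set.intersection(*(index.get(p, set()) for p in shared))
    return len(members)
```

Under these assumptions, a name with one namesake in a population of 1,000 and a mutual unit of 100 people carries a risk of about 0.1. The smaller the unit implied by the mutual properties, the lower the risk of linking two different people.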

Suggested Citation

  • Doherr, Thorsten, 2021. "Disambiguation by namesake risk assessment," ZEW Discussion Papers 21-021, ZEW - Leibniz Centre for European Economic Research.
  • Handle: RePEc:zbw:zewdip:21021

    Download full text from publisher

    File URL: https://www.econstor.eu/bitstream/10419/231411/1/1750558505.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Benjamin F. Jones, 2009. "The Burden of Knowledge and the "Death of the Renaissance Man": Is Innovation Getting Harder?," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 76(1), pages 283-317.
    2. Doherr, Thorsten, 2017. "Inventor mobility index: A method to disambiguate inventor careers," ZEW Discussion Papers 17-018, ZEW - Leibniz Centre for European Economic Research.

    Citations


    Cited by:

    1. Jordan Bisset & Dirk Czarnitzki & Thorsten Doherr, 2022. "High Skilled Mobility Under Uncertainty," Working Papers of Department of Management, Strategy and Innovation, Leuven 700195, KU Leuven, Faculty of Economics and Business (FEB), Department of Management, Strategy and Innovation, Leuven.
    2. Jordan Bisset & Dirk Czarnitzki & Thorsten Doherr, 2022. "Policy Uncertainty and Inventor Mobility," Working Papers of ECOOM - Centre for Research and Development Monitoring 700195, KU Leuven, Faculty of Economics and Business (FEB), ECOOM - Centre for Research and Development Monitoring.
    3. Doherr, Thorsten, 2023. "The SearchEngine: A holistic approach to matching," ZEW Discussion Papers 23-001, ZEW - Leibniz Centre for European Economic Research.
    4. Bisset, Jordan & Czarnitzki, Dirk & Doherr, Thorsten, 2024. "Inventor mobility under uncertainty," Research Policy, Elsevier, vol. 53(1).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Laura Barbieri & Daniela Bragoli & Flavia Cortelezzi & Giovanni Marseguerra, 2015. "Public Support to Innovation Strategies," DISCE - Quaderni del Dipartimento di Scienze Economiche e Sociali dises1509, Università Cattolica del Sacro Cuore, Dipartimenti e Istituti di Scienze Economiche (DISCE).
    2. Hussinger, Katrin & Pellens, Maikel, 2019. "Guilt by association: How scientific misconduct harms prior collaborators," Research Policy, Elsevier, vol. 48(2), pages 516-530.
    3. Naudé, Wim & Nagler, Paula, 2022. "The Ossified Economy: The Case of Germany, 1870-2020," IZA Discussion Papers 15607, Institute of Labor Economics (IZA).
    4. Alje van Dam & Koen Frenken, 2019. "Variety, Complexity and Economic Development," Papers 1903.07997, arXiv.org.
    5. Singh, Anuraag & Triulzi, Giorgio & Magee, Christopher L., 2021. "Technological improvement rate predictions for all technologies: Use of patent data and an extended domain description," Research Policy, Elsevier, vol. 50(9).
    6. David Grosse Kathoefer & Jens Leker, 2012. "Knowledge transfer in academia: an exploratory study on the Not-Invented-Here Syndrome," The Journal of Technology Transfer, Springer, vol. 37(5), pages 658-675, October.
    7. Laurent R. Bergé, 2017. "Network proximity in the geography of research collaboration," Papers in Regional Science, Wiley Blackwell, vol. 96(4), pages 785-815, November.
    8. Balland, Pierre-Alexandre & Broekel, Tom & Diodato, Dario & Giuliani, Elisa & Hausmann, Ricardo & O'Clery, Neave & Rigby, David, 2022. "Reprint of The new paradigm of economic complexity," Research Policy, Elsevier, vol. 51(8).
    9. Deyun Yin & Kazuyuki Motohashi & Jianwei Dang, 2020. "Large-scale name disambiguation of Chinese patent inventors (1985–2016)," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(2), pages 765-790, February.
    10. Ajay Bhaskarbhatla & Luis Cabral & Deepak Hegde & Thomas (T.L.P.R.) Peeters, 2017. "Human Capital, Firm Capabilities, and Innovation," Tinbergen Institute Discussion Papers 17-115/VII, Tinbergen Institute, revised 03 Mar 2020.
    11. Michele Pezzoni & Fabiana Visentin, 2024. "Gender bias in team formation: the case of the European Science Foundation’s grants," Science and Public Policy, Oxford University Press, vol. 51(2), pages 247-260.
    12. Nicholas Bloom & Philip Bunn & Paul Mizen & Pawel Smietanka & Gregory Thwaites, 2020. "The Impact of Covid-19 on Productivity," NBER Working Papers 28233, National Bureau of Economic Research, Inc.
    13. Hager, Sebastian & Schwarz, Carlo & Waldinger, Fabian, 2023. "Measuring Science: Performance Metrics and the Allocation of Talent," CEPR Discussion Papers 18248, C.E.P.R. Discussion Papers.
    14. Jeffrey L. Furman & Florenta Teodoridis, 2020. "Automation, Research Technology, and Researchers’ Trajectories: Evidence from Computer Science and Electrical Engineering," Organization Science, INFORMS, vol. 31(2), pages 330-354, March.
    15. Manuel Trajtenberg & Gil Shiff & Ran Melamed, 2009. "The "Names Game": Harnessing Inventors' Patent Data for Economic Research," Annals of Economics and Statistics, GENES, issue 93-94, pages 67-77.
    16. Carolin Haeussler & Henry Sauermann, 2016. "The Division of Labor in Teams: A Conceptual Framework and Application to Collaborations in Science," NBER Working Papers 22241, National Bureau of Economic Research, Inc.
    17. William Latham & Christian Le Bas & Dmitry Volodin, 2012. "Mobility, Productivity and Patent Value for Asian Prolific Inventors : China, Japan, Korea and Taiwan, 1975 - 2010," Working Papers halshs-00734980, HAL.
    18. Kim, Jinyoung, 2017. "Racing against Time in Research: A Study of the 1995 U.S. Patent Law Amendment," IZA Discussion Papers 10815, Institute of Labor Economics (IZA).
    19. Jinyoung Kim & Kanghyock Koh, 2023. "Jack of fewer trades: Evolution of specialization in research," Canadian Journal of Economics/Revue canadienne d'économique, John Wiley & Sons, vol. 56(2), pages 423-452, May.
    20. Chung-Souk Han, 2011. "On the demographical changes of U.S. research doctorate awardees and corresponding trends in research fields," Scientometrics, Springer;Akadémiai Kiadó, vol. 89(3), pages 845-865, December.

    More about this item

    Keywords

    homonymy; namesakes; disambiguation; scientific careers; inventors; patents; publications;

    JEL classification:

    • C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General
    • C36 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Instrumental Variables (IV) Estimation
