IDEAS home Printed from https://ideas.repec.org/p/zbw/zewdip/21021.html

Disambiguation by namesake risk assessment

Author

Listed:
  • Doherr, Thorsten

Abstract

Most bibliometric databases identify individuals only by name, which makes the name the sole handle to a career and raises the problem of namesakes. We introduce a universal method to assess the risk of linking documents that belong to different individuals sharing the same name, with the goal of collecting each person's documents into a personalized cluster. A theoretical setup gives the probability of drawing a namesake as a function of the number of namesakes in the population and the size of the observed unit. This setup replaces the need for training datasets and thereby avoids the namesake bias caused by the inherent underestimation of namesakes in training and benchmark data. The main component, the number of namesakes for any given name, is estimated by a Poisson model based on a master sample of unambiguously identified individuals. To implement the algorithm, we reduce the complexity of the data by resolving similarity in properties. At the core of the implementation is a mechanism that returns the unit size of the intersected mutual properties linking two documents. Because this mechanism is computationally demanding, we also discuss means to optimize the procedure.
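
The dependence of namesake risk on the number of namesakes and on unit size can be illustrated with a minimal sketch. It is not the paper's formula: it assumes, purely for illustration, that the members of an observed unit are drawn at random without replacement from the population, so the chance of catching at least one namesake follows a hypergeometric argument. The function name `namesake_risk` and its parameters are hypothetical.

```python
from math import comb


def namesake_risk(population: int, namesakes: int, unit_size: int) -> float:
    """Probability that a unit of `unit_size` people contains at least one
    of the `namesakes` other bearers of a name, among `population` people.

    Illustrative assumption (not taken from the paper): unit members are a
    uniform random draw without replacement from the population.
    """
    if not 0 <= namesakes <= population:
        raise ValueError("namesakes must lie between 0 and population")
    if not 0 <= unit_size <= population:
        raise ValueError("unit_size must lie between 0 and population")
    # Ways to fill the unit using only non-namesakes, over all ways to fill it.
    # math.comb returns 0 when the unit is larger than the non-namesake pool.
    p_no_namesake = comb(population - namesakes, unit_size) / comb(population, unit_size)
    return 1.0 - p_no_namesake


if __name__ == "__main__":
    # The same name is riskier to link in a large unit than in a small one.
    for size in (5, 50, 500):
        print(size, namesake_risk(population=10_000, namesakes=3, unit_size=size))
```

Under this assumption the risk is zero when a name has no namesakes, grows with both the number of namesakes and the unit size, and reaches one as soon as the unit is too large to consist of non-namesakes only. With one namesake in a population of 100 and a unit of 10, the risk is about 0.1.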

Suggested Citation

  • Doherr, Thorsten, 2021. "Disambiguation by namesake risk assessment," ZEW Discussion Papers 21-021, ZEW - Leibniz Centre for European Economic Research.
  • Handle: RePEc:zbw:zewdip:21021

    Download full text from publisher

    File URL: https://www.econstor.eu/bitstream/10419/231411/1/1750558505.pdf
    Download Restriction: no

    References listed on IDEAS

    1. Benjamin F. Jones, 2009. "The Burden of Knowledge and the "Death of the Renaissance Man": Is Innovation Getting Harder?," The Review of Economic Studies, Review of Economic Studies Ltd, vol. 76(1), pages 283-317.
    2. Doherr, Thorsten, 2017. "Inventor mobility index: A method to disambiguate inventor careers," ZEW Discussion Papers 17-018, ZEW - Leibniz Centre for European Economic Research.

    Citations

    Cited by:

    1. Jordan Bisset & Dirk Czarnitzki & Thorsten Doherr, 2022. "High Skilled Mobility Under Uncertainty," Working Papers of Department of Management, Strategy and Innovation, Leuven 700195, KU Leuven, Faculty of Economics and Business (FEB), Department of Management, Strategy and Innovation, Leuven.
    2. Jordan Bisset & Dirk Czarnitzki & Thorsten Doherr, 2022. "Policy Uncertainty and Inventor Mobility," Working Papers of ECOOM - Centre for Research and Development Monitoring 700195, KU Leuven, Faculty of Economics and Business (FEB), ECOOM - Centre for Research and Development Monitoring.
    3. Doherr, Thorsten, 2023. "The SearchEngine: A holistic approach to matching," ZEW Discussion Papers 23-001, ZEW - Leibniz Centre for European Economic Research.
    4. Bisset, Jordan & Czarnitzki, Dirk & Doherr, Thorsten, 2024. "Inventor mobility under uncertainty," Research Policy, Elsevier, vol. 53(1).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ufuk Akcigit & Murat Celik & Daron Acemoglu, 2014. "Young, Restless and Creative: Openness to Disruption and Creative Innovations," 2014 Meeting Papers 377, Society for Economic Dynamics.
    2. Balland, Pierre-Alexandre & Boschma, Ron, 2022. "Do scientific capabilities in specific domains matter for technological diversification in European regions?," Research Policy, Elsevier, vol. 51(10).
    3. Laura Barbieri & Daniela Bragoli & Flavia Cortelezzi & Giovanni Marseguerra, 2015. "Public Support to Innovation Strategies," DISCE - Quaderni del Dipartimento di Scienze Economiche e Sociali dises1509, Università Cattolica del Sacro Cuore, Dipartimenti e Istituti di Scienze Economiche (DISCE).
    4. Hussinger, Katrin & Pellens, Maikel, 2019. "Guilt by association: How scientific misconduct harms prior collaborators," Research Policy, Elsevier, vol. 48(2), pages 516-530.
    5. Forman, Chris & van Zeebroeck, Nicolas, 2019. "Digital technology adoption and knowledge flows within firms: Can the Internet overcome geographic and technological distance?," Research Policy, Elsevier, vol. 48(8), pages 1-1.
    6. Naudé, Wim & Nagler, Paula, 2022. "The Ossified Economy: The Case of Germany, 1870-2020," IZA Discussion Papers 15607, Institute of Labor Economics (IZA).
    7. Alje van Dam & Koen Frenken, 2019. "Variety, Complexity and Economic Development," Papers 1903.07997, arXiv.org.
    8. Yusuke Oh & Koji Takahashi, 2020. "R&D and Innovation: Evidence from Patent Data," Bank of Japan Working Paper Series 20-E-7, Bank of Japan.
    9. Singh, Anuraag & Triulzi, Giorgio & Magee, Christopher L., 2021. "Technological improvement rate predictions for all technologies: Use of patent data and an extended domain description," Research Policy, Elsevier, vol. 50(9).
    10. Carillo, Maria Rosaria & Papagni, Erasmo & Sapio, Alessandro, 2013. "Do collaborations enhance the high-quality output of scientific institutions? Evidence from the Italian Research Assessment Exercise," Journal of Behavioral and Experimental Economics (formerly The Journal of Socio-Economics), Elsevier, vol. 47(C), pages 25-36.
    11. David Grosse Kathoefer & Jens Leker, 2012. "Knowledge transfer in academia: an exploratory study on the Not-Invented-Here Syndrome," The Journal of Technology Transfer, Springer, vol. 37(5), pages 658-675, October.
    12. Marcus Berliant & Masahisa Fujita, 2011. "The Dynamics of Knowledge Diversity and Economic Growth," Southern Economic Journal, John Wiley & Sons, vol. 77(4), pages 856-884, April.
    13. Laurent R. Bergé, 2017. "Network proximity in the geography of research collaboration," Papers in Regional Science, Wiley Blackwell, vol. 96(4), pages 785-815, November.
    14. Balland, Pierre-Alexandre & Broekel, Tom & Diodato, Dario & Giuliani, Elisa & Hausmann, Ricardo & O'Clery, Neave & Rigby, David, 2022. "Reprint of The new paradigm of economic complexity," Research Policy, Elsevier, vol. 51(8).
    15. Nicholas Bloom & Charles I. Jones & John Van Reenen & Michael Webb, 2020. "Are Ideas Getting Harder to Find?," American Economic Review, American Economic Association, vol. 110(4), pages 1104-1144, April.
    16. Stefano Bianchini & Moritz Müller & Pierre Pelletier, 2022. "Artificial intelligence in science: An emerging general method of invention," Post-Print hal-03958025, HAL.
    17. Daniel S. Hamermesh, 2013. "Six Decades of Top Economics Publishing: Who and How?," Journal of Economic Literature, American Economic Association, vol. 51(1), pages 162-172, March.
    18. Deyun Yin & Kazuyuki Motohashi & Jianwei Dang, 2020. "Large-scale name disambiguation of Chinese patent inventors (1985–2016)," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(2), pages 765-790, February.
    19. Ajay Bhaskarbhatla & Luis Cabral & Deepak Hegde & Thomas (T.L.P.R.) Peeters, 2017. "Human Capital, Firm Capabilities, and Innovation," Tinbergen Institute Discussion Papers 17-115/VII, Tinbergen Institute, revised 03 Mar 2020.
    20. Haneda, Shoko & Ito, Keiko, 2018. "Organizational and human resource management and innovation: Which management practices are linked to product and/or process innovation?," Research Policy, Elsevier, vol. 47(1), pages 194-208.

    More about this item

    Keywords

    homonymy; namesakes; disambiguation; scientific careers; inventors; patents; publications;

    JEL classification:

    • C18 - Mathematical and Quantitative Methods - - Econometric and Statistical Methods and Methodology: General - - - Methodological Issues: General
    • C36 - Mathematical and Quantitative Methods - - Multiple or Simultaneous Equation Models; Multiple Variables - - - Instrumental Variables (IV) Estimation
