
A Network Analysis Model for Disambiguation of Names in Lists

Author

Listed:
  • Bradley Malin

    (School of Computer Science, Carnegie Mellon University)

  • Edoardo Airoldi

    (Carnegie Mellon University)

  • Kathleen M. Carley

    (School of Computer Science, Carnegie Mellon University)

Abstract

In research and application, social networks are increasingly extracted from relationships inferred by name collocations in text-based documents. Despite the fact that names represent real entities, names are not unique identifiers and it is often unclear when two name observations correspond to the same underlying entity. One confounder stems from ambiguity, in which the same name correctly references multiple entities. Prior name disambiguation methods measured similarity between two names as a function of their respective documents. In this paper, we propose an alternative similarity metric based on the probability of walking from one ambiguous name to another in a random walk of the social network constructed from all documents. We experimentally validate our model on actor-actor relationships derived from the Internet Movie Database. Using a global similarity threshold, we demonstrate random walks achieve a significant increase in disambiguation capability in comparison to prior models.
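The random-walk similarity described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the paper's exact formulation, so everything below is an assumption made for illustration: each occurrence of the ambiguous name is its own node, keyed by document index; all other names are treated as unambiguous; edges are weighted by co-occurrence counts; and similarity is the probability that a walk from one occurrence first reaches the other within a fixed number of steps. The names `build_network` and `walk_probability` and the toy cast lists are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_network(documents, ambiguous):
    """Build a weighted co-occurrence graph from lists of names.

    Assumption: each occurrence of the ambiguous name becomes its own node
    (name, document index); every other name is one shared node.
    """
    graph = defaultdict(lambda: defaultdict(int))
    for i, names in enumerate(documents):
        nodes = {(n, i) if n == ambiguous else n for n in names}
        for a, b in combinations(nodes, 2):
            graph[a][b] += 1
            graph[b][a] += 1
    return graph


def walk_probability(graph, start, target, steps):
    """Probability that a random walk from start first hits target within `steps` steps."""
    dist = {start: 1.0}  # probability mass of walks that have not yet hit target
    hit = 0.0
    for _ in range(steps):
        nxt = defaultdict(float)
        for node, mass in dist.items():
            total = sum(graph[node].values())
            if total == 0:
                continue  # isolated node: the walk cannot move
            for neighbor, weight in graph[node].items():
                nxt[neighbor] += mass * weight / total
        hit += nxt.pop(target, 0.0)  # absorb walks that just arrived at target
        dist = nxt
    return hit


# Toy cast lists (invented for illustration, not IMDB data).
docs = [
    ["John Smith", "Alice", "Bob"],
    ["John Smith", "Alice", "Carol"],
    ["John Smith", "Dave", "Erin"],
]
g = build_network(docs, "John Smith")
print(walk_probability(g, ("John Smith", 0), ("John Smith", 1), steps=2))
print(walk_probability(g, ("John Smith", 0), ("John Smith", 2), steps=2))
```

In this toy data the occurrences in the first two lists share a co-star, so the walk can reach one from the other, while the third occurrence is unreachable. Under the global-threshold idea mentioned in the abstract, two occurrences would be merged into one entity when this score meets a chosen cutoff; the cutoff value itself is not given in this record.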

Suggested Citation

  • Bradley Malin & Edoardo Airoldi & Kathleen M. Carley, 2005. "A Network Analysis Model for Disambiguation of Names in Lists," Computational and Mathematical Organization Theory, Springer, vol. 11(2), pages 119-139, July.
  • Handle: RePEc:spr:comaot:v:11:y:2005:i:2:d:10.1007_s10588-005-3940-3
    DOI: 10.1007/s10588-005-3940-3

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10588-005-3940-3
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Citations


    Cited by:

    1. Moradabadi, Behnaz & Meybodi, Mohammad Reza, 2017. "A novel time series link prediction method: Learning automata approach," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 482(C), pages 422-432.
    2. Jan Schulz, 2016. "Using Monte Carlo simulations to assess the impact of author name disambiguation quality on different bibliometric analyses," Scientometrics, Springer;Akadémiai Kiadó, vol. 107(3), pages 1283-1298, June.
