Printed from https://ideas.repec.org/a/bla/jinfst/v71y2020i1p69-83.html

A Graph Combination With Edge Pruning‐Based Approach for Author Name Disambiguation

Author

Listed:
  • Pooja KM
  • Samrat Mondal
  • Joydeep Chandra

Abstract

Author name disambiguation (AND) is a challenging problem due to several issues, such as missing key identifiers, the same name corresponding to multiple authors, and inconsistent name representation. Several techniques have been proposed, but maintaining consistent accuracy levels over all data sets is still a major challenge. We identify two major issues associated with the AND problem. The first is the namesake problem, in which two or more authors with the same name publish in a similar domain. The second is the diverse topic problem, in which one author publishes in diverse topical domains with different sets of coauthors. In this work, we initially propose a method named ATGEP for AND that addresses the namesake issue. We evaluate the performance of ATGEP using various ambiguous name references collected from the Arnetminer Citation (AC) and Web of Science (WoS) data sets. We empirically show that the two aforementioned problems are crucial to addressing the AND problem and are difficult to handle using state‐of‐the‐art techniques. To handle the diverse topic issue, we extend ATGEP to a new variant named ATGEP‐web that considers external web information about the authors. Experiments show that, when enough information is available from external web sources, ATGEP‐web can significantly improve on the results of ATGEP.

Suggested Citation

  • Pooja KM & Samrat Mondal & Joydeep Chandra, 2020. "A Graph Combination With Edge Pruning‐Based Approach for Author Name Disambiguation," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 71(1), pages 69-83, January.
  • Handle: RePEc:bla:jinfst:v:71:y:2020:i:1:p:69-83
    DOI: 10.1002/asi.24212

    Download full text from publisher

    File URL: https://doi.org/10.1002/asi.24212
    Download Restriction: no


    Citations



    Cited by:

    1. Cristian Santini & Genet Asefa Gesese & Silvio Peroni & Aldo Gangemi & Harald Sack & Mehwish Alam, 2022. "A knowledge graph embeddings based approach for author name disambiguation using literals," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(8), pages 4887-4912, August.
