Author
Listed:
- Paul J Albert
- Sarbajit Dutta
- Jie Lin
- Zimeng Zhu
- Michael Bales
- Stephen B Johnson
- Mohammad Mansour
- Drew Wright
- Terrie R Wheeler
- Curtis L Cole
Abstract
Academic institutions need to maintain publication lists for thousands of faculty and other scholars. Automated tools are essential to minimize the need for direct feedback from the scholars themselves, who are practically unable to commit the effort necessary to keep the data accurate. In relying exclusively on clustering techniques, author disambiguation applications fail to satisfy key use cases of academic institutions. Algorithms can perfectly group together a set of publications authored by a common individual, but, for them to be useful to an academic institution, they need to programmatically and recurrently map articles to thousands of scholars of interest en masse. Consistent with a savvy librarian’s approach for generating a scholar’s list of publications, identity-driven authorship prediction is the process of using information about a scholar to quantify the likelihood that the person wrote certain articles. ReCiter is an application that attempts to do exactly that. ReCiter uses institutionally maintained identity data, such as name of department and year of terminal degree, to predict which articles a given scholar has authored. To compute the overall score for a given candidate article from PubMed (and, optionally, Scopus), ReCiter uses: up to 12 types of commonly available identity data; whether other members of a cluster have been accepted or rejected by a user; and the average score of a cluster. In addition, ReCiter provides scoring and qualitative evidence supporting why particular articles are suggested. This context and confidence scoring allows curators to provide feedback on behalf of scholars more accurately. To help users curate publication lists more efficiently, we used a support vector machine analysis to optimize the scoring of the ReCiter algorithm. In our analysis of a diverse test group of 500 scholars at a private academic medical center, ReCiter correctly predicted 98% of their publications in PubMed.
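The three scoring inputs named in the abstract (identity evidence, user feedback on other cluster members, and the cluster's average score) can be illustrated with a minimal sketch. This is not ReCiter's implementation: the evidence types, weights, the additive way the terms are combined, and all names (`EVIDENCE_WEIGHTS`, `Article`, `overall_scores`) are assumptions chosen only to make the idea concrete.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Optional

# Hypothetical weights for a few identity evidence types (illustrative values,
# not the SVM-optimized weights described in the paper).
EVIDENCE_WEIGHTS: Dict[str, float] = {
    "name": 3.0,
    "department": 1.5,
    "degree_year": 1.0,
    "email": 2.0,
}


@dataclass
class Article:
    pmid: str
    cluster_id: int
    evidence: Dict[str, float]      # evidence type -> match strength in [0, 1]
    feedback: Optional[str] = None  # "accepted", "rejected", or None


def identity_score(article: Article) -> float:
    """Weighted sum of the identity evidence that matched this article."""
    return sum(EVIDENCE_WEIGHTS.get(kind, 0.0) * strength
               for kind, strength in article.evidence.items())


def feedback_adjustment(article: Article, articles: List[Article],
                        bonus: float = 1.0, penalty: float = 1.0) -> float:
    """Raise or lower the score based on curated feedback for cluster peers."""
    peers = [a for a in articles
             if a.cluster_id == article.cluster_id and a.pmid != article.pmid]
    accepted = sum(a.feedback == "accepted" for a in peers)
    rejected = sum(a.feedback == "rejected" for a in peers)
    return bonus * accepted - penalty * rejected


def overall_scores(articles: List[Article],
                   cluster_weight: float = 0.5) -> Dict[str, float]:
    """Combine identity evidence, peer feedback, and the cluster average."""
    base = {a.pmid: identity_score(a) for a in articles}
    cluster_avg = {
        cid: mean(base[a.pmid] for a in articles if a.cluster_id == cid)
        for cid in {a.cluster_id for a in articles}
    }
    return {
        a.pmid: base[a.pmid]
        + feedback_adjustment(a, articles)
        + cluster_weight * cluster_avg[a.cluster_id]
        for a in articles
    }


if __name__ == "__main__":
    candidates = [
        Article("a1", 1, {"name": 1.0, "department": 1.0}, feedback="accepted"),
        Article("a2", 1, {"name": 1.0}),
        Article("a3", 2, {"name": 0.5}),
    ]
    for pmid, score in sorted(overall_scores(candidates).items()):
        print(pmid, score)
```

In this toy example, the uncurated article `a2` gains score because it shares a cluster with an accepted article, which is the kind of feedback propagation the abstract describes. A real system would presumably calibrate the weights against curated data rather than fix them by hand.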
Suggested Citation
Paul J Albert & Sarbajit Dutta & Jie Lin & Zimeng Zhu & Michael Bales & Stephen B Johnson & Mohammad Mansour & Drew Wright & Terrie R Wheeler & Curtis L Cole, 2021.
"ReCiter: An open source, identity-driven, authorship prediction algorithm optimized for academic institutions,"
PLOS ONE, Public Library of Science, vol. 16(4), pages 1-27, April.
Handle:
RePEc:plo:pone00:0244641
DOI: 10.1371/journal.pone.0244641