IDEAS home Printed from https://ideas.repec.org/a/wsi/jikmxx/v23y2024i05ns0219649224500771.html
   My bibliography  Save this article

Library Similar Literature Screening System Research Based on LDA Topic Model

Author

Listed:
  • Liang Gao

    (Library, Suqian University, Suqian 223800, P. R. China)

  • Fang Cui

    (��Library, Inner Mongolia University, Hohhot 010021, P. R. China)

  • Chengbo Zhang

    (Library, Suqian University, Suqian 223800, P. R. China)

Abstract

Science and technology are highly inheritable undertakings, and any scientific and technological worker can make good progress without the experience and achievements of predecessors or others. In the face of an ever-expanding pool of literature, the ability to efficiently and accurately search for similar works is a major challenge in current research. This paper uses Latent Dirichlet Allocation (LDA) topic model to construct feature vectors for the title and abstract, and the bag-of-words model to construct feature vectors for publication type. The similarity between the feature vectors is measured by calculating the cosine values. The experiment demonstrated that the precision, recall and WSS95 scores of the algorithm proposed in the study were 90.55%, 98.74% and 52.45% under the literature title element, and 91.78%, 99.58% and 62.47% under the literature abstract element, respectively. Under the literature publication type element, the precision, recall and WSS95 scores of the proposed algorithm were 90.77%, 98.05% and 40.14%, respectively. Under the combination of literature title, abstract and publication type elements, the WSS95 score of the proposed algorithm was 79.03%. In summary, the study proposes a robust performance of the literature screening (LS) algorithm based on the LDA topic model, and a similar LS system designed on this basis can effectively improve the efficiency of LS.

Suggested Citation

  • Liang Gao & Fang Cui & Chengbo Zhang, 2024. "Library Similar Literature Screening System Research Based on LDA Topic Model," Journal of Information & Knowledge Management (JIKM), World Scientific Publishing Co. Pte. Ltd., vol. 23(05), pages 1-20, October.
  • Handle: RePEc:wsi:jikmxx:v:23:y:2024:i:05:n:s0219649224500771
    DOI: 10.1142/S0219649224500771
    as

    Download full text from publisher

    File URL: http://www.worldscientific.com/doi/abs/10.1142/S0219649224500771
    Download Restriction: Access to full text is restricted to subscribers

    File URL: https://libkey.io/10.1142/S0219649224500771?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    As the access to this document is restricted, you may want to search for a different version of it.

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:wsi:jikmxx:v:23:y:2024:i:05:n:s0219649224500771. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    We have no bibliographic references for this item. You can help adding them by using this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Tai Tone Lim (email available below). General contact details of provider: http://www.worldscinet.com/jikm/jikm.shtml .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.