
Efficient List Intersection Algorithm for Short Documents by Document Reordering

Author

Listed:
  • Lianyin Jia

    (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China
    Yunnan Key Lab of Computer Technology Applications, Kunming University of Science and Technology, Kunming 650500, China)

  • Dongyang Li

    (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)

  • Haihe Zhou

    (Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, China)

  • Fengling Xia

    (Faculty of Civil Aviation and Aeronautics, Kunming University of Science and Technology, Kunming 650500, China)

Abstract

List intersection plays a pivotal role in domains such as search engines, database systems, and social networks, and efficient indexes and query strategies can significantly speed it up. Existing inverted index-based algorithms do not exploit document length information and perform more list intersections than necessary, which lowers their efficiency. To address this issue, we propose the LDRpV (Length-based Document Reordering plus Verification) algorithm. LDRpV reorders documents by length so that documents unlikely to appear in the intersection result are filtered out, reducing the number of candidates. To further reduce the number of list intersection operations, it uses an intersection and verification strategy in which only the first m lists are intersected and the resulting candidate set is verified directly. Experimental results on four real datasets show that LDRpV achieves an efficiency improvement of up to 46.69% compared to the most competitive counterparts.
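The abstract gives only a high-level description, so the following Python sketch is an illustration of the general idea, not the authors' LDRpV implementation. It rests on several assumptions that the abstract does not state: document length is taken to be the number of distinct terms, a query matches a document only if the document contains every query term, documents shorter than the query are therefore skipped, and "the first m lists" is read as the m shortest posting lists. The names `build_index`, `intersect_sorted`, and `query` are hypothetical.

```python
from bisect import bisect_left


def build_index(docs):
    """Assign doc ids in ascending order of distinct-term count (assumed
    meaning of 'length') and build an inverted index over those ids."""
    ordered = sorted((frozenset(d) for d in docs), key=len)
    lengths = [len(d) for d in ordered]  # non-decreasing by construction
    index = {}
    for doc_id, terms in enumerate(ordered):
        for term in terms:
            # ids are appended in increasing order, so each list stays sorted
            index.setdefault(term, []).append(doc_id)
    return ordered, lengths, index


def intersect_sorted(a, b):
    """Classic two-pointer intersection of two sorted id lists."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def query(ordered, lengths, index, terms, m=2):
    """Return ids of documents containing every query term.

    Only the m shortest posting lists are intersected (assumption); the
    surviving candidates are then verified against the stored documents.
    """
    q = set(terms)
    if not q or any(t not in index for t in q):
        return []
    # Length filter: a document with fewer distinct terms than the query
    # cannot contain all of them, so ids below `start` are skipped.
    start = bisect_left(lengths, len(q))
    lists = sorted((index[t] for t in q), key=len)
    cands = lists[0][bisect_left(lists[0], start):]
    for plist in lists[1:m]:
        cands = intersect_sorted(cands, plist[bisect_left(plist, start):])
    # Verification: check remaining terms directly instead of intersecting
    # the other posting lists.
    return [d for d in cands if q <= ordered[d]]
```

Because the ids are assigned in length order, the length filter reduces to one binary search per posting list. Raising `m` shrinks the candidate set before verification at the cost of more intersections; in this sketch the result is the same for any `m >= 1`.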

Suggested Citation

  • Lianyin Jia & Dongyang Li & Haihe Zhou & Fengling Xia, 2024. "Efficient List Intersection Algorithm for Short Documents by Document Reordering," Mathematics, MDPI, vol. 12(9), pages 1-14, April.
  • Handle: RePEc:gam:jmathe:v:12:y:2024:i:9:p:1328-:d:1384103

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/12/9/1328/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/12/9/1328/
    Download Restriction: no