Printed from https://ideas.repec.org/a/gam/jmathe/v9y2021i16p1929-d613680.html

Hybrid Fruit-Fly Optimization Algorithm with K-Means for Text Document Clustering

Author

Listed:
  • Timea Bezdan

    (Faculty of Informatics and Computing, Singidunum University, Danijelova 32, 11010 Belgrade, Serbia)

  • Catalin Stoean

    (Human Language Technology Research Center, University of Bucharest, 010014 Bucharest, Romania)

  • Ahmed Al Naamany

    (Department for Mathematics and Computer Science, Modern College of Business and Science, Muscat 113, Oman)

  • Nebojsa Bacanin

    (Faculty of Informatics and Computing, Singidunum University, Danijelova 32, 11010 Belgrade, Serbia)

  • Tarik A. Rashid

    (Computer Science and Engineering Department, University of Kurdistan Hewler, Erbil 44001, Iraq)

  • Miodrag Zivkovic

    (Faculty of Informatics and Computing, Singidunum University, Danijelova 32, 11010 Belgrade, Serbia)

  • K. Venkatachalam

    (Department of Computer Science and Engineering, CHRIST (Deemed to be University), Bangalore 560029, India)

Abstract

The fast-growing Internet produces massive amounts of text data. Because this data is both voluminous and unstructured, extracting and analyzing the relevant information is very challenging. Text document clustering is a text-mining process that partitions a set of text documents into mutually exclusive clusters, so that documents within the same cluster are similar to each other while documents from different clusters differ in content. One of the biggest challenges in text clustering is partitioning the collection by measuring the relevance of the content in the documents. To address this issue, this work proposes a hybrid swarm intelligence algorithm combined with the K-means algorithm for text clustering. First, the hybrid fruit-fly optimization algorithm is tested on ten unconstrained CEC2019 benchmark functions. Next, the proposed method is evaluated on six standard benchmark text datasets. The experimental evaluation, on both the unconstrained functions and the text documents, indicates that the proposed approach is robust and outperforms other state-of-the-art methods.
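
The abstract does not spell out how the swarm search and K-means are combined, so the following is only a rough illustration of the general idea, not the authors' algorithm. It assumes a simplified fruit-fly-style search in which each "fly" randomly perturbs the current best set of centroids, one K-means (Lloyd) step refines the candidate, and the swarm keeps the candidate with the lowest within-cluster squared distance. All function names, parameters (flies, iters, step) and the plain term-frequency representation are hypothetical choices made for this sketch.

```python
import math
import random
from collections import Counter

def tf_vectors(docs):
    """Turn each document into an L2-normalised term-frequency vector."""
    vocab = sorted({w for d in docs for w in d.lower().split()})
    vectors = []
    for d in docs:
        counts = Counter(d.lower().split())
        v = [float(counts[w]) for w in vocab]
        norm = math.sqrt(sum(x * x for x in v)) or 1.0
        vectors.append([x / norm for x in v])
    return vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def cost(points, centroids):
    """Sum of squared distances to the nearest centroid (lower is better)."""
    return sum(min(sq_dist(p, c) for c in centroids) for p in points)

def kmeans_step(points, centroids):
    """One Lloyd iteration; an empty cluster keeps its old centroid."""
    labels = assign(points, centroids)
    updated = []
    for j, old in enumerate(centroids):
        members = [p for p, lab in zip(points, labels) if lab == j]
        if members:
            updated.append([sum(col) / len(members) for col in zip(*members)])
        else:
            updated.append(old)
    return updated

def hybrid_cluster(points, k, flies=10, iters=20, step=0.1, seed=0):
    """Assumed hybrid: random fruit-fly-style search plus K-means refinement."""
    rng = random.Random(seed)
    best = rng.sample(points, k)      # swarm location = k initial centroids
    best_cost = cost(points, best)
    for _ in range(iters):
        for _ in range(flies):
            # Smell-like phase: a fly explores randomly around the swarm location.
            cand = [[x + rng.uniform(-step, step) for x in c] for c in best]
            # Local refinement of the candidate by one K-means step.
            cand = kmeans_step(points, cand)
            cand_cost = cost(points, cand)
            # Vision-like phase: the swarm moves only to a strictly better location.
            if cand_cost < best_cost:
                best, best_cost = cand, cand_cost
    return assign(points, best), best_cost

if __name__ == "__main__":
    docs = ["cat dog pet", "dog cat animal pet",
            "stock market trade", "market trade stock price"]
    labels, final_cost = hybrid_cluster(tf_vectors(docs), k=2)
    print(labels, round(final_cost, 3))
```

Because a candidate is accepted only when it lowers the cost, the returned cost can never exceed that of the initial random centroids. The random perturbation is what gives the search a chance to leave a poor K-means local optimum, which is the usual motivation for pairing a metaheuristic with K-means.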

Suggested Citation

  • Timea Bezdan & Catalin Stoean & Ahmed Al Naamany & Nebojsa Bacanin & Tarik A. Rashid & Miodrag Zivkovic & K. Venkatachalam, 2021. "Hybrid Fruit-Fly Optimization Algorithm with K-Means for Text Document Clustering," Mathematics, MDPI, vol. 9(16), pages 1-19, August.
  • Handle: RePEc:gam:jmathe:v:9:y:2021:i:16:p:1929-:d:613680

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/9/16/1929/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/9/16/1929/
    Download Restriction: no

    References listed on IDEAS

    1. Andrea Lodi & Silvano Martello & Daniele Vigo, 1999. "Heuristic and Metaheuristic Approaches for a Class of Two-Dimensional Bin Packing Problems," INFORMS Journal on Computing, INFORMS, vol. 11(4), pages 345-357, November.

    Citations

    Citations are extracted by the CitEc Project.


    Cited by:

    1. Mohammed Azmi Al-Betar & Ammar Kamal Abasi & Ghazi Al-Naymat & Kamran Arshad & Sharif Naser Makhadmeh, 2023. "Optimization of scientific publications clustering with ensemble approach for topic extraction," Scientometrics, Springer;Akadémiai Kiadó, vol. 128(5), pages 2819-2877, May.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Francisco Trespalacios & Ignacio E. Grossmann, 2017. "Symmetry breaking for generalized disjunctive programming formulation of the strip packing problem," Annals of Operations Research, Springer, vol. 258(2), pages 747-759, November.
    2. Schmid, Verena & Doerner, Karl F. & Laporte, Gilbert, 2013. "Rich routing problems arising in supply chain management," European Journal of Operational Research, Elsevier, vol. 224(3), pages 435-448.
    3. Gregory S. Taylor & Yupo Chan & Ghulam Rasool, 2017. "A three-dimensional bin-packing model: exact multicriteria solution and computational complexity," Annals of Operations Research, Springer, vol. 251(1), pages 397-427, April.
    4. Bayliss, Christopher & Currie, Christine S.M. & Bennell, Julia A. & Martinez-Sykora, Antonio, 2021. "Queue-constrained packing: A vehicle ferry case study," European Journal of Operational Research, Elsevier, vol. 289(2), pages 727-741.
    5. A Ghanmi & R H A D Shaw, 2008. "Modelling and analysis of Canadian Forces strategic lift and pre-positioning options," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 59(12), pages 1591-1602, December.
    6. M. Muntazir Mehdi & Le Wang & Sean P. Willems, 2022. "Developing a Maximum Inscribed Rectangle Heuristic to Satisfy Rush Orders for Heavy Plate Steel," Interfaces, INFORMS, vol. 52(3), pages 283-294, May.
    7. Lodi, Andrea & Martello, Silvano & Vigo, Daniele, 2002. "Heuristic algorithms for the three-dimensional bin packing problem," European Journal of Operational Research, Elsevier, vol. 141(2), pages 410-420, September.
    8. Zachariadis, Emmanouil E. & Tarantilis, Christos D. & Kiranoudis, Christos T., 2009. "A Guided Tabu Search for the Vehicle Routing Problem with two-dimensional loading constraints," European Journal of Operational Research, Elsevier, vol. 195(3), pages 729-743, June.
    9. Lodi, Andrea & Martello, Silvano & Monaci, Michele, 2002. "Two-dimensional packing problems: A survey," European Journal of Operational Research, Elsevier, vol. 141(2), pages 241-252, September.
    10. Emmanouil E. Zachariadis & Christos D. Tarantilis & Chris T. Kiranoudis, 2017. "Vehicle routing strategies for pick-up and delivery service under two dimensional loading constraints," Operational Research, Springer, vol. 17(1), pages 115-143, April.
    11. Emmanouil E. Zachariadis & Christos D. Tarantilis & Chris T. Kiranoudis, 2012. "The Pallet-Packing Vehicle Routing Problem," Transportation Science, INFORMS, vol. 46(3), pages 341-358, August.
    12. Yi-Ping Cui & Yongwu Zhou & Yaodong Cui, 2017. "Triple-solution approach for the strip packing problem with two-staged patterns," Journal of Combinatorial Optimization, Springer, vol. 34(2), pages 588-604, August.
    13. G Belov & G Scheithauer & E A Mukhacheva, 2008. "One-dimensional heuristics adapted for two-dimensional rectangular strip packing," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 59(6), pages 823-832, June.
    14. Michele Monaci & Paolo Toth, 2006. "A Set-Covering-Based Heuristic Approach for Bin-Packing Problems," INFORMS Journal on Computing, INFORMS, vol. 18(1), pages 71-85, February.
    15. Oscar Dominguez & Angel Juan & Barry Barrios & Javier Faulin & Alba Agustin, 2016. "Using biased randomization for solving the two-dimensional loading vehicle routing problem with heterogeneous fleet," Annals of Operations Research, Springer, vol. 236(2), pages 383-404, January.
    16. Alexander Hübner & Fabian Schäfer & Kai N. Schaal, 2020. "Maximizing Profit via Assortment and Shelf‐Space Optimization for Two‐Dimensional Shelves," Production and Operations Management, Production and Operations Management Society, vol. 29(3), pages 547-570, March.
    17. Imahori, S. & Yagiura, M. & Ibaraki, T., 2005. "Improved local search algorithms for the rectangle packing problem with general spatial costs," European Journal of Operational Research, Elsevier, vol. 167(1), pages 48-67, November.
    18. Henriette Koch & Andreas Bortfeldt & Gerhard Wäscher, 2017. "A hybrid solution approach for the 3L-VRP with simultaneous delivery and pickups," FEMM Working Papers 170005, Otto-von-Guericke University Magdeburg, Faculty of Economics and Management.
    19. Wei, Lijun & Tian, Tian & Zhu, Wenbin & Lim, Andrew, 2014. "A block-based layer building approach for the 2D guillotine strip packing problem," European Journal of Operational Research, Elsevier, vol. 239(1), pages 58-69.
    20. Silvano Martello & Michele Monaci & Daniele Vigo, 2003. "An Exact Approach to the Strip-Packing Problem," INFORMS Journal on Computing, INFORMS, vol. 15(3), pages 310-319, August.
