Printed from https://ideas.repec.org/a/gam/jmathe/v7y2019i12p1229-d297051.html

A Clustering System for Dynamic Data Streams Based on Metaheuristic Optimisation

Author

Listed:
  • Jia Ming Yeoh

    (Institute of Artificial Intelligence, School of Computer Science and Informatics, De Montfort University, Leicester LE1 9BH, UK)

  • Fabio Caraffini

    (Institute of Artificial Intelligence, School of Computer Science and Informatics, De Montfort University, Leicester LE1 9BH, UK)

  • Elmina Homapour

    (Institute of Artificial Intelligence, School of Computer Science and Informatics, De Montfort University, Leicester LE1 9BH, UK)

  • Valentino Santucci

    (Department of Humanities and Social Sciences, University for Foreigners of Perugia, piazza G. Spitella 3, 06123 Perugia, Italy)

  • Alfredo Milani

    (Department of Mathematics and Computer Science, University of Perugia, via Vanvitelli 1, 06123 Perugia, Italy)

Abstract

This article presents the Optimised Stream clustering algorithm (OpStream), a novel approach to clustering dynamic data streams. The proposed system has few parameters and scales well with both the dimensionality of the data and the number of clusters in the dataset. It is based on a hybrid structure that combines deterministic clustering methods with stochastic optimisation approaches to optimally centre the clusters. Like other state-of-the-art methods in the literature, it uses “microclusters” and other established techniques, such as density-based clustering. Unlike other methods, it uses metaheuristic optimisation to maximise performance during the initialisation phase, which precedes the classic online phase. Experimental results show that OpStream outperforms the state-of-the-art methods in several cases and is always competitive with the other comparison algorithms, regardless of the chosen optimisation method. Three variants of OpStream, each using a different optimisation algorithm, are presented in this study. A thorough sensitivity analysis is performed on the best variant to demonstrate OpStream’s robustness to noise and resilience to parameter changes.
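
The idea of using stochastic optimisation to centre clusters during an initialisation phase can be illustrated with a minimal sketch. This is not the OpStream implementation: the abstract does not name the optimisers or the objective, so the sum-of-squared-distances objective, the simple stochastic hill climber standing in for a metaheuristic, and the names `sse` and `optimise_centres` are all assumptions made for illustration.

```python
import random


def sse(points, centres):
    """Sum of squared distances from each point to its nearest centre.

    Assumed objective; the abstract does not state which one OpStream uses.
    """
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centres)
    return total


def optimise_centres(points, k, iterations=2000, step=0.5, seed=0):
    """Hypothetical stand-in optimiser: a stochastic hill climber.

    Starts from k randomly sampled points, perturbs all centres with
    Gaussian noise, and keeps a candidate only if it lowers the objective.
    """
    rng = random.Random(seed)
    best = [list(p) for p in rng.sample(points, k)]
    best_cost = sse(points, best)
    for _ in range(iterations):
        candidate = [[x + rng.gauss(0.0, step) for x in c] for c in best]
        cost = sse(points, candidate)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


if __name__ == "__main__":
    # Toy initialisation batch with two well-separated groups.
    batch = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
             (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
    centres, cost = optimise_centres(batch, k=2)
    print(centres, cost)
```

In a stream setting, centres found this way on an initial batch could seed the microclusters that the online phase then maintains; how OpStream does this in detail is described in the full text, not in this sketch.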

Suggested Citation

  • Jia Ming Yeoh & Fabio Caraffini & Elmina Homapour & Valentino Santucci & Alfredo Milani, 2019. "A Clustering System for Dynamic Data Streams Based on Metaheuristic Optimisation," Mathematics, MDPI, vol. 7(12), pages 1-24, December.
  • Handle: RePEc:gam:jmathe:v:7:y:2019:i:12:p:1229-:d:297051

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/7/12/1229/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/7/12/1229/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Mohamad Alissa & Kevin Sim & Emma Hart, 2023. "Automated Algorithm Selection: from Feature-Based to Feature-Free Approaches," Journal of Heuristics, Springer, vol. 29(1), pages 1-38, February.
