
Cluster ensemble selection and consensus clustering: A multi-objective optimization approach

Author

Listed:
  • Aktaş, Dilay
  • Lokman, Banu
  • İnkaya, Tülin
  • Dejaegere, Gilles

Abstract

Cluster ensembles have emerged as a powerful tool to obtain clusters of data points by combining a library of clustering solutions into a consensus solution. In this paper, we address the cluster ensemble selection problem and design a multi-objective optimization-based solution framework to produce consensus solutions. Given a library of clustering solutions, we first design a preprocessing procedure that measures the agreement of each clustering solution with the other solutions and eliminates the ones that may mislead the process. We then develop a multi-objective optimization algorithm that selects representative clustering solutions from the preprocessed library with respect to size, coverage, and diversity criteria and combines them into a single consensus solution, for which the true number of clusters is assumed to be unknown. We conduct experiments on different benchmark data sets. The results show that our approach yields more accurate consensus solutions compared to full-ensemble and the existing approaches for most data sets. We also present an application on the customer segmentation problem, where our approach is used to segment customers and to find a consensus solution for each segment, simultaneously.
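The preprocessing idea described in the abstract (score each clustering solution by its agreement with the others, then drop the ones that agree poorly) can be illustrated with a minimal sketch. The abstract does not state which agreement measure or elimination rule the authors use, so the pairwise Rand index, the mean-agreement score, the fixed `threshold`, and all function names below are assumptions made only for illustration.

```python
from itertools import combinations


def rand_index(a, b):
    """Fraction of point pairs that both labelings treat the same way
    (both put the pair together, or both keep it apart)."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def mean_agreement(library):
    """Average Rand index of each solution with every other solution."""
    scores = []
    for k, sol in enumerate(library):
        others = [rand_index(sol, other)
                  for m, other in enumerate(library) if m != k]
        scores.append(sum(others) / len(others))
    return scores


def preprocess(library, threshold):
    """Keep solutions whose mean agreement reaches the threshold.
    The threshold rule is a hypothetical stand-in, not the paper's rule."""
    scores = mean_agreement(library)
    return [sol for sol, s in zip(library, scores) if s >= threshold]


# Toy library: four clusterings of six data points.
library = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],  # same partition as the first, labels permuted
    [0, 0, 1, 1, 1, 1],  # close to the first two
    [0, 1, 0, 1, 0, 1],  # disagrees with the rest
]

scores = mean_agreement(library)
kept = preprocess(library, threshold=0.5)
```

The Rand index depends only on which points share a cluster, not on label names, so the first two solutions count as identical. The selection step (size, coverage, diversity) and the consensus step are not sketched here because the abstract gives no detail on how they are formulated.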

Suggested Citation

  • Aktaş, Dilay & Lokman, Banu & İnkaya, Tülin & Dejaegere, Gilles, 2024. "Cluster ensemble selection and consensus clustering: A multi-objective optimization approach," European Journal of Operational Research, Elsevier, vol. 314(3), pages 1065-1077.
  • Handle: RePEc:eee:ejores:v:314:y:2024:i:3:p:1065-1077
    DOI: 10.1016/j.ejor.2023.10.029

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377221723008044
    Download Restriction: Full text for ScienceDirect subscribers only




    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jui-Sheng Chou & Dinh-Nhat Truong & Chih-Fong Tsai, 2021. "Solving Regression Problems with Intelligent Machine Learner for Engineering Informatics," Mathematics, MDPI, vol. 9(6), pages 1-25, March.
    2. Karmitsa, Napsu & Bagirov, Adil M. & Taheri, Sona, 2017. "New diagonal bundle method for clustering problems in large data sets," European Journal of Operational Research, Elsevier, vol. 263(2), pages 367-379.
    3. Sevvandi Kandanaarachchi & Mario A Munoz & Rob J Hyndman & Kate Smith-Miles, 2018. "On normalization and algorithm selection for unsupervised outlier detection," Monash Econometrics and Business Statistics Working Papers 16/18, Monash University, Department of Econometrics and Business Statistics.
    4. Kamran Zolfi, 2023. "Gold rush optimizer: A new population-based metaheuristic algorithm," Operations Research and Decisions, Wroclaw University of Science and Technology, Faculty of Management, vol. 33(1), pages 113-150.
    5. Y.C. Ho & D.L. Pepyne, 2002. "Simple Explanation of the No-Free-Lunch Theorem and Its Implications," Journal of Optimization Theory and Applications, Springer, vol. 115(3), pages 549-570, December.
    6. Murtadha Al-Kaabi & Virgil Dumbrava & Mircea Eremia, 2022. "A Slime Mould Algorithm Programming for Solving Single and Multi-Objective Optimal Power Flow Problems with Pareto Front Approach: A Case Study of the Iraqi Super Grid High Voltage," Energies, MDPI, vol. 15(20), pages 1-33, October.
    7. Abdel-Rahman Hedar & Emad Mabrouk & Masao Fukushima, 2011. "Tabu Programming: A New Problem Solver Through Adaptive Memory Programming Over Tree Data Structures," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 10(02), pages 373-406.
    8. Muangkote, Nipotepat & Sunat, Khamron & Chiewchanwattana, Sirapat & Kaiwinit, Sirilak, 2019. "An advanced onlooker-ranking-based adaptive differential evolution to extract the parameters of solar cell models," Renewable Energy, Elsevier, vol. 134(C), pages 1129-1147.
    9. Sharifian, Yeganeh & Abdi, Hamdi, 2023. "Solving multi-area economic dispatch problem using hybrid exchange market algorithm with grasshopper optimization algorithm," Energy, Elsevier, vol. 267(C).
    10. Díaz–Pachón, Daniel Andrés & Sáenz, Juan Pablo & Rao, J. Sunil, 2020. "Hypothesis testing with active information," Statistics & Probability Letters, Elsevier, vol. 161(C).
    11. Rota Bulò, Samuel & Pelillo, Marcello, 2017. "Dominant-set clustering: A review," European Journal of Operational Research, Elsevier, vol. 262(1), pages 1-13.
    12. Yi Peng & Gang Kou & Guoxun Wang & Honggang Wang & Franz I. S. Ko, 2009. "Empirical Evaluation Of Classifiers For Software Risk Management," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 8(04), pages 749-767.
    13. Zhang, Xueying & Li, Ruixian & Zhang, Bo & Yang, Yunxiang & Guo, Jing & Ji, Xiang, 2019. "An instance-based learning recommendation algorithm of imbalance handling methods," Applied Mathematics and Computation, Elsevier, vol. 351(C), pages 204-218.
    14. Peter F. Stadler & Günter P. Wagner, 1996. "The Algebraic Theory of Recombination Spaces," Working Papers 96-07-046, Santa Fe Institute.
    15. L. Ingber, 1996. "Adaptive simulated annealing (ASA): Lessons learned," Lester Ingber Papers 96as, Lester Ingber.
    16. Christopher Ifeanyi Eke & Azah Anir Norman & Liyana Shuib, 2021. "Multi-feature fusion framework for sarcasm identification on twitter data: A machine learning based approach," PLOS ONE, Public Library of Science, vol. 16(6), pages 1-32, June.
    17. Radek Hrebik & Jaromir Kukal & Josef Jablonsky, 2019. "Optimal unions of hidden classes," Central European Journal of Operations Research, Springer;Slovak Society for Operations Research;Hungarian Operational Research Society;Czech Society for Operations Research;Österr. Gesellschaft für Operations Research (ÖGOR);Slovenian Society Informatika - Section for Operational Research;Croatian Operational Research Society, vol. 27(1), pages 161-177, March.
    18. Chen, Yi-Ting & Sun, Edward W. & Lin, Yi-Bing, 2020. "Merging anomalous data usage in wireless mobile telecommunications: Business analytics with a strategy-focused data-driven approach for sustainability," European Journal of Operational Research, Elsevier, vol. 281(3), pages 687-705.
    19. Gambella, Claudio & Ghaddar, Bissan & Naoum-Sawaya, Joe, 2021. "Optimization problems for machine learning: A survey," European Journal of Operational Research, Elsevier, vol. 290(3), pages 807-828.
    20. William G. Macready & David H. Wolpert, 1995. "What Makes an Optimization Problem Hard?," Working Papers 95-05-046, Santa Fe Institute.
