
Unsupervised Learning from Multi-Dimensional Data: A Fast Clustering Algorithm Utilizing Canopies and Statistical Information

Author

Listed:
  • Giyasettin Ozcan

    (Department of Computer Engineering, Uludag University, Gorukle Kampusu, Bursa 16059, Turkey)

Abstract

In this study, we consider the problem of unsupervised learning from multi-dimensional datasets. In particular, we consider k-means clustering, which requires long execution times on multi-dimensional datasets. To speed up clustering while preserving accuracy, we introduce a new algorithm that we term Canopy+. The algorithm utilizes canopies and statistical techniques. Its efficient initialization and normalization methodologies also contribute to the improvement. Furthermore, we consider early termination of the clustering computation, provided that an intermediate result of the computation is accurate enough. We compared our algorithm with four popular clustering algorithms. The results indicate that our algorithm speeds up the clustering computation by at least 2X. We also analyzed the contribution of early termination. The results show that a further 2X improvement can be obtained while incurring a 0.1% error rate. We also observe that our Canopy+ algorithm benefits from early termination, which yields an extra 1.2X performance improvement.
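The abstract names three ingredients: normalization, canopy-based initialization, and early termination of k-means. The sketch below illustrates how such pieces might fit together in general. It is not the paper's Canopy+ algorithm, whose details are not given on this page; it uses a generic canopy-style seeding with a single distance threshold and plain Lloyd iterations. All names (`normalize`, `canopy_seeds`, `kmeans`) and parameters (`t2`, `rel_tol`, `max_iter`) are assumptions made for illustration.

```python
import math


def normalize(points):
    """Min-max scale every dimension to [0, 1] (assumed normalization scheme)."""
    dims = range(len(points[0]))
    lo = [min(p[d] for p in points) for d in dims]
    hi = [max(p[d] for p in points) for d in dims]
    return [
        tuple((p[d] - lo[d]) / (hi[d] - lo[d]) if hi[d] > lo[d] else 0.0 for d in dims)
        for p in points
    ]


def canopy_seeds(points, t2):
    """Generic canopy-style seeding: take the first remaining point as a
    center, then discard every point within distance t2 of it."""
    remaining = list(points)
    seeds = []
    while remaining:
        center = remaining[0]
        seeds.append(center)
        remaining = [p for p in remaining if math.dist(p, center) > t2]
    return seeds


def kmeans(points, centers, rel_tol=1e-3, max_iter=100):
    """Lloyd iterations that stop early once the sum of squared errors (SSE)
    improves by no more than rel_tol relative to the current SSE."""
    prev_sse = math.inf
    for iteration in range(1, max_iter + 1):
        # Assignment step: attach each point to its nearest current center.
        clusters = [[] for _ in centers]
        sse = 0.0
        for p in points:
            dists = [math.dist(p, c) for c in centers]
            j = dists.index(min(dists))
            clusters[j].append(p)
            sse += dists[j] ** 2
        # Update step: move each center to the mean of its members
        # (an empty cluster keeps its previous center).
        centers = [
            tuple(sum(x) / len(members) for x in zip(*members)) if members else c
            for members, c in zip(clusters, centers)
        ]
        # Early termination: the intermediate result is deemed accurate enough.
        if prev_sse - sse <= rel_tol * sse:
            break
        prev_sse = sse
    return centers, clusters, iteration


# Hypothetical toy data: two well-separated groups in two dimensions.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (8.0, 9.0), (8.2, 9.1), (7.9, 8.8)]
scaled = normalize(data)
seeds = canopy_seeds(scaled, t2=0.5)
final_centers, final_clusters, n_iter = kmeans(scaled, seeds)
print(len(seeds), [len(c) for c in final_clusters], n_iter)
```

In this sketch the number of canopy seeds also fixes the number of clusters, and a looser `rel_tol` trades accuracy for fewer iterations, which is the kind of trade-off the abstract describes for early termination.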

Suggested Citation

  • Giyasettin Ozcan, 2018. "Unsupervised Learning from Multi-Dimensional Data: A Fast Clustering Algorithm Utilizing Canopies and Statistical Information," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 17(03), pages 841-856, May.
  • Handle: RePEc:wsi:ijitdm:v:17:y:2018:i:03:n:s0219622018500141
    DOI: 10.1142/S0219622018500141

    Download full text from publisher

    File URL: http://www.worldscientific.com/doi/abs/10.1142/S0219622018500141
    Download Restriction: Access to full text is restricted to subscribers



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Ginger Saltos & Mihaela Cocea, 2017. "An Exploration of Crime Prediction Using Data Mining on Open Data," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 16(05), pages 1155-1181, September.
    2. Rahime Ceylan & Hasan Koyuncu, 2016. "A New Breakpoint in Hybrid Particle Swarm-Neural Network Architecture: Individual Boundary Adjustment," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 15(06), pages 1313-1343, November.
    3. Jianfeng Xu & Yuanjian Zhang & Peng Zhang & Azhar Mahmood & Yu Li & Shaheen Khatoon, 2017. "Data Mining on ICU Mortality Prediction Using Early Temporal Data: A Survey," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 16(01), pages 117-159, January.
    4. Fenghua Wen & Xin Yang & Xu Gong & Kin Keung Lai, 2017. "Multi-Scale Volatility Feature Analysis and Prediction of Gold Price," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 16(01), pages 205-223, January.
    5. Małgorzata Przybyła-Kasperek, 2019. "Three Conflict Methods in Multiple Classifiers that Use Dispersed Knowledge," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 18(02), pages 555-599, March.
    6. P. D. Mahendhiran & S. Kannimuthu, 2018. "Deep Learning Techniques for Polarity Classification in Multimodal Sentiment Analysis," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 17(03), pages 883-910, May.
    7. Feyzan Arikan & Senay Citak, 2017. "Multiple Criteria Inventory Classification in an Electronics Firm," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 16(02), pages 315-331, March.
    8. O. H. Salman & A. A. Zaidan & B. B. Zaidan & Naserkalid & M. Hashim, 2017. "Novel Methodology for Triage and Prioritizing Using “Big Data” Patients with Chronic Heart Diseases Through Telemedicine Environmental," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 16(05), pages 1211-1245, September.
    9. Si He & Nabil Belacel & Alan Chan & Habib Hamam & Yassine Bouslimani, 2016. "A Hybrid Artificial Fish Swarm Simulated Annealing Optimization Algorithm for Automatic Identification of Clusters," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 15(05), pages 949-974, September.
    10. Thierno M. L. Diallo & Sébastien Henry & Yacine Ouzrout & Abdelaziz Bouras, 2018. "Data-Based Fault Diagnosis Model Using a Bayesian Causal Analysis Framework," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 17(02), pages 583-620, March.
    11. Yi Peng, 2015. "Regional earthquake vulnerability assessment using a combination of MCDM methods," Annals of Operations Research, Springer, vol. 234(1), pages 95-110, November.
    12. Chun-Hao Chen & Tzung-Pei Hong & Yeong-Chyi Lee & Vincent S. Tseng, 2015. "Finding Active Membership Functions for Genetic-Fuzzy Data Mining," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 14(06), pages 1215-1242, November.
    13. Peide Liu & Peng Wang, 2017. "Some Improved Linguistic Intuitionistic Fuzzy Aggregation Operators and Their Applications to Multiple-Attribute Decision Making," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 16(03), pages 817-850, May.
    14. N. Thillaigovindan & S. Anita Shanthi & J. Vadivel Naidu, 2016. "New Method for Solving a General Multiple Criteria Decision-Making Problem Under Risk in Fuzzy Environment," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 15(05), pages 1157-1179, September.
    15. Carmen De Maio & Aurelio Tommasetti & Orlando Troisi & Massimiliano Vesci & Giuseppe Fenza & Vincenzo Loia, 2016. "Contextual Fuzzy-Based Decision Support System Through Opinion Analysis: A Case Study at University of the Salerno," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 15(05), pages 923-948, September.
    16. Gang Kou & Wenshuai Wu, 2014. "Multi-criteria decision analysis for emergency medical service assessment," Annals of Operations Research, Springer, vol. 223(1), pages 239-254, December.
    17. Thomas L. Saaty & Daji Ergu, 2015. "When is a Decision-Making Method Trustworthy? Criteria for Evaluating Multi-Criteria Decision-Making Methods," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 14(06), pages 1171-1187, November.
    18. Juan Carlos Leyva Lopez & Jesus Jaime Solano Noriega & Diego Alonso Gastelum Chavira, 2017. "A Multi-Criteria Approach to Rank the Municipalities of the States of Mexico by its Marginalization Level: The Case of Jalisco," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 16(02), pages 473-513, March.
    19. Eduardo Fernandez & Jorge Navarro & Rafael Olmedo, 2018. "Characterization of the Effectiveness of Several Outranking-Based Multi-Criteria Sorting Methods," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 17(04), pages 1047-1084, July.
    20. Sarah Ben Amor & Fateh Belaid & Ramzi Benkraiem & Boumediene Ramdani & Khaled Guesmi, 2023. "Multi-criteria classification, sorting, and clustering: a bibliometric review and research agenda," Annals of Operations Research, Springer, vol. 325(2), pages 771-793, June.
