Printed from https://ideas.repec.org/r/bes/jnlasa/v98y2003p750-763.html

Finding the Number of Clusters in a Dataset: An Information-Theoretic Approach

Citations

Citations are extracted by the CitEc Project.

Cited by:

  1. Heilmann, Christoph & Wozabal, David, 2021. "How much smart charging is smart?," Applied Energy, Elsevier, vol. 291(C).
  2. Véronique Cariou & Stéphane Verdun & Emmanuelle Diaz & El Qannari & Evelyne Vigneau, 2009. "Comparison of three hypothesis testing approaches for the selection of the appropriate number of clusters of variables," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 3(3), pages 227-241, December.
  3. Athanasios Constantopoulos & John Yfantopoulos & Panos Xenos & Athanassios Vozikis, 2019. "Cluster shifts based on healthcare factors: The case of Greece in an OECD background 2009-2014," Advances in Management and Applied Economics, SCIENPRESS Ltd, vol. 9(6), pages 1-4.
  4. Peter Radchenko & Gourab Mukherjee, 2017. "Convex clustering via ℓ1 fusion penalization," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 79(5), pages 1527-1546, November.
  5. Lingsong Meng & Dorina Avram & George Tseng & Zhiguang Huo, 2022. "Outcome‐guided sparse K‐means for disease subtype discovery via integrating phenotypic data with high‐dimensional transcriptomic data," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 71(2), pages 352-375, March.
  6. Anton Borg & Martin Boldt, 2016. "Clustering Residential Burglaries Using Modus Operandi and Spatiotemporal Information," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 15(01), pages 23-42, January.
  7. Šárka Brodinová & Peter Filzmoser & Thomas Ortner & Christian Breiteneder & Maia Rohm, 2019. "Robust and sparse k-means clustering for high-dimensional data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(4), pages 905-932, December.
  8. Lim, Alejandro & Chiang, Chin-Tsang & Teng, Jen-Chieh, 2021. "Estimating robot strengths with application to selection of alliance members in FIRST robotics competitions," Computational Statistics & Data Analysis, Elsevier, vol. 158(C).
  9. Xu, Liao & Gao, Han & Shi, Yukun & Zhao, Yang, 2020. "The heterogeneous volume-volatility relations in the exchange-traded fund market: Evidence from China," Economic Modelling, Elsevier, vol. 85(C), pages 400-408.
  10. Li, Xuemei & Liu, Xiaoxing, 2023. "Functional classification and dynamic prediction of cumulative intraday returns in crude oil futures," Energy, Elsevier, vol. 284(C).
  11. Z. Volkovich & Z. Barzily & G.-W. Weber & D. Toledano-Kitai & R. Avros, 2012. "An application of the minimal spanning tree approach to the cluster stability problem," Central European Journal of Operations Research, Springer;Slovak Society for Operations Research;Hungarian Operational Research Society;Czech Society for Operations Research;Österr. Gesellschaft für Operations Research (ÖGOR);Slovenian Society Informatika - Section for Operational Research;Croatian Operational Research Society, vol. 20(1), pages 119-139, March.
  12. Al-Augby Salam & Majewski Sebastian & Majewska Agnieszka & Nermend Kesra, 2014. "A Comparison Of K-Means And Fuzzy C-Means Clustering Methods For A Sample Of Gulf Cooperation Council Stock Markets," Folia Oeconomica Stetinensia, Sciendo, vol. 14(2), pages 19-36, December.
  13. Ilias Petrou & Nikolaos Kyriazis & Pavlos Kassomenos, 2023. "Evaluating the Spatial and Temporal Characteristics of Summer Urban Overheating through Weather Types in the Attica Region, Greece," Sustainability, MDPI, vol. 15(13), pages 1-15, July.
  14. Yi Peng & Yong Zhang & Gang Kou & Yong Shi, 2012. "A Multicriteria Decision Making Approach for Estimating the Number of Clusters in a Data Set," PLOS ONE, Public Library of Science, vol. 7(7), pages 1-9, July.
  15. Volkovich, Vladimir & Kogan, Jacob & Nicholas, Charles, 2007. "Building initial partitions through sampling techniques," European Journal of Operational Research, Elsevier, vol. 183(3), pages 1097-1105, December.
  16. J. Fernando Vera & Rodrigo Macías, 2021. "On the Behaviour of K-Means Clustering of a Dissimilarity Matrix by Means of Full Multidimensional Scaling," Psychometrika, Springer;The Psychometric Society, vol. 86(2), pages 489-513, June.
  17. Oliver Schaer & Nikolaos Kourentzes & Robert Fildes, 2022. "Predictive competitive intelligence with prerelease online search traffic," Production and Operations Management, Production and Operations Management Society, vol. 31(10), pages 3823-3839, October.
  18. Anis Hoayek & Didier Rullière, 2024. "Assessing clustering methods using Shannon's entropy," Post-Print hal-03812055, HAL.
  19. Julian Rossbroich & Jeffrey Durieux & Tom F. Wilderjans, 2022. "Model Selection Strategies for Determining the Optimal Number of Overlapping Clusters in Additive Overlapping Partitional Clustering," Journal of Classification, Springer;The Classification Society, vol. 39(2), pages 264-301, July.
  20. Pietro Monforte & Maria Alessandra Ragusa, 2022. "Temperature Trend Analysis and Investigation on a Case of Variability Climate," Mathematics, MDPI, vol. 10(13), pages 1-13, June.
  21. Osbert C Zalay, 2020. "Blind method for discovering number of clusters in multidimensional datasets by regression on linkage hierarchies generated from random data," PLOS ONE, Public Library of Science, vol. 15(1), pages 1-28, January.
  22. Romain Banchereau & Alejandro Jordan-Villegas & Monica Ardura & Asuncion Mejias & Nicole Baldwin & Hui Xu & Elizabeth Saye & Jose Rossello-Urgell & Phuong Nguyen & Derek Blankenship & Clarence B Creec, 2012. "Host Immune Transcriptional Profiles Reflect the Variability in Clinical Disease Manifestations in Patients with Staphylococcus aureus Infections," PLOS ONE, Public Library of Science, vol. 7(4), pages 1-11, April.
  23. Hand, David J. & Krzanowski, Wojtek J., 2005. "Optimising k-means clustering results with standard software packages," Computational Statistics & Data Analysis, Elsevier, vol. 49(4), pages 969-973, June.
  24. Mark Chiang & Boris Mirkin, 2010. "Intelligent Choice of the Number of Clusters in K-Means Clustering: An Experimental Study with Different Cluster Spreads," Journal of Classification, Springer;The Classification Society, vol. 27(1), pages 3-40, March.
  25. Gaynor, Sheila & Bair, Eric, 2017. "Identification of relevant subtypes via preweighted sparse clustering," Computational Statistics & Data Analysis, Elsevier, vol. 116(C), pages 139-154.
  26. Li, Pai-Ling & Chiou, Jeng-Min, 2011. "Identifying cluster number for subspace projected functional data clustering," Computational Statistics & Data Analysis, Elsevier, vol. 55(6), pages 2090-2103, June.
  27. Fang, Yixin & Wang, Junhui, 2011. "Penalized cluster analysis with applications to family data," Computational Statistics & Data Analysis, Elsevier, vol. 55(6), pages 2128-2136, June.
  28. Marianna Mauro & Monica Giancotti & Giovanna Talarico, 2017. "Mapping the field: A bibliometric analysis of accountability literature in healthcare," MECOSAN, FrancoAngeli Editore, vol. 2017(101), pages 7-30.
  29. Yujia Li & Xiangrui Zeng & Chien‐Wei Lin & George C. Tseng, 2022. "Simultaneous estimation of cluster number and feature sparsity in high‐dimensional cluster analysis," Biometrics, The International Biometric Society, vol. 78(2), pages 574-585, June.
  30. Kyung-Duk Min & Ho-Jang Kwon & KyooSang Kim & Sun-Young Kim, 2017. "Air Pollution Monitoring Design for Epidemiological Application in a Densely Populated City," IJERPH, MDPI, vol. 14(7), pages 1-12, June.
  31. Fischer, Aurélie, 2011. "On the number of groups in clustering," Statistics & Probability Letters, Elsevier, vol. 81(12), pages 1771-1781.
  32. Iordanis Parikoglou & Grigorios Emvalomatis & Doris Läpple & Fiona Thorne & Michael Wallace, 2024. "The contribution of innovation to farm-level productivity," Journal of Productivity Analysis, Springer, vol. 62(2), pages 239-255, October.
  33. Adan Ortiz-Cordova & Bernard J. Jansen, 2012. "Classifying web search queries to identify high revenue generating customers," Journal of the Association for Information Science & Technology, Association for Information Science & Technology, vol. 63(7), pages 1426-1441, July.
  34. van Staden, Chantelle Y. & Vermeulen, Hendrik J. & Groch, Matthew, 2021. "Time-of-Use feature based clustering of spatiotemporal wind power profiles," Energy, Elsevier, vol. 236(C).
  35. Zura Kakushadze & Willie Yu, 2016. "Statistical Industry Classification," Papers 1607.04883, arXiv.org, revised Dec 2018.
  36. Koltcov, Sergei, 2018. "Application of Rényi and Tsallis entropies to topic modeling optimization," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 512(C), pages 1192-1204.
  37. Fujita, André & Takahashi, Daniel Y. & Patriota, Alexandre G., 2014. "A non-parametric method to estimate the number of clusters," Computational Statistics & Data Analysis, Elsevier, vol. 73(C), pages 27-39.
  38. Jane L. Harvill & Priya Kohli & Nalini Ravishanker, 2017. "Clustering Nonlinear, Nonstationary Time Series Using BSLEX," Methodology and Computing in Applied Probability, Springer, vol. 19(3), pages 935-955, September.
  39. Hofmeyr, David P., 2020. "Degrees of freedom and model selection for k-means clustering," Computational Statistics & Data Analysis, Elsevier, vol. 149(C).
  40. Castañeda, Gonzalo & Chávez-Juárez, Florian & Guerrero, Omar A., 2018. "How do governments determine policy priorities? Studying development strategies through spillover networks," Journal of Economic Behavior & Organization, Elsevier, vol. 154(C), pages 335-361.
  41. Irad Ben-Gal & Marcelo Bacher & Morris Amara & Erez Shmueli, 2023. "A Nonparametric Subspace Analysis Approach with Application to Anomaly Detection Ensembles," INFORMS Journal on Data Science, INFORMS, vol. 2(2), pages 99-115, October.
  42. Chattopadhyay, Asis Kumar & Mondal, Saptarshi & Chattopadhyay, Tanuka, 2013. "Independent Component Analysis for the objective classification of globular clusters of the galaxy NGC 5128," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 17-32.
  43. Kondo, Yumi & Salibian-Barrera, Matias & Zamar, Ruben, 2016. "RSKC: An R Package for a Robust and Sparse K-Means Clustering Algorithm," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 72(i05).
  44. Israel A. Almodóvar-Rivera & Rosa V. Rosario-Rosado & Cruz M. Nazario & Johan Hernández-Santiago & Farah A. Ramírez-Marrero & Maxime Nunez & Rohan Maharaj & Peter Adams & Josefa L. Martinez-Brockman &, 2022. "Development of the Anthropometric Grouping Index for the Eastern Caribbean Population Using the Eastern Caribbean Health Outcomes Research Network (ECHORN) Cohort Study Data," IJERPH, MDPI, vol. 19(16), pages 1-9, August.
  45. Jesús Miguel Jornet-Meliá & Carlos Sancho-Álvarez & Margarita Bakieva-Karimova, 2022. "Analysis of Profiles of Family Educational Situations during COVID-19 Lockdown in the Valencian Community (Spain)," Societies, MDPI, vol. 13(1), pages 1-20, December.
  46. Grace E Fox & Meng Li & Fang Zhao & Joe Z Tsien, 2017. "Distinct retrosplenial cortex cell populations and their spike dynamics during ketamine-induced unconscious state," PLOS ONE, Public Library of Science, vol. 12(10), pages 1-23, October.
  47. Fang, Yixin & Wang, Junhui, 2012. "Selection of the number of clusters via the bootstrap method," Computational Statistics & Data Analysis, Elsevier, vol. 56(3), pages 468-477.
  48. Daniel Fernández & Richard Arnold & Shirley Pledger & Ivy Liu & Roy Costilla, 2019. "Finite mixture biclustering of discrete type multivariate data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(1), pages 117-143, March.
  49. Tomislava Pavić Kramarić & Mirjana Pejić Bach & Ksenija Dumičić & Berislav Žmuk & Maja Mihelja Žaja, 2018. "Exploratory study of insurance companies in selected post-transition countries: non-hierarchical cluster analysis," Central European Journal of Operations Research, Springer;Slovak Society for Operations Research;Hungarian Operational Research Society;Czech Society for Operations Research;Österr. Gesellschaft für Operations Research (ÖGOR);Slovenian Society Informatika - Section for Operational Research;Croatian Operational Research Society, vol. 26(3), pages 783-807, September.
  50. J. Vera & Rodrigo Macías & Willem Heiser, 2013. "Cluster Differences Unfolding for Two-Way Two-Mode Preference Rating Data," Journal of Classification, Springer;The Classification Society, vol. 30(3), pages 370-396, October.
  51. Zura Kakushadze & Willie Yu, 2017. "*K-means and Cluster Models for Cancer Signatures," Papers 1703.00703, arXiv.org, revised Jul 2017.
  52. Christophe Genolini & Bruno Falissard, 2010. "KmL: k-means for longitudinal data," Computational Statistics, Springer, vol. 25(2), pages 317-328, June.
  53. Jonas M. B. Haslbeck & Dirk U. Wulff, 2020. "Estimating the number of clusters via a corrected clustering instability," Computational Statistics, Springer, vol. 35(4), pages 1879-1894, December.
  54. Andrey V. Orekhov, 2021. "Quasi-Deterministic Processes with Monotonic Trajectories and Unsupervised Machine Learning," Mathematics, MDPI, vol. 9(18), pages 1-26, September.
  55. Julien Jacques & Cristian Preda, 2014. "Functional data clustering: a survey," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 8(3), pages 231-255, September.
  56. Z. Volkovich & D. Toledano-Kitai & G.-W. Weber, 2013. "Self-learning K-means clustering: a global optimization approach," Journal of Global Optimization, Springer, vol. 56(2), pages 219-232, June.
  57. Zhou, Jian-Lan & Yu, Ze-Tai & Xiao, Ren-Bin, 2022. "A large-scale group Success Likelihood Index Method to estimate human error probabilities in the railway driving process," Reliability Engineering and System Safety, Elsevier, vol. 228(C).
  58. Gopal, Vikneswaran & Fuentes, Claudio & Casella, George, 2012. "bayesclust: An R Package for Testing and Searching for Significant Clusters," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 47(i14).
  59. Ashish Sood & Gareth M. James & Gerard J. Tellis, 2009. "Functional Regression: A New Model for Predicting Market Penetration of New Products," Marketing Science, INFORMS, vol. 28(1), pages 36-51, January-February.
  60. Mi-Kyeong Kim & Sangpil Kim & Hong-Gyoo Sohn, 2018. "Relationship between Spatio-Temporal Travel Patterns Derived from Smart-Card Data and Local Environmental Characteristics of Seoul, Korea," Sustainability, MDPI, vol. 10(3), pages 1-18, March.
  61. Philip A. White & Alan E. Gelfand, 2021. "Multivariate functional data modeling with time-varying clustering," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(3), pages 586-602, September.
  62. Jaković Božidar & Ćurlin Tamara & Miloloža Ivan, 2021. "Enterprise Digital Divide: Website e-Commerce Functionalities among European Union Enterprises," Business Systems Research, Sciendo, vol. 12(1), pages 197-215, May.
  63. Alberto-Jesus Perea-Moreno & Gerardo Alcalá & Quetzalcoatl Hernandez-Escobedo, 2019. "Seasonal Wind Energy Characterization in the Gulf of Mexico," Energies, MDPI, vol. 13(1), pages 1-21, December.
  64. J. Fernando Vera & Rodrigo Macías, 2017. "Variance-Based Cluster Selection Criteria in a K-Means Framework for One-Mode Dissimilarity Data," Psychometrika, Springer;The Psychometric Society, vol. 82(2), pages 275-294, June.
  65. Ashish Arora & Michelle Gittelman & Sarah Kaplan & John Lynch & Will Mitchell & Nicolaj Siggelkow & Chi-Hyon Lee & Manuela N. Hoehn-Weiss & Samina Karim, 2016. "Grouping interdependent tasks: Using spectral graph partitioning to study complex systems," Strategic Management Journal, Wiley Blackwell, vol. 37(1), pages 177-191, January.
  66. Qiang Ji & Dayong Zhang & Yuqian Zhao, 2022. "Intra-day co-movements of crude oil futures: China and the international benchmarks," Annals of Operations Research, Springer, vol. 313(1), pages 77-103, June.
  67. Mingjin Yan & Keying Ye, 2007. "Determining the Number of Clusters Using the Weighted Gap Statistic," Biometrics, The International Biometric Society, vol. 63(4), pages 1031-1037, December.
  68. Ertl, Antal & Horn, Dániel & Kiss, Hubert János, 2024. "Economic Preferences across Generations and Family Clusters: A Comment," I4R Discussion Paper Series 105, The Institute for Replication (I4R).
  69. Vainora, J., 2024. "Latent Position-Based Modeling of Parameter Heterogeneity," Cambridge Working Papers in Economics 2455, Faculty of Economics, University of Cambridge.
  70. Paul, Biplab & De, Shyamal K. & Ghosh, Anil K., 2022. "Some clustering-based exact distribution-free k-sample tests applicable to high dimension, low sample size data," Journal of Multivariate Analysis, Elsevier, vol. 190(C).