A fast and recursive algorithm for clustering large datasets with k-medians
DOI: 10.1016/j.csda.2011.11.019
References listed on IDEAS
- García-Treviño, E.S. & Barria, J.A., 2012. "Online wavelet-based density estimation for non-stationary streaming data," Computational Statistics & Data Analysis, Elsevier, vol. 56(2), pages 327-344.
- Luis García-Escudero & Alfonso Gordaliza & Carlos Matrán & Agustín Mayo-Iscar, 2010. "A review of robust clustering methods," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 4(2), pages 89-109, September.
- Monnez, Jean-Marie, 2006. "Almost sure convergence of stochastic gradient processes with matrix step sizes," Statistics & Probability Letters, Elsevier, vol. 76(5), pages 531-536, March.
- Croux, Christophe & Gallopoulos, Efstratios & Van Aelst, Stefan & Zha, Hongyuan, 2007. "Machine Learning and Robust Data Mining," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 151-154, September.
Citations
Cited by:
- Monnez, Jean-Marie & Skiredj, Abderrahman, 2021. "Widening the scope of an eigenvector stochastic approximation process and application to streaming PCA and related methods," Journal of Multivariate Analysis, Elsevier, vol. 182(C).
- Godichon-Baggioni, Antoine & Lu, Wei, 2024. "Online stochastic Newton methods for estimating the geometric median and applications," Journal of Multivariate Analysis, Elsevier, vol. 202(C).
- Hervé Cardot & Antoine Godichon-Baggioni, 2017. "Fast estimation of the median covariation matrix with application to online robust principal components analysis," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 26(3), pages 461-480, September.
- Gaunand, A. & Hocdé, A. & Lemarié, S. & Matt, M. & Turckheim, E.de, 2015. "How does public agricultural research impact society? A characterization of various patterns," Research Policy, Elsevier, vol. 44(4), pages 849-861.
- Godichon-Baggioni, Antoine, 2016. "Estimating the geometric median in Hilbert spaces with stochastic gradient algorithms: Lp and almost sure rates of convergence," Journal of Multivariate Analysis, Elsevier, vol. 146(C), pages 209-222.
Most related items
These are the items that most often cite the same works as this one and are cited by the same works as this one.
- Pierpaolo D’Urso & Livia De Giovanni & Riccardo Massari & Francesca G. M. Sica, 2019. "Cross Sectional and Longitudinal Fuzzy Clustering of the NUTS and Positioning of the Italian Regions with Respect to the Regional Competitiveness Index (RCI) Indicators with Contiguity Constraints," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 146(3), pages 609-650, December.
- García Treviño, E.S. & Alarcón Aquino, V. & Barria, J.A., 2019. "The radial wavelet frame density estimator," Computational Statistics & Data Analysis, Elsevier, vol. 130(C), pages 111-139.
- Rubin Daniel B., 2011. "A Calibrated Multiclass Extension of AdaBoost," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 10(1), pages 1-24, November.
- Brunet-Saumard, Camille & Genetay, Edouard & Saumard, Adrien, 2022. "K-bMOM: A robust Lloyd-type clustering algorithm based on bootstrap median-of-means," Computational Statistics & Data Analysis, Elsevier, vol. 167(C).
- Yang, Yu-Chen & Lin, Tsung-I & Castro, Luis M. & Wang, Wan-Lun, 2020. "Extending finite mixtures of t linear mixed-effects models with concomitant covariates," Computational Statistics & Data Analysis, Elsevier, vol. 148(C).
- Pietro Coretto & Christian Hennig, 2016. "Robust Improper Maximum Likelihood: Tuning, Computation, and a Comparison With Other Methods for Robust Gaussian Clustering," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1648-1659, October.
- C. Ruwet & L. García-Escudero & A. Gordaliza & A. Mayo-Iscar, 2012. "The influence function of the TCLUST robust clustering procedure," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 6(2), pages 107-130, July.
- Šárka Brodinová & Peter Filzmoser & Thomas Ortner & Christian Breiteneder & Maia Rohm, 2019. "Robust and sparse k-means clustering for high-dimensional data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(4), pages 905-932, December.
- Pierpaolo D’Urso & Livia Giovanni & Riccardo Massari, 2015. "Trimmed fuzzy clustering for interval-valued data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 9(1), pages 21-40, March.
- Luis García-Escudero & Alfonso Gordaliza & Carlos Matrán & Agustín Mayo-Iscar, 2010. "A review of robust clustering methods," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 4(2), pages 89-109, September.
- Sylvia Frühwirth-Schnatter, 2011. "Panel data analysis: a survey on model-based clustering of time series," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 5(4), pages 251-280, December.
- Slaets, Leen & Claeskens, Gerda & Hubert, Mia, 2012. "Phase and amplitude-based clustering for functional data," Computational Statistics & Data Analysis, Elsevier, vol. 56(7), pages 2360-2374.
- Ricardo Fraiman & Badih Ghattas & Marcela Svarc, 2013. "Interpretable clustering using unsupervised binary trees," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 7(2), pages 125-145, June.
- Alessio Farcomeni & Antonio Punzo, 2020. "Robust model-based clustering with mild and gross outliers," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(4), pages 989-1007, December.
- C. Ruwet & L. García-Escudero & A. Gordaliza & A. Mayo-Iscar, 2013. "On the breakdown behavior of the TCLUST clustering procedure," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 22(3), pages 466-487, September.
- Pierpaolo D’Urso & Livia Giovanni & Riccardo Massari & Dario Lallo, 2013. "Noise fuzzy clustering of time series by autoregressive metric," METRON, Springer;Sapienza Università di Roma, vol. 71(3), pages 217-243, November.
- Farnè, Matteo & Vouldis, Angelos T., 2018. "A methodology for automised outlier detection in high-dimensional datasets: an application to euro area banks' supervisory data," Working Paper Series 2171, European Central Bank.
- Marek A. Dąbrowski & Monika Papież & Sławomir Śmiech, 2020. "Classifying de facto exchange rate regimes of financially open and closed economies: A statistical approach," The Journal of International Trade & Economic Development, Taylor & Francis Journals, vol. 29(7), pages 821-849, October.
- Dąbrowski, Marek A. & Papież, Monika & Śmiech, Sławomir, 2019. "Classifying de facto exchange rate regimes of financially open and closed economies: A statistical approach," MPRA Paper 91348, University Library of Munich, Germany.
- Marco Riani & Andrea Cerioli & Domenico Perrotta & Francesca Torti, 2015. "Simulating mixtures of multivariate data with fixed cluster overlap in FSDA library," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 9(4), pages 461-481, December.
- Andrea Cappozzo & Francesca Greselin & Thomas Brendan Murphy, 2020. "A robust approach to model-based classification based on trimming and constraints," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(2), pages 327-354, June.
More about this item
Keywords
Averaging; High dimensional data; k-medoids; Online clustering; Partitioning around medoids; Recursive estimators; Robbins–Monro; Stochastic approximation; Stochastic gradient
Handle: RePEc:eee:csdana:v:56:y:2012:i:6:p:1434-1449
General contact details of provider: http://www.elsevier.com/locate/csda