
A Bayesian mixture model for clustering circular data

Author

Listed:
  • Rodríguez, Carlos E.
  • Núñez-Antonio, Gabriel
  • Escarela, Gabriel

Abstract

Clustering complex circular phenomena is a common problem in different scientific disciplines. Examples include the clustering of directions of animal movement in the wild to identify migration patterns, and the classification of angular positions of meteorological events to investigate seasonality fluctuations. The main goal is to develop a novel methodology for clustering and classification of circular data, under a Bayesian mixture modeling framework. The mixture model is defined assuming that the number of components is finite, but unknown, and that each component follows a projected normal distribution. Model selection is performed by jointly making inferences about the parameters of the mixture model and the number of components, choosing the model with the highest posterior probability. A deterministic relabeling strategy is used to recover identifiability for the components in the chosen model. Estimates of both the posterior classification probabilities and the scaled densities are approximated via the relabeled MCMC output. The proposed methods are illustrated using both simulated and real datasets, and performance comparisons with existing strategies are also given. The results suggest that the new approach is an appealing alternative for the clustering and classification of circular data.
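To make the abstract's ingredients concrete, the following is a minimal sketch of a projected normal mixture on the circle and of the posterior classification probabilities it induces. It is not the authors' implementation. It assumes each component is the projection of a bivariate normal with identity covariance (the abstract does not state the covariance structure). It also assumes that the MCMC draws have already been relabeled and share the same number of components. The function names `pn_density`, `classification_probs` and `averaged_classification_probs` are hypothetical.

```python
import math


def pn_density(theta, mu):
    """Density at angle theta of the projected normal obtained by
    projecting N2(mu, I) onto the unit circle (identity covariance is
    an assumption of this sketch)."""
    # b is the projection of mu onto the unit vector at angle theta.
    b = math.cos(theta) * mu[0] + math.sin(theta) * mu[1]
    norm_sq = mu[0] ** 2 + mu[1] ** 2
    # Standard normal CDF evaluated at b.
    cdf_b = 0.5 * (1.0 + math.erf(b / math.sqrt(2.0)))
    # b * Phi(b) / phi(b), written without dividing by a tiny phi(b).
    ratio = b * math.sqrt(2.0 * math.pi) * cdf_b * math.exp(0.5 * b * b)
    return math.exp(-0.5 * norm_sq) * (1.0 + ratio) / (2.0 * math.pi)


def classification_probs(theta, weights, mus):
    """P(component k | theta) for one fixed set of mixture parameters."""
    scaled = [w * pn_density(theta, mu) for w, mu in zip(weights, mus)]
    total = sum(scaled)
    return [s / total for s in scaled]


def averaged_classification_probs(theta, draws):
    """Average the classification probabilities over MCMC draws.

    draws is a list of (weights, mus) pairs, assumed to be relabeled
    already and to have the same number of components."""
    k = len(draws[0][0])
    sums = [0.0] * k
    for weights, mus in draws:
        for j, p in enumerate(classification_probs(theta, weights, mus)):
            sums[j] += p
    return [s / len(draws) for s in sums]


if __name__ == "__main__":
    # Two components pointing in opposite directions (illustrative values).
    weights = [0.5, 0.5]
    mus = [(2.0, 0.0), (-2.0, 0.0)]
    print(classification_probs(0.0, weights, mus))
```

With a mean vector of zero the density reduces to the circular uniform, 1/(2π), which gives a quick sanity check on the formula. Averaging per-draw probabilities is only meaningful after relabeling, because label switching would otherwise mix components across draws.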

Suggested Citation

  • Rodríguez, Carlos E. & Núñez-Antonio, Gabriel & Escarela, Gabriel, 2020. "A Bayesian mixture model for clustering circular data," Computational Statistics & Data Analysis, Elsevier, vol. 143(C).
  • Handle: RePEc:eee:csdana:v:143:y:2020:i:c:s0167947319301975
    DOI: 10.1016/j.csda.2019.106842

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167947319301975
    Download Restriction: Full text for ScienceDirect subscribers only.

    File URL: https://libkey.io/10.1016/j.csda.2019.106842?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. Lawrence Hubert & Phipps Arabie, 1985. "Comparing partitions," Journal of Classification, Springer;The Classification Society, vol. 2(1), pages 193-218, December.
    2. McVinish, R. & Mengersen, K., 2008. "Semiparametric Bayesian circular statistics," Computational Statistics & Data Analysis, Elsevier, vol. 52(10), pages 4722-4730, June.
    3. Sylvia Richardson & Peter J. Green, 1997. "On Bayesian Analysis of Mixtures with an Unknown Number of Components (with discussion)," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 59(4), pages 731-792.
    4. Chang, Fang & Qiu, Weiliang & Zamar, Ruben H. & Lazarus, Ross & Wang, Xiaogang, 2010. "clues: An R Package for Nonparametric Clustering Based on Local Shrinking," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i04).
    5. S. Richardson & P. J. Green, 1998. "Corrigendum: On Bayesian analysis of mixtures with an unknown number of components," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 60(3), pages 661-661.
    6. Ulric Lund, 1999. "Least circular distance regression for directional data," Journal of Applied Statistics, Taylor & Francis Journals, vol. 26(6), pages 723-733.
    7. Kaushik Ghosh & Rao Jammalamadaka & Ram Tiwari, 2003. "Semiparametric Bayesian Techniques for Problems in Circular Data," Journal of Applied Statistics, Taylor & Francis Journals, vol. 30(2), pages 145-161.
    8. Matthew Stephens, 2000. "Dealing with label switching in mixture models," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 62(4), pages 795-809.

    Citations

    Cited by:

    1. Andrade, Ana C.C. & Pereira, Gustavo H.A. & Artes, Rinaldo, 2023. "The circular quantile residual," Computational Statistics & Data Analysis, Elsevier, vol. 178(C).
    2. Arthur Pewsey & Eduardo García-Portugués, 2021. "Recent advances in directional statistics," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(1), pages 1-58, March.
    3. Dong, Aqi & Melnykov, Volodymyr, 2024. "Contaminated Kent mixture model for clustering non-spherical directional data with heavy tails or scatter," Statistics & Probability Letters, Elsevier, vol. 208(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yuan Fang & Dimitris Karlis & Sanjeena Subedi, 2022. "Infinite Mixtures of Multivariate Normal-Inverse Gaussian Distributions for Clustering of Skewed Data," Journal of Classification, Springer;The Classification Society, vol. 39(3), pages 510-552, November.
    2. McVinish, R. & Mengersen, K., 2008. "Semiparametric Bayesian circular statistics," Computational Statistics & Data Analysis, Elsevier, vol. 52(10), pages 4722-4730, June.
    3. You, Na & Dai, Hongsheng & Wang, Xueqin & Yu, Qingyun, 2024. "Sequential estimation for mixture of regression models for heterogeneous population," Computational Statistics & Data Analysis, Elsevier, vol. 194(C).
    4. Naderi, Mehrdad & Mirfarah, Elham & Wang, Wan-Lun & Lin, Tsung-I, 2023. "Robust mixture regression modeling based on the normal mean-variance mixture distributions," Computational Statistics & Data Analysis, Elsevier, vol. 180(C).
    5. Wan-Lun Wang, 2019. "Mixture of multivariate t nonlinear mixed models for multiple longitudinal data with heterogeneity and missing values," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 28(1), pages 196-222, March.
    6. Aßmann, Christian & Boysen-Hogrefe, Jens & Pape, Markus, 2012. "The directional identification problem in Bayesian factor analysis: An ex-post approach," Kiel Working Papers 1799, Kiel Institute for the World Economy (IfW Kiel).
    7. Ungolo, Francesco & Kleinow, Torsten & Macdonald, Angus S., 2020. "A hierarchical model for the joint mortality analysis of pension scheme data with missing covariates," Insurance: Mathematics and Economics, Elsevier, vol. 91(C), pages 68-84.
    8. Park, Byung-Jung & Zhang, Yunlong & Lord, Dominique, 2010. "Bayesian mixture modeling approach to account for heterogeneity in speed data," Transportation Research Part B: Methodological, Elsevier, vol. 44(5), pages 662-673, June.
    9. Wang, Ketong & Porter, Michael D., 2018. "Optimal Bayesian clustering using non-negative matrix factorization," Computational Statistics & Data Analysis, Elsevier, vol. 128(C), pages 395-411.
    10. Alessio Farcomeni & Antonio Punzo, 2020. "Robust model-based clustering with mild and gross outliers," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(4), pages 989-1007, December.
    11. Kozumi, Hideo, 2004. "Posterior analysis of latent competing risk models by parallel tempering," Computational Statistics & Data Analysis, Elsevier, vol. 46(3), pages 441-458, June.
    12. Im, Yunju & Tan, Aixin, 2021. "Bayesian subgroup analysis in regression using mixture models," Computational Statistics & Data Analysis, Elsevier, vol. 162(C).
    13. Kazuhiko Kakamu, 2022. "Bayesian analysis of mixtures of lognormal distribution with an unknown number of components from grouped data," Papers 2210.05115, arXiv.org, revised Sep 2023.
    14. Murray, Paula M. & Browne, Ryan P. & McNicholas, Paul D., 2017. "Hidden truncation hyperbolic distributions, finite mixtures thereof, and their application for clustering," Journal of Multivariate Analysis, Elsevier, vol. 161(C), pages 141-156.
    15. Sylvia Frühwirth-Schnatter & Gertraud Malsiner-Walli, 2019. "From here to infinity: sparse finite versus Dirichlet process mixtures in model-based clustering," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(1), pages 33-64, March.
    16. Moya, Blake & Walker, Stephen G., 2024. "Full uncertainty analysis for Bayesian nonparametric mixture models," Computational Statistics & Data Analysis, Elsevier, vol. 189(C).
    17. Jia-Chiun Pan & Chih-Min Liu & Hai-Gwo Hwu & Guan-Hua Huang, 2015. "Allocation Variable-Based Probabilistic Algorithm to Deal with Label Switching Problem in Bayesian Mixture Models," PLOS ONE, Public Library of Science, vol. 10(10), pages 1-23, October.
    18. Weber, Anett & Steiner, Winfried J., 2021. "Modeling price response from retail sales: An empirical comparison of models with different representations of heterogeneity," European Journal of Operational Research, Elsevier, vol. 294(3), pages 843-859.
    19. Jonathan Jaeger & Philippe Lambert, 2014. "Bayesian penalized smoothing approaches in models specified using differential equations with unknown error distributions," Journal of Applied Statistics, Taylor & Francis Journals, vol. 41(12), pages 2709-2726, December.
    20. Riccardo Rastelli & Michael Fop, 2020. "A stochastic block model for interaction lengths," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 14(2), pages 485-512, June.
