Printed from https://ideas.repec.org/a/spr/advdac/v17y2023i2d10.1007_s11634-022-00501-x.html

Modal clustering of matrix-variate data

Authors

  • Federico Ferraccioli

    (Università degli Studi di Padova)

  • Giovanna Menardi

    (Università degli Studi di Padova)

Abstract

Modal clustering, the nonparametric formulation of density-based clustering, identifies groups with the domains of attraction of the modes of the density function underlying the data. Its probabilistic foundation allows for a natural, yet nontrivial, generalization of the approach to the matrix-valued setting, which is increasingly common in, for example, longitudinal and multivariate spatio-temporal studies. In this work we introduce nonparametric estimators of matrix-variate distributions based on kernel methods and analyze their asymptotic properties. We also propose a generalization of the mean-shift procedure for identifying the modes of the estimated density. Because matrix-variate data are intrinsically high-dimensional, we discuss some locally adaptive solutions to handle this problem. We test the procedure through extensive simulations, including comparisons with some competitors, and illustrate its performance in two high-dimensional real-data applications.
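
To make the mode-seeking idea in the abstract concrete, here is a minimal sketch of mean-shift clustering applied to matrix-valued observations. It is not the authors' estimator: it assumes a Gaussian kernel on the Frobenius distance with a single fixed scalar bandwidth `h`, which amounts to ordinary mean shift on the vectorized matrices and ignores the matrix-variate kernels and locally adaptive bandwidths the paper discusses. All function names, tolerances, and the toy data are hypothetical.

```python
import math


def frob_sq(a, b):
    """Squared Frobenius distance between two equally sized matrices."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def mean_shift_step(x, data, h):
    """One mean-shift update: Gaussian-weighted average of the sample.

    Assumption: one scalar bandwidth h shared by all entries.
    """
    weights = [math.exp(-frob_sq(x, d) / (2.0 * h * h)) for d in data]
    total = sum(weights)
    rows, cols = len(x), len(x[0])
    return [
        [sum(w * d[i][j] for w, d in zip(weights, data)) / total
         for j in range(cols)]
        for i in range(rows)
    ]


def find_mode(x, data, h, tol=1e-8, max_iter=500):
    """Iterate the update from x until the shift is smaller than tol."""
    for _ in range(max_iter):
        nxt = mean_shift_step(x, data, h)
        if frob_sq(nxt, x) < tol ** 2:
            return nxt
        x = nxt
    return x


def modal_clusters(data, h, merge_tol=1e-3):
    """Label each observation by the mode its mean-shift path reaches."""
    modes, labels = [], []
    for obs in data:
        m = find_mode(obs, data, h)
        for k, known in enumerate(modes):
            if frob_sq(m, known) < merge_tol ** 2:
                labels.append(k)  # same attraction domain as a known mode
                break
        else:
            modes.append(m)  # first observation reaching this mode
            labels.append(len(modes) - 1)
    return labels, modes


# Toy sample (hypothetical): three 2x2 matrices near 0 and three near 5.
data = [
    [[0.0, 0.0], [0.0, 0.0]],
    [[0.1, 0.0], [0.0, 0.1]],
    [[-0.1, 0.1], [0.0, 0.0]],
    [[5.0, 5.0], [5.0, 5.0]],
    [[5.1, 5.0], [5.0, 4.9]],
    [[4.9, 5.0], [5.1, 5.0]],
]
labels, modes = modal_clusters(data, h=1.0)
print(labels, len(modes))
```

Observations whose mean-shift paths end at the same mode share a label, which mirrors the correspondence between clusters and attraction domains described above. The bandwidth drives the result: a larger `h` smooths the estimated density and merges modes, while a smaller `h` tends to produce more, smaller clusters.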

Suggested Citation

  • Federico Ferraccioli & Giovanna Menardi, 2023. "Modal clustering of matrix-variate data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(2), pages 323-345, June.
  • Handle: RePEc:spr:advdac:v:17:y:2023:i:2:d:10.1007_s11634-022-00501-x
    DOI: 10.1007/s11634-022-00501-x

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s11634-022-00501-x
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Salvatore D. Tomarchio & Paul D. McNicholas & Antonio Punzo, 2021. "Matrix Normal Cluster-Weighted Models," Journal of Classification, Springer;The Classification Society, vol. 38(3), pages 556-575, October.
    2. Alessandro Casa & Giovanna Menardi, 2022. "Nonparametric semi-supervised classification with application to signal detection in high energy physics," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 31(3), pages 531-550, September.
    3. Pieter C. Schoonees & Patrick J. F. Groenen & Michel Velden, 2022. "Least-squares bilinear clustering of three-way data," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 16(4), pages 1001-1037, December.
    4. Golovkine, Steven & Klutchnikoff, Nicolas & Patilea, Valentin, 2022. "Clustering multivariate functional data using unsupervised binary trees," Computational Statistics & Data Analysis, Elsevier, vol. 168(C).
    5. Simon Blanchard & Wayne DeSarbo, 2013. "A New Zero-Inflated Negative Binomial Methodology for Latent Category Identification," Psychometrika, Springer;The Psychometric Society, vol. 78(2), pages 322-340, April.
    6. José E. Chacón, 2020. "The Modal Age of Statistics," International Statistical Review, International Statistical Institute, vol. 88(1), pages 122-141, April.
    7. Philip A. White & Alan E. Gelfand, 2021. "Multivariate functional data modeling with time-varying clustering," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 30(3), pages 586-602, September.
    8. Lingzhe Guo & Reza Modarres, 2020. "Testing the equality of matrix distributions," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 29(2), pages 289-307, June.
    9. José E. Chacón, 2019. "Mixture model modal clustering," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 13(2), pages 379-404, June.
    10. Donatella Vicari & Paolo Giordani, 2023. "CPclus: Candecomp/Parafac Clustering Model for Three-Way Data," Journal of Classification, Springer;The Classification Society, vol. 40(2), pages 432-465, July.
    11. Leonardo Salvatore Alaimo & Francesco Amato & Filomena Maggino & Alfonso Piscitelli & Emiliano Seri, 2023. "A Comparison of Migrant Integration Policies via Mixture of Matrix-Normals," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 165(2), pages 473-494, January.
    12. Fang, Kuangnan & Chen, Yuanxing & Ma, Shuangge & Zhang, Qingzhao, 2022. "Biclustering analysis of functionals via penalized fusion," Journal of Multivariate Analysis, Elsevier, vol. 189(C).
    13. Xuwen Zhu & Yana Melnykov, 2022. "On Finite Mixture Modeling of Change-point Processes," Journal of Classification, Springer;The Classification Society, vol. 39(1), pages 3-22, March.
    14. Konstantin Eckle & Nicolai Bissantz & Holger Dette & Katharina Proksch & Sabrina Einecke, 2018. "Multiscale inference for a multivariate density with applications to X-ray astronomy," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 70(3), pages 647-689, June.
    15. Alex Sharp & Glen Chalatov & Ryan P. Browne, 2023. "A dual subspace parsimonious mixture of matrix normal distributions," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(3), pages 801-822, September.
    16. Rhoden, Imke & Weller, Daniel & Voit, Ann-Katrin, 2021. "Spatio-temporal dynamics of European innovation: An exploratory approach via multivariate functional data cluster analysis," Ruhr Economic Papers 926, RWI - Leibniz-Institut für Wirtschaftsforschung, Ruhr-University Bochum, TU Dortmund University, University of Duisburg-Essen.
    17. Wei Hu & Tianyu Pan & Dehan Kong & Weining Shen, 2021. "Nonparametric matrix response regression with application to brain imaging data analysis," Biometrics, The International Biometric Society, vol. 77(4), pages 1227-1240, December.
    18. Amovin-Assagba, Martial & Gannaz, Irène & Jacques, Julien, 2022. "Outlier detection in multivariate functional data through a contaminated mixture model," Computational Statistics & Data Analysis, Elsevier, vol. 174(C).
    19. Chacón, José E. & Fernández Serrano, Javier, 2024. "Bayesian taut splines for estimating the number of modes," Computational Statistics & Data Analysis, Elsevier, vol. 196(C).
    20. Mirosław Krzyśko & Łukasz Smaga, 2017. "An Application Of Functional Multivariate Regression Model To Multiclass Classification," Statistics in Transition New Series, Polish Statistical Association, vol. 18(3), pages 433-442, September.
