Printed from https://ideas.repec.org/a/eee/csdana/v53y2009i8p3194-3208.html

Clustering and disjoint principal component analysis

Author

Listed:
  • Vichi, Maurizio
  • Saporta, Gilbert

Abstract

A constrained principal component analysis, which aims at a simultaneous clustering of objects and a partitioning of variables, is proposed. The new methodology allows us to identify components with maximum variance, each one a linear combination of a subset of variables. All the subsets form a partition of variables. Simultaneously, a partition of objects is also computed maximizing the between cluster variance. The methodology is formulated in a semi-parametric least-squares framework as a quadratic mixed continuous and integer problem. An alternating least-squares algorithm is proposed to solve the clustering and disjoint PCA. Two applications are given to show the features of the methodology.
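To make the constraint in the abstract concrete, the following is a minimal sketch of one ingredient only: computing disjoint component loadings when both partitions are held fixed. It is not the authors' alternating least-squares algorithm, and the full text was not consulted. The function names (`center_columns`, `centroid_rows`, `leading_eigenvector`, `disjoint_loadings`, `between_deviance`), the toy data, and the use of power iteration are all assumptions made for illustration. Under this reading, each component is the leading eigenvector of the between-cluster scatter matrix restricted to one subset of variables, so every variable loads on exactly one component.

```python
import math
import random


def center_columns(X):
    """Subtract each column mean so that every variable has mean zero."""
    n, J = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(J)]
    return [[row[j] - means[j] for j in range(J)] for row in X]


def centroid_rows(X, labels):
    """Replace every object by the centroid of the cluster it belongs to."""
    J = len(X[0])
    sums, counts = {}, {}
    for row, g in zip(X, labels):
        acc = sums.setdefault(g, [0.0] * J)
        for j in range(J):
            acc[j] += row[j]
        counts[g] = counts.get(g, 0) + 1
    return [[sums[g][j] / counts[g] for j in range(J)] for g in labels]


def leading_eigenvector(S, iters=1000, seed=0):
    """Power iteration on a symmetric positive semi-definite matrix S.

    Assumption: the seeded random start is not orthogonal to the leading
    eigenvector, which holds for generic inputs.
    """
    m = len(S)
    rng = random.Random(seed)
    v = [rng.uniform(0.5, 1.5) for _ in range(m)]
    for _ in range(iters):
        w = [sum(S[i][k] * v[k] for k in range(m)) for i in range(m)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # S is the zero matrix: keep the current direction
            break
        v = [x / norm for x in w]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def disjoint_loadings(X, labels, var_groups):
    """Loading matrix A (J x Q) with one non-zero entry per row.

    labels[i] is the cluster of object i; var_groups[j] is the component
    (0..Q-1) that variable j is assigned to. Both partitions are fixed here.
    """
    Z = centroid_rows(center_columns(X), labels)
    J, Q = len(var_groups), max(var_groups) + 1
    A = [[0.0] * Q for _ in range(J)]
    for q in range(Q):
        idx = [j for j in range(J) if var_groups[j] == q]
        # Between-cluster scatter restricted to the variables of group q.
        S = [[sum(z[a] * z[b] for z in Z) for b in idx] for a in idx]
        for j, value in zip(idx, leading_eigenvector(S)):
            A[j][q] = value
    return A


def between_deviance(X, labels, A):
    """Between-cluster deviance of the centroids projected onto A."""
    Z = centroid_rows(center_columns(X), labels)
    Q = len(A[0])
    return sum(
        sum(z[j] * A[j][q] for j in range(len(z))) ** 2
        for z in Z
        for q in range(Q)
    )


# Hypothetical toy data: 4 objects, 4 variables, 2 clusters, 2 variable groups.
X = [
    [1.0, 1.2, 5.0, 4.8],
    [0.9, 1.1, 5.2, 5.1],
    [3.0, 3.1, 1.0, 0.9],
    [3.2, 2.9, 1.1, 1.2],
]
labels = [0, 0, 1, 1]
var_groups = [0, 0, 1, 1]
A = disjoint_loadings(X, labels, var_groups)
deviance = between_deviance(X, labels, A)
```

Because the column supports are disjoint, the columns of `A` are orthogonal by construction, and each is scaled to unit length. A complete method would alternate a step like this with updates of the object partition and the variable partition; in practice a library eigen-solver would likely replace the power iteration.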

Suggested Citation

  • Vichi, Maurizio & Saporta, Gilbert, 2009. "Clustering and disjoint principal component analysis," Computational Statistics & Data Analysis, Elsevier, vol. 53(8), pages 3194-3208, June.
  • Handle: RePEc:eee:csdana:v:53:y:2009:i:8:p:3194-3208

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167-9473(08)00293-4
    Download Restriction: Full text for ScienceDirect subscribers only.


    Citations


    Cited by:

    1. Blasius, J. & Greenacre, M. & Groenen, P.J.F. & van de Velden, M., 2009. "Special issue on correspondence analysis and related methods," Computational Statistics & Data Analysis, Elsevier, vol. 53(8), pages 3103-3106, June.
    2. Cristina Tortora & Mireille Gettler Summa & Marina Marino & Francesco Palumbo, 2016. "Factor probabilistic distance clustering (FPDC): a new clustering method," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 10(4), pages 441-464, December.
    3. Alfonso Iodice D’Enza & Francesco Palumbo, 2013. "Iterative factor clustering of binary data," Computational Statistics, Springer, vol. 28(2), pages 789-807, April.
    4. Nickolay T. Trendafilov & Tsegay Gebrehiwot Gebru, 2016. "Recipes for sparse LDA of horizontal data," METRON, Springer;Sapienza Università di Roma, vol. 74(2), pages 207-221, August.
    5. Vanessa Kuentz-Simonet & Amaury Labenne & Tina Rambonilaza, 2017. "Using ClustOfVar to Construct Quality of Life Indicators for Vulnerability Assessment Municipality Trajectories in Southwest France from 1999 to 2009," Social Indicators Research: An International and Interdisciplinary Journal for Quality-of-Life Measurement, Springer, vol. 131(3), pages 973-997, April.
    6. Kohei Adachi & Nickolay T. Trendafilov, 2018. "Sparsest factor analysis for clustering variables: a matrix decomposition approach," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 12(3), pages 559-585, September.
    7. Maurizio Vichi, 2017. "Disjoint factor analysis with cross-loadings," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 11(3), pages 563-591, September.
    8. Naoto Yamashita, 2023. "Principal component analysis constrained by layered simple structures," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(2), pages 347-367, June.
    9. Nerea González-García & Ana Belén Nieto-Librero & Purificación Galindo-Villardón, 2023. "CenetBiplot: a new proposal of sparse and orthogonal biplots methods by means of elastic net CSVD," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 17(1), pages 5-19, March.
    10. Zhang, Bo & Chen, G.Q. & Xia, X.H. & Li, S.C. & Chen, Z.M. & Ji, Xi, 2012. "Environmental emissions by Chinese industry: Exergy-based unifying assessment," Energy Policy, Elsevier, vol. 45(C), pages 490-501.
    11. José Fernando Romero Cañizares & Purificación Vicente Galindo & Yannis Phillis & Evangelos Grigoroudis, 2022. "Graphical sustainability analysis using disjoint biplots," Operational Research, Springer, vol. 22(2), pages 1575-1596, April.
    12. Carlos Martin-Barreiro & John A. Ramirez-Figueroa & Ana B. Nieto-Librero & Víctor Leiva & Ana Martin-Casado & M. Purificación Galindo-Villardón, 2021. "A New Algorithm for Computing Disjoint Orthogonal Components in the Three-Way Tucker Model," Mathematics, MDPI, vol. 9(3), pages 1-22, January.
    13. Bauer, Jan O. & Drabant, Bernhard, 2023. "Regression based thresholds in principal loading analysis," Journal of Multivariate Analysis, Elsevier, vol. 193(C).
    14. Adelaide Freitas & Eloísa Macedo & Maurizio Vichi, 2021. "An empirical comparison of two approaches for CDPCA in high-dimensional data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(3), pages 1007-1031, September.
    15. Lazhar Labiod & Mohamed Nadif, 2021. "Efficient regularized spectral data embedding," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 15(1), pages 99-119, March.
    16. Carlo Cavicchia & Maurizio Vichi & Giorgia Zaccaria, 2023. "Hierarchical disjoint principal component analysis," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 107(3), pages 537-574, September.
    17. Carlos Martin-Barreiro & John A. Ramirez-Figueroa & Xavier Cabezas & Victor Leiva & Ana Martin-Casado & M. Purificación Galindo-Villardón, 2021. "A New Algorithm for Computing Disjoint Orthogonal Components in the Parallel Factor Analysis Model with Simulations and Applications to Real-World Data," Mathematics, MDPI, vol. 9(17), pages 1-22, August.
    18. Yannis Yatracos, 2013. "Detecting Clusters in the Data from Variance Decompositions of Its Projections," Journal of Classification, Springer;The Classification Society, vol. 30(1), pages 30-55, April.
    19. Gianluca Sottile & Giada Adelfio, 2019. "Clusters of effects curves in quantile regression models," Computational Statistics, Springer, vol. 34(2), pages 551-569, June.
    20. Fidele Tugizimana & Paul A Steenkamp & Lizelle A Piater & Ian A Dubery, 2014. "Multi-Platform Metabolomic Analyses of Ergosterol-Induced Dynamic Changes in Nicotiana tabacum Cells," PLOS ONE, Public Library of Science, vol. 9(1), pages 1-18, January.
    21. Donatella Vicari & Paolo Giordani, 2023. "CPclus: Candecomp/Parafac Clustering Model for Three-Way Data," Journal of Classification, Springer;The Classification Society, vol. 40(2), pages 432-465, July.
    22. Nickolay Trendafilov, 2014. "From simple structure to sparse components: a review," Computational Statistics, Springer, vol. 29(3), pages 431-454, June.
    23. Mitzi Cubilla-Montilla & Ana Belén Nieto-Librero & M. Purificación Galindo-Villardón & Carlos A. Torres-Cubilla, 2021. "Sparse HJ Biplot: A New Methodology via Elastic Net," Mathematics, MDPI, vol. 9(11), pages 1-15, June.
    24. Jérome SARACCO & Marie CHAVENT & Vanessa KUENTZ, 2010. "Clustering of categorical variables around latent variables," Cahiers du GREThA (2007-2019) 2010-02, Groupe de Recherche en Economie Théorique et Appliquée (GREThA).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. DeSarbo, Wayne S. & Selin Atalay, A. & Blanchard, Simon J., 2009. "A three-way clusterwise multidimensional unfolding procedure for the spatial representation of context dependent preferences," Computational Statistics & Data Analysis, Elsevier, vol. 53(8), pages 3217-3230, June.
    2. J. Vera & Rodrigo Macías & Willem Heiser, 2013. "Cluster Differences Unfolding for Two-Way Two-Mode Preference Rating Data," Journal of Classification, Springer;The Classification Society, vol. 30(3), pages 370-396, October.
    3. Van Mechelen, Iven & Schepers, Jan, 2007. "A unifying model involving a categorical and/or dimensional reduction for multimode data," Computational Statistics & Data Analysis, Elsevier, vol. 52(1), pages 537-549, September.
    4. Roberto Rocci & Stefano Gattone & Maurizio Vichi, 2011. "A New Dimension Reduction Method: Factor Discriminant K-means," Journal of Classification, Springer;The Classification Society, vol. 28(2), pages 210-226, July.
    5. Naoto Yamashita & Shin-ichi Mayekawa, 2015. "A new biplot procedure with joint classification of objects and variables by fuzzy c-means clustering," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 9(3), pages 243-266, September.
    6. Vera, J. Fernando & Macías, Rodrigo & Heiser, Willem J., 2009. "A dual latent class unfolding model for two-way two-mode preference rating data," Computational Statistics & Data Analysis, Elsevier, vol. 53(8), pages 3231-3244, June.
    7. J. Vera & Rodrigo Macías & Willem Heiser, 2009. "A Latent Class Multidimensional Scaling Model for Two-Way One-Mode Continuous Rating Dissimilarity Data," Psychometrika, Springer;The Psychometric Society, vol. 74(2), pages 297-315, June.
    8. José Fernando Romero Cañizares & Purificación Vicente Galindo & Yannis Phillis & Evangelos Grigoroudis, 2022. "Graphical sustainability analysis using disjoint biplots," Operational Research, Springer, vol. 22(2), pages 1575-1596, April.
    9. Vichi, Maurizio & Kiers, Henk A. L., 2001. "Factorial k-means analysis for two-way data," Computational Statistics & Data Analysis, Elsevier, vol. 37(1), pages 49-64, July.
    10. Dirk Depril & Iven Mechelen & Tom Wilderjans, 2012. "Lowdimensional Additive Overlapping Clustering," Journal of Classification, Springer;The Classification Society, vol. 29(3), pages 297-320, October.
    11. Donatella Vicari & Paolo Giordani, 2023. "CPclus: Candecomp/Parafac Clustering Model for Three-Way Data," Journal of Classification, Springer;The Classification Society, vol. 40(2), pages 432-465, July.
    12. Paolo Giordani & Henk Kiers, 2012. "FINDCLUS: Fuzzy INdividual Differences CLUStering," Journal of Classification, Springer;The Classification Society, vol. 29(2), pages 170-198, July.
    13. Roberto Rocci & Maurizio Vichi, 2005. "Three-Mode Component Analysis with Crisp or Fuzzy Partition of Units," Psychometrika, Springer;The Psychometric Society, vol. 70(4), pages 715-736, December.
    14. Timmerman, Marieke E. & Ceulemans, Eva & Kiers, Henk A.L. & Vichi, Maurizio, 2010. "Factorial and reduced K-means reconsidered," Computational Statistics & Data Analysis, Elsevier, vol. 54(7), pages 1858-1871, July.
    15. Luca Greco & Antonio Lucadamo & Pietro Amenta, 2020. "An Impartial Trimming Approach for Joint Dimension and Sample Reduction," Journal of Classification, Springer;The Classification Society, vol. 37(3), pages 769-788, October.
    16. Michio Yamamoto & Heungsun Hwang, 2017. "Dimension-Reduced Clustering of Functional Data via Subspace Separation," Journal of Classification, Springer;The Classification Society, vol. 34(2), pages 294-326, July.
    17. Bonhomme, Stéphane & Robin, Jean-Marc, 2009. "Consistent noisy independent component analysis," Journal of Econometrics, Elsevier, vol. 149(1), pages 12-25, April.
    18. Fernando Castelló-Sirvent & Pablo Pinazo-Dallenbach, 2021. "Corruption Shock in Mexico: fsQCA Analysis of Entrepreneurial Intention in University Students," Mathematics, MDPI, vol. 9(14), pages 1-31, July.
    19. Rodríguez-Fuentes, Carlos Javier & Hernández-López, Montserrat, 1997. "Análisis de diferencias estructurales interregionales determinantes en el impacto de la política monetaria," Estudios de Economia Aplicada, Estudios de Economia Aplicada, vol. 7, pages 141-157, Junio.
    20. Ivaldi, Enrico, 2013. "Proposal of a country risk index based on a factorial analysis - Una proposta di indice di rischio paese basato sull’analisi fattoriale: una applicazione ai paesi del sud del Mediterraneo e ai paesi d," Economia Internazionale / International Economics, Camera di Commercio Industria Artigianato Agricoltura di Genova, vol. 66(2), pages 231-249.
