Printed from https://ideas.repec.org/a/spr/jclass/v34y2017i2d10.1007_s00357-017-9232-z.html

Dimension-Reduced Clustering of Functional Data via Subspace Separation

Author

Listed:
  • Michio Yamamoto

    (Kyoto University Graduate School of Medicine)

  • Heungsun Hwang

    (McGill University)

Abstract

We propose a new method that simultaneously finds an optimal cluster structure of functions and an optimal subspace for clustering. The method minimizes a distance between functional objects and their projections, with clustering penalties imposed. It includes existing approaches to functional cluster analysis and dimension reduction, such as functional principal component k-means (Yamamoto, 2012) and functional factorial k-means (Yamamoto and Terada, 2014), as special cases. We show that these existing methods can perform poorly when a disturbing structure exists and that the proposed method can overcome this drawback by using subspace separation. We also propose a novel model selection procedure, which can be applied to other joint analyses of dimension reduction and clustering. We apply the proposed method to artificial and real data to demonstrate its performance compared with the existing approaches.
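
The general idea of jointly estimating a projection and a partition can be illustrated with a small sketch. This is not the authors' algorithm or objective: it assumes curves already discretized on a common grid (rows of `X`), and it uses a hypothetical weighted criterion `a * ||X - X A A'||^2 + b * ||X A - U C||^2`, where `A` has orthonormal columns, `U` is the cluster membership, and `C` holds the cluster means in the subspace. The function name and the weights `a` and `b` are illustrative assumptions.

```python
import numpy as np


def penalized_subspace_clustering(X, k, q, a=1.0, b=1.0, n_iter=30, seed=0):
    """Alternate between a q-dimensional projection A and k cluster labels.

    Illustrative criterion (an assumption, not the paper's objective):
        a * ||X - X A A'||^2 + b * ||X A - U C||^2,  with A'A = I.
    """
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=0)              # centre the discretized curves
    n = X.shape[0]
    labels = rng.integers(k, size=n)    # random initial partition

    for _ in range(n_iter):
        # Replace each row by its cluster mean (this is P_U X).
        PX = np.zeros_like(X)
        for g in range(k):
            idx = labels == g
            if idx.any():
                PX[idx] = X[idx].mean(axis=0)
        R = X - PX                      # within-cluster residuals

        # For fixed labels the criterion equals const + tr(A' M A),
        # so A is given by the eigenvectors of the q smallest eigenvalues.
        M = b * (R.T @ R) - a * (X.T @ X)
        _, vecs = np.linalg.eigh(M)     # eigenvalues in ascending order
        A = vecs[:, :q]

        # For fixed A, reassign each curve to the nearest centroid in the subspace.
        Z = X @ A
        centers = np.vstack([
            Z[labels == g].mean(axis=0) if np.any(labels == g)
            else Z[rng.integers(n)]     # reseed an empty cluster
            for g in range(k)
        ])
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels

    return labels, A
```

Under this parameterization, `a = 0, b = 1` reduces the projection step to the factorial k-means form of Vichi and Kiers (2001), and `a = b = 1` to a reduced k-means form. Like other alternating schemes, the sketch can stop at a local optimum, so several random starts (different `seed` values) would normally be compared.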

Suggested Citation

  • Michio Yamamoto & Heungsun Hwang, 2017. "Dimension-Reduced Clustering of Functional Data via Subspace Separation," Journal of Classification, Springer;The Classification Society, vol. 34(2), pages 294-326, July.
  • Handle: RePEc:spr:jclass:v:34:y:2017:i:2:d:10.1007_s00357-017-9232-z
    DOI: 10.1007/s00357-017-9232-z

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s00357-017-9232-z
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Robert Tibshirani & Guenther Walther & Trevor Hastie, 2001. "Estimating the number of clusters in a data set via the gap statistic," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 63(2), pages 411-423.
    2. Friedman, Jerome H. & Hastie, Trevor & Tibshirani, Rob, 2010. "Regularization Paths for Generalized Linear Models via Coordinate Descent," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 33(i01).
    3. Ocaña, F. A. & Aguilera, A. M. & Valderrama, M. J., 1999. "Functional Principal Components Analysis by Choice of Norm," Journal of Multivariate Analysis, Elsevier, vol. 71(2), pages 262-276, November.
    4. Vichi, Maurizio & Kiers, Henk A. L., 2001. "Factorial k-means analysis for two-way data," Computational Statistics & Data Analysis, Elsevier, vol. 37(1), pages 49-64, July.
    5. Mazumder, Rahul & Friedman, Jerome H. & Hastie, Trevor, 2011. "SparseNet: Coordinate Descent With Nonconvex Penalties," Journal of the American Statistical Association, American Statistical Association, vol. 106(495), pages 1125-1138.
    6. Lawrence Hubert & Phipps Arabie, 1985. "Comparing partitions," Journal of Classification, Springer;The Classification Society, vol. 2(1), pages 193-218, December.
    7. Reiss, Philip T. & Ogden, R. Todd, 2007. "Functional Principal Component Regression and Functional Partial Least Squares," Journal of the American Statistical Association, American Statistical Association, vol. 102, pages 984-996, September.
    8. Maurizio Vichi & Roberto Rocci & Henk A.L. Kiers, 2007. "Simultaneous Component and Clustering Models for Three-way Data: Within and Between Approaches," Journal of Classification, Springer;The Classification Society, vol. 24(1), pages 71-98, June.

    Citations



    Cited by:

    1. Virta, Joni & Li, Bing & Nordhausen, Klaus & Oja, Hannu, 2020. "Independent component analysis for multivariate functional data," Journal of Multivariate Analysis, Elsevier, vol. 176(C).
    2. Amandine Schmutz & Julien Jacques & Charles Bouveyron & Laurence Chèze & Pauline Martin, 2020. "Clustering multivariate functional data in group-specific functional subspaces," Computational Statistics, Springer, vol. 35(3), pages 1101-1131, September.
    3. Weikuan Jia & Dean Zhao & Ling Ding & Yuanjie Zheng, 2019. "A Reliable Small Sample Classification Algorithm by Elman Neural Network Based on PLS and GA," Journal of Classification, Springer;The Classification Society, vol. 36(2), pages 306-321, July.
    4. Matthieu Marbac & Mohammed Sedki & Etienne Patin, 2020. "Variable Selection for Mixed Data Clustering: Application in Human Population Genomics," Journal of Classification, Springer;The Classification Society, vol. 37(1), pages 124-142, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Roberto Rocci & Stefano Gattone & Maurizio Vichi, 2011. "A New Dimension Reduction Method: Factor Discriminant K-means," Journal of Classification, Springer;The Classification Society, vol. 28(2), pages 210-226, July.
    2. Naoto Yamashita & Shin-ichi Mayekawa, 2015. "A new biplot procedure with joint classification of objects and variables by fuzzy c-means clustering," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 9(3), pages 243-266, September.
    3. Fordellone, Mario & Vichi, Maurizio, 2020. "Finding groups in structural equation modeling through the partial least squares algorithm," Computational Statistics & Data Analysis, Elsevier, vol. 147(C).
    4. Donatella Vicari & Paolo Giordani, 2023. "CPclus: Candecomp/Parafac Clustering Model for Three-Way Data," Journal of Classification, Springer;The Classification Society, vol. 40(2), pages 432-465, July.
    5. Luca Greco & Antonio Lucadamo & Pietro Amenta, 2020. "An Impartial Trimming Approach for Joint Dimension and Sample Reduction," Journal of Classification, Springer;The Classification Society, vol. 37(3), pages 769-788, October.
    6. Li, Pai-Ling & Chiou, Jeng-Min, 2011. "Identifying cluster number for subspace projected functional data clustering," Computational Statistics & Data Analysis, Elsevier, vol. 55(6), pages 2090-2103, June.
    7. Yaeji Lim & Hee-Seok Oh & Ying Kuen Cheung, 2019. "Multiscale Clustering for Functional Data," Journal of Classification, Springer;The Classification Society, vol. 36(2), pages 368-391, July.
    8. Dong Liu & Changwei Zhao & Yong He & Lei Liu & Ying Guo & Xinsheng Zhang, 2023. "Simultaneous cluster structure learning and estimation of heterogeneous graphs for matrix‐variate fMRI data," Biometrics, The International Biometric Society, vol. 79(3), pages 2246-2259, September.
    9. DeSarbo, Wayne S. & Selin Atalay, A. & Blanchard, Simon J., 2009. "A three-way clusterwise multidimensional unfolding procedure for the spatial representation of context dependent preferences," Computational Statistics & Data Analysis, Elsevier, vol. 53(8), pages 3217-3230, June.
    10. Jin, Shaobo & Moustaki, Irini & Yang-Wallentin, Fan, 2018. "Approximated penalized maximum likelihood for exploratory factor analysis: an orthogonal case," LSE Research Online Documents on Economics 88118, London School of Economics and Political Science, LSE Library.
    11. Vichi, Maurizio & Saporta, Gilbert, 2009. "Clustering and disjoint principal component analysis," Computational Statistics & Data Analysis, Elsevier, vol. 53(8), pages 3194-3208, June.
    12. J. Fernando Vera & Rodrigo Macías, 2021. "On the Behaviour of K-Means Clustering of a Dissimilarity Matrix by Means of Full Multidimensional Scaling," Psychometrika, Springer;The Psychometric Society, vol. 86(2), pages 489-513, June.
    13. Ciarleglio, Adam & Todd Ogden, R., 2016. "Wavelet-based scalar-on-function finite mixture regression models," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 86-96.
    14. Zhiguang Huo & Li Zhu & Tianzhou Ma & Hongcheng Liu & Song Han & Daiqing Liao & Jinying Zhao & George Tseng, 2020. "Two-Way Horizontal and Vertical Omics Integration for Disease Subtype Discovery," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 12(1), pages 1-22, April.
    15. Nathalia Castellanos & Dhruv Desai & Sebastian Frank & Stefano Pasquali & Dhagash Mehta, 2024. "Can an unsupervised clustering algorithm reproduce a categorization system?," Papers 2408.10340, arXiv.org.
    16. Siwei Xia & Yuehan Yang & Hu Yang, 2022. "Sparse Laplacian Shrinkage with the Graphical Lasso Estimator for Regression Problems," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 31(1), pages 255-277, March.
    17. van de Velden, M. & Iodice D' Enza, A. & Palumbo, F., 2014. "Cluster Correspondence Analysis," Econometric Institute Research Papers EI 2014-24, Erasmus University Rotterdam, Erasmus School of Economics (ESE), Econometric Institute.
    18. Michio Yamamoto, 2012. "Clustering of functional data in a low-dimensional subspace," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 6(3), pages 219-247, October.
    19. Nilsen Gro & Borgan Ørnulf & Liestøl Knut & Lingjærde Ole Christian, 2013. "Identifying clusters in genomics data by recursive partitioning," Statistical Applications in Genetics and Molecular Biology, De Gruyter, vol. 12(5), pages 637-652, October.
    20. Liu, Wenchen & Tang, Yincai & Wu, Xianyi, 2020. "Separating variables to accelerate non-convex regularized optimization," Computational Statistics & Data Analysis, Elsevier, vol. 147(C).
