
The analysis of distance of grouped data with categorical variables: Categorical canonical variate analysis

Author

Listed:
  • Le Roux, Niël J.
  • Gardner-Lubbe, Sugnet
  • Gower, John C.

Abstract

We use generalised biplots to develop the important special case in which (i) all variables are categorical and (ii) the samples fall into K recognised groups. We term this Categorical Canonical Variate Analysis (CatCVA), because it has similar characteristics to Rao’s Canonical Variate Analysis (CVA), especially its visual aspects. It allows centroids of groups to be exhibited in increasing numbers of dimensions, together with information on within-group sample variation. Variables are represented by category-level-points (CLPs), which are a counterpart of numerically calibrated biplot axes for quantitative variables. Mechanisms are provided for relating the samples to their category levels, for giving convex regions to help predict categories, and for adding new samples. Inter-sample distance may be measured by any Euclidean embeddable distance. Computation is minimised by working in the (K−1)-dimensional space containing the group centroids.
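
The last two sentences of the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes the simple mismatch count (the extended matching coefficient) as the squared inter-sample distance, which is Euclidean embeddable because it equals half the squared Euclidean distance between indicator-coded samples. The function names and the toy data are hypothetical. The sketch derives squared distances between the K group centroids from the sample distances alone and then applies classical scaling, which needs at most K−1 dimensions.

```python
import numpy as np

def mismatch_sq_dist(X):
    """Squared distance d_ij^2 = number of variables on which samples i and j differ."""
    X = np.asarray(X)
    return (X[:, None, :] != X[None, :, :]).sum(axis=2).astype(float)

def centroid_sq_dist(D2, groups):
    """Squared distances between group centroids, using only sample distances."""
    groups = np.asarray(groups)
    labels = np.unique(groups)
    G = (groups[:, None] == labels[None, :]).astype(float)  # n x K group indicator
    G /= G.sum(axis=0)                # each column now averages over one group
    M = G.T @ D2 @ G                  # mean squared distances between/within groups
    w = np.diag(M)                    # within-group means (self-pairs included)
    return M - 0.5 * (w[:, None] + w[None, :])

def pco(D2, tol=1e-10):
    """Classical scaling (principal coordinates) of a squared-distance matrix."""
    K = D2.shape[0]
    J = np.eye(K) - np.ones((K, K)) / K       # centring matrix
    B = -0.5 * J @ D2 @ J
    vals, vecs = np.linalg.eigh(B)
    order = np.argsort(vals)[::-1]            # largest eigenvalue first
    vals, vecs = vals[order], vecs[:, order]
    keep = vals > tol                         # drop null dimensions
    return vecs[:, keep] * np.sqrt(vals[keep])

# Toy data (assumed): 6 samples, 3 categorical variables coded as integers, K = 3 groups.
X = np.array([[0, 1, 2], [0, 1, 1], [1, 0, 2], [1, 0, 0], [2, 2, 1], [2, 1, 1]])
groups = np.array([0, 0, 1, 1, 2, 2])

D2 = mismatch_sq_dist(X)           # sample-level squared distances
C2 = centroid_sq_dist(D2, groups)  # K x K squared distances between centroids
Y = pco(C2)                        # centroid coordinates, at most K-1 columns
```

Only the K×K matrix C2 is decomposed, so the eigenproblem scales with the number of groups rather than the number of samples. Category-level points, within-group variation and prediction regions are not covered by this sketch.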

Suggested Citation

  • Le Roux, Niël J. & Gardner-Lubbe, Sugnet & Gower, John C., 2014. "The analysis of distance of grouped data with categorical variables: Categorical canonical variate analysis," Journal of Multivariate Analysis, Elsevier, vol. 132(C), pages 9-24.
  • Handle: RePEc:eee:jmvana:v:132:y:2014:i:c:p:9-24
    DOI: 10.1016/j.jmva.2014.07.014

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0047259X14001717
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.jmva.2014.07.014?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    References listed on IDEAS

    1. J. Gower & P. Legendre, 1986. "Metric and Euclidean properties of dissimilarity coefficients," Journal of Classification, Springer;The Classification Society, vol. 3(1), pages 5-48, March.
    2. Gardner, Sugnet & Gower, John C. & le Roux, N.J., 2006. "A synthesis of canonical variate analysis, generalised canonical correlation and Procrustes analysis," Computational Statistics & Data Analysis, Elsevier, vol. 50(1), pages 107-134, January.
    3. John Gower & Niel Roux & Sugnet Gardner-Lubbe, 2014. "The Canonical Analysis of Distance," Journal of Classification, Springer;The Classification Society, vol. 31(1), pages 107-128, April.

    Citations

    Cited by:

    1. John C. Gower & Niël J. Le Roux & Sugnet Gardner-Lubbe, 2022. "Properties of individual differences scaling and its interpretation," Statistical Papers, Springer, vol. 63(4), pages 1221-1245, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Guohuan Su & Adam Mertel & Sébastien Brosse & Justin M. Calabrese, 2023. "Species invasiveness and community invasibility of North American freshwater fish fauna revealed via trait-based analysis," Nature Communications, Nature, vol. 14(1), pages 1-12, December.
    2. Balepur, Prashant Narayan, 1998. "Impacts of Computer-Mediated Communication on Travel and Communication Patterns: The Davis Community Network Study," Institute of Transportation Studies, Research Reports, Working Papers, Proceedings qt6cb1f85c, Institute of Transportation Studies, UC Berkeley.
    3. Douglas L. Steinley & M. J. Brusco, 2019. "Using an Iterative Reallocation Partitioning Algorithm to Verify Test Multidimensionality," Journal of Classification, Springer;The Classification Society, vol. 36(3), pages 397-413, October.
    4. Anna Maria D’Arcangelis & Giulia Rotundo, 2016. "Complex Networks in Finance," Lecture Notes in Economics and Mathematical Systems, in: Pasquale Commendatore & Mariano Matilla-García & Luis M. Varela & Jose S. Cánovas (ed.), Complex Networks and Dynamics, pages 209-235, Springer.
    5. Carla Coltharp & Rene P Kessler & Jie Xiao, 2012. "Accurate Construction of Photoactivated Localization Microscopy (PALM) Images for Quantitative Measurements," PLOS ONE, Public Library of Science, vol. 7(12), pages 1-15, December.
    6. S. T. Buckland & Y. Yuan & E. Marcon, 2017. "Measuring temporal trends in biodiversity," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 101(4), pages 461-474, October.
    7. Matthijs Warrens, 2008. "On the Indeterminacy of Resemblance Measures for Binary (Presence/Absence) Data," Journal of Classification, Springer;The Classification Society, vol. 25(1), pages 125-136, June.
    8. Stefano Bonnini & Getnet Melak Assegie & Kamila Trzcinska, 2024. "Review about the Permutation Approach in Hypothesis Testing," Mathematics, MDPI, vol. 12(17), pages 1-29, August.
    9. Kivimäki, Ilkka & Shimbo, Masashi & Saerens, Marco, 2014. "Developments in the theory of randomized shortest paths with a comparison of graph node distances," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 393(C), pages 600-616.
    10. Alfonso Gutierrez-Lopez & Carlos Chávez & Carlos Díaz-Delgado, 2022. "Autocorrelation Ratio as a Measure of Inertia for the Classification of Extreme Events," Mathematics, MDPI, vol. 10(12), pages 1-15, June.
    11. Raffaella Calabrese & Galina Andreeva & Jake Ansell, 2019. "“Birds of a Feather” Fail Together: Exploring the Nature of Dependency in SME Defaults," Risk Analysis, John Wiley & Sons, vol. 39(1), pages 71-84, January.
    12. Jean-Baptiste Hasse, 2022. "Systemic risk: a network approach," Empirical Economics, Springer, vol. 63(1), pages 313-344, July.
    13. Yoshio Takane & Heungsun Hwang & Hervé Abdi, 2008. "Regularized Multiple-Set Canonical Correlation Analysis," Psychometrika, Springer;The Psychometric Society, vol. 73(4), pages 753-775, December.
    14. Heiberg, Jonas & Truffer, Bernhard & Binz, Christian, 2022. "Assessing transitions through socio-technical configuration analysis – a methodological framework and a case study in the water sector," Research Policy, Elsevier, vol. 51(1).
    15. Florian Schreiber, 2017. "Identification of customer groups in the German term life market: a benefit segmentation," Annals of Operations Research, Springer, vol. 254(1), pages 365-399, July.
    16. Joris Knoben & Leon A. G. Oerlemans & Annefleur R. Krijkamp & Keith G. Provan, 2018. "What Do They Know? The Antecedents of Information Accuracy Differentials in Interorganizational Networks," Organization Science, INFORMS, vol. 29(3), pages 471-488, June.
    17. Francesco Cannarile & Michele Compare & Francesco Di Maio & Enrico Zio, 2018. "A clustering approach for mining reliability big data for asset management," Journal of Risk and Reliability, vol. 232(2), pages 140-150, April.
    18. Rosetta Lombardo, 2016. "Is there also a North–South Divide in the Diffusion of Crime? A Cluster Analysis of Italian Provinces," Review of Development Economics, Wiley Blackwell, vol. 20(2), pages 443-455, May.
    19. Jean-Baptiste Hasse, 2020. "Systemic Risk: a Network Approach," Working Papers halshs-02893780, HAL.
    20. Vallejo-Arboleda, Amparo & Vicente-Villardon, Jose L. & Galindo-Villardon, M.P., 2007. "Canonical STATIS: Biplot analysis of multi-table group structured data based on STATIS-ACT methodology," Computational Statistics & Data Analysis, Elsevier, vol. 51(9), pages 4193-4205, May.
