Printed from https://ideas.repec.org/a/taf/japsta/v38y2011i5p1021-1032.html

Classification with discrete and continuous variables via general mixed-data models

Authors
  • A. R. de Leon
  • A. Soo
  • T. Williamson

Abstract

We study the problem of classifying an individual into one of several populations based on mixed nominal, continuous, and ordinal data. Specifically, we obtain a classification procedure as an extension to the so-called location linear discriminant function, by specifying a general mixed-data model for the joint distribution of the mixed discrete and continuous variables. We outline methods for estimating misclassification error rates. Results of simulations of the performance of proposed classification rules in various settings vis-à-vis a robust mixed-data discrimination method are reported as well. We give an example utilizing data on croup in children.
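For orientation, the classical location linear discriminant rule that the abstract refers to can be sketched as below. This is not the authors' general mixed-data procedure, which also accommodates ordinal variables. It is a minimal sketch of the textbook location model, under these assumptions: within each cell of the discrete variables, the continuous variables are normal with a cell-specific mean and a covariance matrix common to all cells and populations, and an observation is assigned to the population with the largest log prior plus log cell probability minus half the squared Mahalanobis distance. The function names, the argument layout, and the choice to pass the precision (inverse covariance) matrix directly are hypothetical choices made for illustration.

```python
import math


def quadratic_form(d, precision):
    """Return d' P d for a vector d and a square matrix P (nested lists)."""
    n = len(d)
    return sum(d[a] * precision[a][b] * d[b] for a in range(n) for b in range(n))


def location_score(y, cell, means_i, precision, cell_probs_i, prior_i):
    """Log-score of one population for an observation (cell, y).

    means_i[cell]      : mean of the continuous variables in this cell
    precision          : inverse of the common covariance matrix (assumed known)
    cell_probs_i[cell] : probability of the discrete cell in this population
    prior_i            : prior probability of this population
    """
    d = [yv - mv for yv, mv in zip(y, means_i[cell])]
    return (math.log(prior_i)
            + math.log(cell_probs_i[cell])
            - 0.5 * quadratic_form(d, precision))


def classify(y, cell, means, precision, cell_probs, priors):
    """Return the index of the population with the largest location-model score."""
    scores = [
        location_score(y, cell, means[i], precision, cell_probs[i], priors[i])
        for i in range(len(means))
    ]
    return max(range(len(scores)), key=lambda i: scores[i])
```

In practice the means, cell probabilities, and covariance would be estimated from training samples. With two populations, equal priors, and equal costs, comparing the two scores reduces to a linear function of the continuous variables whose cutoff depends on the observed cell.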

Suggested Citation

  • A. R. de Leon & A. Soo & T. Williamson, 2011. "Classification with discrete and continuous variables via general mixed-data models," Journal of Applied Statistics, Taylor & Francis Journals, vol. 38(5), pages 1021-1032, February.
  • Handle: RePEc:taf:japsta:v:38:y:2011:i:5:p:1021-1032
    DOI: 10.1080/02664761003758976
