Printed from https://ideas.repec.org/a/bla/istatr/v81y2013i3p388-406.html

Proportional Odds Models with High‐Dimensional Data Structure

Authors

  • Faisal Maqbool Zahid
  • Gerhard Tutz

Abstract

The proportional odds model is the most widely used model when the response has ordered categories. With a high‐dimensional predictor structure, the common maximum likelihood approach typically fails when all predictors are included. A boosting technique, pomBoost, is proposed that fits the model by implicitly selecting the influential predictors. The approach distinguishes between metric and categorical predictors. For categorical predictors, where each predictor relates to a set of parameters, the objective is to select all the associated parameters simultaneously. The approach also distinguishes between nominal and ordinal predictors. For ordinal predictors, the proposed technique exploits the ordering of the categories by penalizing the difference between the parameters of adjacent categories. The technique also allows mandatory predictors (if any) to be specified that must be part of the final sparse model. The performance of the proposed boosting algorithm is evaluated in a simulation study and in applications with respect to mean squared error and prediction error. Hit rates and false alarm rates are used to judge the performance of pomBoost in selecting the relevant predictors.
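The quantities named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pomBoost implementation. The function names are hypothetical, and the sketch rests on three assumptions that the abstract does not confirm: the cumulative logit form P(Y <= r | x) = 1 / (1 + exp(-(theta_r - x'beta))), whose sign convention varies between texts; a squared form of the adjacent-category difference penalty with the reference category fixed at zero; and the usual definitions of hit rate (share of truly relevant predictors that are selected) and false alarm rate (share of irrelevant predictors that are selected).

```python
import numpy as np


def pom_category_probs(x, thresholds, beta):
    """Category probabilities under a proportional odds model.

    Assumed parameterization (sign conventions differ between texts):
    P(Y <= r | x) = 1 / (1 + exp(-(theta_r - x @ beta))),
    with increasing thresholds theta_1 < ... < theta_{k-1}.
    """
    thresholds = np.asarray(thresholds, dtype=float)
    eta = thresholds - float(np.dot(x, beta))
    cumulative = 1.0 / (1.0 + np.exp(-eta))
    # The last category closes the cumulative distribution at 1.
    cumulative = np.concatenate([cumulative, [1.0]])
    # Differences of cumulative probabilities give P(Y = r | x).
    return np.diff(cumulative, prepend=0.0)


def adjacent_difference_penalty(coefs, lam):
    """Penalty on differences between coefficients of adjacent categories.

    Assumption: squared differences, with the reference category's
    coefficient fixed at 0 and placed first.
    """
    full = np.concatenate([[0.0], np.asarray(coefs, dtype=float)])
    return lam * float(np.sum(np.diff(full) ** 2))


def hit_and_false_alarm_rates(true_relevant, selected):
    """Hit rate and false alarm rate of a variable selection result.

    Both arguments are boolean indicators, one entry per predictor.
    Assumes at least one relevant and one irrelevant predictor.
    """
    true_relevant = np.asarray(true_relevant, dtype=bool)
    selected = np.asarray(selected, dtype=bool)
    hit = (true_relevant & selected).sum() / true_relevant.sum()
    false_alarm = (~true_relevant & selected).sum() / (~true_relevant).sum()
    return float(hit), float(false_alarm)


if __name__ == "__main__":
    probs = pom_category_probs(np.array([1.0, 2.0]), [-1.0, 0.5], np.array([0.3, -0.2]))
    print(probs, probs.sum())
    print(adjacent_difference_penalty([1.0, 3.0], lam=2.0))
    print(hit_and_false_alarm_rates([True, True, False, False, False],
                                    [True, True, True, False, False]))
```

Under the difference penalty, an ordinal predictor whose adjacent categories have similar effects is penalized only lightly, so the ordering of the categories is used without forcing a linear trend.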

Suggested Citation

  • Faisal Maqbool Zahid & Gerhard Tutz, 2013. "Proportional Odds Models with High‐Dimensional Data Structure," International Statistical Review, International Statistical Institute, vol. 81(3), pages 388-406, December.
  • Handle: RePEc:bla:istatr:v:81:y:2013:i:3:p:388-406
    DOI: 10.1111/insr.12032

    Download full text from publisher

    File URL: https://doi.org/10.1111/insr.12032
    Download Restriction: no


    Citations

    Cited by:

    1. Ejike R. Ugba & Daniel Mörlein & Jan Gertheiss, 2021. "Smoothing in Ordinal Regression: An Application to Sensory Data," Stats, MDPI, vol. 4(3), pages 1-18, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Faisal Zahid & Gerhard Tutz, 2013. "Multinomial logit models with implicit variable selection," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 7(4), pages 393-416, December.
    2. Gerhard Tutz & Gunther Schauberger, 2015. "A Penalty Approach to Differential Item Functioning in Rasch Models," Psychometrika, Springer;The Psychometric Society, vol. 80(1), pages 21-43, March.
    3. Gerhard Tutz & Jan Gertheiss, 2014. "Rating Scales as Predictors—The Old Question of Scale Level and Some Answers," Psychometrika, Springer;The Psychometric Society, vol. 79(3), pages 357-376, July.
    4. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    5. Hess, Wolfgang & Persson, Maria & Rubenbauer, Stephanie & Gertheiss, Jan, 2013. "Using Lasso-Type Penalties to Model Time-Varying Covariate Effects in Panel Data Regressions – A Novel Approach Illustrated by the ‘Death of Distance’ in International Trade," Working Paper Series 961, Research Institute of Industrial Economics.
    6. Fabian Scheipl & Thomas Kneib & Ludwig Fahrmeir, 2013. "Penalized likelihood and Bayesian function selection in regression models," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 97(4), pages 349-385, October.
    7. Reher, Leonie & Runst, Petrik & Thomä, Jörg & Bizer, Kilian, 2024. "Measuring non-R&D drivers of innovation: The case of SMEs in lagging regions," ifh Working Papers 45/2024, Volkswirtschaftliches Institut für Mittelstand und Handwerk an der Universität Göttingen (ifh).
    8. Faisal Zahid & Gerhard Tutz, 2013. "Ridge estimation for multinomial logit models with symmetric side constraints," Computational Statistics, Springer, vol. 28(3), pages 1017-1034, June.
    9. Wei, Fengrong & Zhu, Hongxiao, 2012. "Group coordinate descent algorithms for nonconvex penalized regression," Computational Statistics & Data Analysis, Elsevier, vol. 56(2), pages 316-326.
    10. Zanhua Yin, 2020. "Variable selection for sparse logistic regression," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 83(7), pages 821-836, October.
    11. A. Karagrigoriou & C. Koukouvinos & K. Mylona, 2010. "On the advantages of the non-concave penalized likelihood model selection method with minimum prediction errors in large-scale medical studies," Journal of Applied Statistics, Taylor & Francis Journals, vol. 37(1), pages 13-24.
    12. Lichun Wang & Yuan You & Heng Lian, 2015. "Convergence and sparsity of Lasso and group Lasso in high-dimensional generalized linear models," Statistical Papers, Springer, vol. 56(3), pages 819-828, August.
    13. Pei Wang & Shunjie Chen & Sijia Yang, 2022. "Recent Advances on Penalized Regression Models for Biological Data," Mathematics, MDPI, vol. 10(19), pages 1-24, October.
    14. Canhong Wen & Zhenduo Li & Ruipeng Dong & Yijin Ni & Wenliang Pan, 2023. "Simultaneous Dimension Reduction and Variable Selection for Multinomial Logistic Regression," INFORMS Journal on Computing, INFORMS, vol. 35(5), pages 1044-1060, September.
    15. Yongxiu Cao & Jian Huang & Yanyan Liu & Xingqiu Zhao, 2016. "Sieve estimation of Cox models with latent structures," Biometrics, The International Biometric Society, vol. 72(4), pages 1086-1097, December.
    16. Yanfang Zhang & Chuanhua Wei & Xiaolin Liu, 2022. "Group Logistic Regression Models with l_{p,q} Regularization," Mathematics, MDPI, vol. 10(13), pages 1-15, June.
    17. Young Joo Yoon & Cheolwoo Park & Erik Hofmeister & Sangwook Kang, 2012. "Group variable selection in cardiopulmonary cerebral resuscitation data for veterinary patients," Journal of Applied Statistics, Taylor & Francis Journals, vol. 39(7), pages 1605-1621, January.
    18. Haibin Zhang & Juan Wei & Meixia Li & Jie Zhou & Miantao Chao, 2014. "On proximal gradient method for the convex problems regularized with the group reproducing kernel norm," Journal of Global Optimization, Springer, vol. 58(1), pages 169-188, January.
    19. Minh Pham & Xiaodong Lin & Andrzej Ruszczyński & Yu Du, 2021. "An outer–inner linearization method for non-convex and nondifferentiable composite regularization problems," Journal of Global Optimization, Springer, vol. 81(1), pages 179-202, September.
    20. Bang, Sungwan & Jhun, Myoungshic, 2012. "Simultaneous estimation and factor selection in quantile regression via adaptive sup-norm regularization," Computational Statistics & Data Analysis, Elsevier, vol. 56(4), pages 813-826.
