Penalized Regression with Ordinal Predictors

Author

Listed:
  • Jan Gertheiss
  • Gerhard Tutz

Abstract

Ordered categorical predictors are a common case in regression modelling. In contrast to ordinal response variables, ordinal predictors have been largely neglected in the literature. In this paper, existing methods are reviewed and the use of penalized regression techniques is proposed. Based on dummy coding, two types of penalization are developed explicitly: the first imposes a difference penalty, the second is a ridge-type refitting procedure. A Bayesian motivation is also provided. The concept is generalized to non-normal outcomes within the framework of generalized linear models by applying penalized likelihood estimation. Simulation studies and real-world data serve for illustration and to compare the approaches with methods often seen in practice, namely simple linear regression on the group labels and pure dummy coding. The proposed difference penalty in particular turns out to be highly competitive.
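
The difference-penalty idea named in the abstract can be illustrated with a minimal sketch. The abstract does not state the formula, so the following is an assumption: the ordinal predictor is dummy coded with the first level as reference, and squared differences between coefficients of adjacent levels are penalized in a least-squares fit. The function name fit_difference_penalty, the tuning parameter lam, and the toy data are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np


def fit_difference_penalty(x, y, n_levels, lam):
    """Penalized least squares for one ordinal predictor (illustrative sketch).

    Assumptions (not taken from the abstract):
      * x holds integer level labels 1..n_levels,
      * level 1 is the reference category (its coefficient is fixed at 0),
      * the penalty is lam * sum_j (beta_j - beta_{j-1})^2 over adjacent levels,
      * the intercept is not penalized.
    Returns [intercept, beta_2, ..., beta_K].
    """
    x = np.asarray(x)
    y = np.asarray(y, dtype=float)
    n = len(y)

    # Design matrix: intercept column plus one dummy per non-reference level.
    X = np.zeros((n, n_levels))
    X[:, 0] = 1.0
    for level in range(2, n_levels + 1):
        X[:, level - 1] = (x == level)

    # First-order difference matrix acting on (beta_2, ..., beta_K).
    # Row 0 encodes beta_2 - beta_1 with beta_1 = 0; row i encodes beta_{i+2} - beta_{i+1}.
    k = n_levels - 1
    D = np.eye(k) - np.eye(k, k=-1)

    # Penalty matrix with a zero block for the unpenalized intercept.
    P = np.zeros((n_levels, n_levels))
    P[1:, 1:] = D.T @ D

    # Closed-form solution of the penalized normal equations.
    return np.linalg.solve(X.T @ X + lam * P, X.T @ y)


if __name__ == "__main__":
    # Toy data: four ordered levels, two observations each (made up for illustration).
    x = [1, 1, 2, 2, 3, 3, 4, 4]
    y = [1.0, 1.2, 2.0, 2.2, 2.9, 3.1, 4.5, 4.3]
    for lam in (0.0, 1.0, 100.0):
        print(lam, fit_difference_penalty(x, y, n_levels=4, lam=lam))
```

With lam = 0 the fit reduces to pure dummy coding (group means). As lam grows, the coefficients of adjacent levels are pulled together, and because the reference coefficient is fixed at zero the fit approaches a single common mean.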

Suggested Citation

  • Jan Gertheiss & Gerhard Tutz, 2009. "Penalized Regression with Ordinal Predictors," International Statistical Review, International Statistical Institute, vol. 77(3), pages 345-365, December.
  • Handle: RePEc:bla:istatr:v:77:y:2009:i:3:p:345-365
    DOI: 10.1111/j.1751-5823.2009.00088.x

    Download full text from publisher

    File URL: https://doi.org/10.1111/j.1751-5823.2009.00088.x
    Download Restriction: no


    References listed on IDEAS

    1. Robert Tibshirani & Michael Saunders & Saharon Rosset & Ji Zhu & Keith Knight, 2005. "Sparsity and smoothness via the fused lasso," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(1), pages 91-108, February.
    2. James H. Albert & Siddhartha Chib, 2001. "Sequential Ordinal Modeling with Applications to Survival Data," Biometrics, The International Biometric Society, vol. 57(3), pages 829-836, September.
    3. H. Myoken & Y. Uchida, 1977. "The generalized ridge estimator and improved adjustments for regression parameters," Metrika: International Journal for Theoretical and Applied Statistics, Springer, vol. 24(1), pages 113-124, December.
    4. Ivy Liu & Alan Agresti, 2005. "The analysis of ordered categorical data: An overview and a survey of recent developments," TEST: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 14(1), pages 1-73, June.
    5. Donald R. Jensen & Donald E. Ramirez, 2008. "Anomalies in the Foundations of Ridge Regression," International Statistical Review, International Statistical Institute, vol. 76(1), pages 89-105, April.

    Citations

    Cited by:

    1. Faisal Zahid & Gerhard Tutz, 2013. "Multinomial logit models with implicit variable selection," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 7(4), pages 393-416, December.
    2. Kathrin Leppek & Gun Woo Byeon & Wipapat Kladwang & Hannah K. Wayment-Steele & Craig H. Kerr & Adele F. Xu & Do Soon Kim & Ved V. Topkar & Christian Choe & Daphna Rothschild & Gerald C. Tiu & Roger We, 2022. "Combinatorial optimization of mRNA structure, stability, and translation for RNA-based therapeutics," Nature Communications, Nature, vol. 13(1), pages 1-22, December.
    3. Hess, Wolfgang & Persson, Maria & Rubenbauer, Stephanie & Gertheiss, Jan, 2013. "Using Lasso-Type Penalties to Model Time-Varying Covariate Effects in Panel Data Regressions - A Novel Approach Illustrated by the 'Death of Distance' in International Trade," Working Papers 2013:5, Lund University, Department of Economics.
    4. Gerhard Tutz & Jan Gertheiss, 2014. "Rating Scales as Predictors—The Old Question of Scale Level and Some Answers," Psychometrika, Springer;The Psychometric Society, vol. 79(3), pages 357-376, July.
    5. Ann Marsden & Hugh Sibly, 2017. "Third-degree price discrimination in a short-stay accommodation industry," Applied Economics, Taylor & Francis Journals, vol. 49(51), pages 5166-5182, November.
    6. Gerhard Tutz & Micha Schneider & Maria Iannario & Domenico Piccolo, 2017. "Mixture models for ordinal responses to account for uncertainty of choice," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 11(2), pages 281-305, June.
    7. Reher, Leonie & Runst, Petrik & Thomä, Jörg & Bizer, Kilian, 2024. "Measuring non-R&D drivers of innovation: The case of SMEs in lagging regions," ifh Working Papers 45/2024, Volkswirtschaftliches Institut für Mittelstand und Handwerk an der Universität Göttingen (ifh).
    8. Faisal Maqbool Zahid & Gerhard Tutz, 2013. "Proportional Odds Models with High‐Dimensional Data Structure," International Statistical Review, International Statistical Institute, vol. 81(3), pages 388-406, December.
    9. Faisal M. Zahid & Shahla Ramzan, 2012. "Ordinal ridge regression with categorical predictors," Journal of Applied Statistics, Taylor & Francis Journals, vol. 39(1), pages 161-171, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Alhamzawi, Rahim, 2016. "Bayesian model selection in ordinal quantile regression," Computational Statistics & Data Analysis, Elsevier, vol. 103(C), pages 68-78.
    2. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    3. Claudia García-García & Catalina B. García-García & Román Salmerón, 2021. "Confronting collinearity in environmental regression models: evidence from world data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(3), pages 895-926, September.
    4. Mkhadri, Abdallah & Ouhourane, Mohamed, 2013. "An extended variable inclusion and shrinkage algorithm for correlated variables," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 631-644.
    5. Baruch, Shmuel & Panayides, Marios & Venkataraman, Kumar, 2017. "Informed trading and price discovery before corporate events," Journal of Financial Economics, Elsevier, vol. 125(3), pages 561-588.
    6. Yize Zhao & Matthias Chung & Brent A. Johnson & Carlos S. Moreno & Qi Long, 2016. "Hierarchical Feature Selection Incorporating Known and Novel Biological Information: Identifying Genomic Features Related to Prostate Cancer Recurrence," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1427-1439, October.
    7. Francis X. Diebold & Kamil Yilmaz, 2016. "Trans-Atlantic Equity Volatility Connectedness: U.S. and European Financial Institutions, 2004–2014," Journal of Financial Econometrics, Oxford University Press, vol. 14(1), pages 81-127.
    8. Jian Guo & Elizaveta Levina & George Michailidis & Ji Zhu, 2010. "Pairwise Variable Selection for High-Dimensional Model-Based Clustering," Biometrics, The International Biometric Society, vol. 66(3), pages 793-804, September.
    9. Franck Rapaport & Christina Leslie, 2010. "Determining Frequent Patterns of Copy Number Alterations in Cancer," PLOS ONE, Public Library of Science, vol. 5(8), pages 1-10, August.
    10. R. Salmerón & J. García & C. B. García & M. M. López Martín, 2017. "A note about the corrected VIF," Statistical Papers, Springer, vol. 58(3), pages 929-945, September.
    11. Lu Tang & Ling Zhou & Peter X. K. Song, 2019. "Fusion learning algorithm to combine partially heterogeneous Cox models," Computational Statistics, Springer, vol. 34(1), pages 395-414, March.
    12. Young‐Geun Choi & Lawrence P. Hanrahan & Derek Norton & Ying‐Qi Zhao, 2022. "Simultaneous spatial smoothing and outlier detection using penalized regression, with application to childhood obesity surveillance from electronic health records," Biometrics, The International Biometric Society, vol. 78(1), pages 324-336, March.
    13. Mamatzakis, Emmanuel C. & Tsionas, Mike G., 2021. "Making inference of British household's happiness efficiency: A Bayesian latent model," European Journal of Operational Research, Elsevier, vol. 294(1), pages 312-326.
    14. Molly C. Klanderman & Kathryn B. Newhart & Tzahi Y. Cath & Amanda S. Hering, 2020. "Fault isolation for a complex decentralized waste water treatment facility," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 69(4), pages 931-951, August.
    15. Wang, Li-Yu & Park, Cheolwoo & Yeon, Kyupil & Choi, Hosik, 2017. "Tracking concept drift using a constrained penalized regression combiner," Computational Statistics & Data Analysis, Elsevier, vol. 108(C), pages 52-69.
    16. Tomáš Plíhal, 2021. "Scheduled macroeconomic news announcements and Forex volatility forecasting," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 40(8), pages 1379-1397, December.
    17. Ryo Kato & Takahiro Hoshino, 2020. "Semiparametric Bayesian multiple imputation for regression models with missing mixed continuous–discrete covariates," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 72(3), pages 803-825, June.
    18. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    19. Bambio, Yiriyibin & Bouayad Agha, Salima, 2018. "Land tenure security and investment: Does strength of land right really matter in rural Burkina Faso?," World Development, Elsevier, vol. 111(C), pages 130-147.
    20. Fernández, D. & Arnold, R. & Pledger, S., 2016. "Mixture-based clustering for the ordered stereotype model," Computational Statistics & Data Analysis, Elsevier, vol. 93(C), pages 46-75.
