IDEAS home Printed from https://ideas.repec.org/a/bla/istatr/v76y2008i2p285-297.html

On the Practice of Rescaling Covariates

Author

Listed:
  • Sylvain Sardy

Abstract

Whether doing parametric or nonparametric regression with shrinkage, thresholding, penalized likelihood, or Bayesian posterior estimators (e.g., ridge regression, lasso, principal component regression, waveshrink or Markov random field), it is common practice to rescale covariates by dividing by their respective standard errors ρ. The stated goal of this operation is to provide unitless covariates to compare like with like, especially when penalized likelihood or prior distributions are used. We contend that this vision is too simplistic. Instead, we propose to take into account a more essential component of the structure of the regression matrix by rescaling the covariates based on the diagonal elements of the covariance matrix Σ of the maximum‐likelihood estimator. We illustrate the differences between the standard ρ‐ and proposed Σ‐rescalings with various estimators and data sets.
Key words: Markov random field, ℓη prior distribution, lasso, wavelets, principal component regression, ridge regression.
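
The contrast between the two rescalings described in the abstract can be sketched numerically. This is only an illustration under stated assumptions, not the paper's procedure: it assumes a Gaussian linear model, for which the covariance of the maximum-likelihood estimator is proportional to (XᵀX)⁻¹, and it assumes that "rescaling based on the diagonal elements of Σ" means multiplying column j by the square root of Σ_jj. The function names `rho_rescale` and `sigma_rescale` and the simulated data are hypothetical.

```python
import numpy as np

def rho_rescale(X):
    """Center each column and divide it by its sample standard deviation."""
    Xc = X - X.mean(axis=0)
    return Xc / Xc.std(axis=0, ddof=1)

def sigma_rescale(X):
    """Assumed reading of the Sigma-rescaling: center each column, then
    multiply column j by sqrt(Sigma_jj), where Sigma = (X^T X)^{-1} is the
    covariance of the least-squares estimator up to the noise variance."""
    Xc = X - X.mean(axis=0)
    sigma_diag = np.diag(np.linalg.inv(Xc.T @ Xc))
    return Xc * np.sqrt(sigma_diag)

# Simulated design with two nearly collinear columns and one on a larger scale.
rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # strongly correlated with x1
x3 = 5.0 * rng.normal(size=n)        # independent, different units
X = np.column_stack([x1, x2, x3])

Z_rho = rho_rescale(X)
Z_sigma = sigma_rescale(X)

# Diagonal of (Z^T Z)^{-1}: variance of each least-squares coefficient
# (up to the noise variance) after rescaling.
var_rho = np.diag(np.linalg.inv(Z_rho.T @ Z_rho))
var_sigma = np.diag(np.linalg.inv(Z_sigma.T @ Z_sigma))

print("coefficient variances after rho-rescaling:  ", var_rho)
print("coefficient variances after Sigma-rescaling:", var_sigma)
```

Under these assumptions, the ρ-rescaling equalizes the spread of the columns but leaves the coefficients of the collinear columns with much larger variances than the third, whereas the assumed Σ-rescaling equalizes the coefficient variances themselves. When the centered columns are orthogonal, the two rescalings differ only by a constant factor.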

Suggested Citation

  • Sylvain Sardy, 2008. "On the Practice of Rescaling Covariates," International Statistical Review, International Statistical Institute, vol. 76(2), pages 285-297, August.
  • Handle: RePEc:bla:istatr:v:76:y:2008:i:2:p:285-297
    DOI: 10.1111/j.1751-5823.2008.00050.x

    Download full text from publisher

    File URL: https://doi.org/10.1111/j.1751-5823.2008.00050.x
    Download Restriction: no


    References listed on IDEAS

    1. Sardy, Sylvain & Tseng, Paul, 2004. "On the Statistical Analysis of Smoothing by Maximizing Dirty Markov Random Field Posterior Distributions," Journal of the American Statistical Association, American Statistical Association, vol. 99, pages 191-204, January.
    2. Donald R. Jensen & Donald E. Ramirez, 2008. "Anomalies in the Foundations of Ridge Regression," International Statistical Review, International Statistical Institute, vol. 76(1), pages 89-105, April.

    Citations

    Cited by:

    1. Prasenjit Kapat & Prem K. Goel, 2010. "Anomalies in the Foundations of Ridge Regression: Some Clarifications," International Statistical Review, International Statistical Institute, vol. 78(2), pages 209-215, August.
    2. Sardy, Sylvain & Diaz-Rodriguez, Jairo & Giacobino, Caroline, 2022. "Thresholding tests based on affine LASSO to achieve non-asymptotic nominal level and high power under sparse and dense alternatives in high dimension," Computational Statistics & Data Analysis, Elsevier, vol. 173(C).
    3. Sylvain Sardy, 2009. "Adaptive Posterior Mode Estimation of a Sparse Sequence for Model Selection," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 36(4), pages 577-601, December.
    4. José García & Román Salmerón & Catalina García & María del Mar López Martín, 2016. "Standardization of Variables and Collinearity Diagnostic in Ridge Regression," International Statistical Review, International Statistical Institute, vol. 84(2), pages 245-266, August.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Claudia García-García & Catalina B. García-García & Román Salmerón, 2021. "Confronting collinearity in environmental regression models: evidence from world data," Statistical Methods & Applications, Springer;Società Italiana di Statistica, vol. 30(3), pages 895-926, September.
    2. R. Salmerón & J. García & C. B. García & M. M. López Martín, 2017. "A note about the corrected VIF," Statistical Papers, Springer, vol. 58(3), pages 929-945, September.
    3. Chavez-Demoulin, V. & Embrechts, P. & Sardy, S., 2014. "Extreme-quantile tracking for financial time series," Journal of Econometrics, Elsevier, vol. 181(1), pages 44-52.
    4. Rutger Jan Lange, 2020. "Bellman filtering for state-space models," Tinbergen Institute Discussion Papers 20-052/III, Tinbergen Institute, revised 19 May 2021.
    5. Sylvain Sardy & Paul Tseng, 2010. "Density Estimation by Total Variation Penalized Likelihood Driven by the Sparsity ℓ1 Information Criterion," Scandinavian Journal of Statistics, Danish Society for Theoretical Statistics;Finnish Statistical Society;Norwegian Statistical Association;Swedish Statistical Association, vol. 37(2), pages 321-337, June.
    6. Prasenjit Kapat & Prem K. Goel, 2010. "Anomalies in the Foundations of Ridge Regression: Some Clarifications," International Statistical Review, International Statistical Institute, vol. 78(2), pages 209-215, August.
    7. Neto, David, 2016. "Extracting volatility signal using maximum a posteriori estimation," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 461(C), pages 788-794.
    8. José García & Román Salmerón & Catalina García & María del Mar López Martín, 2016. "Standardization of Variables and Collinearity Diagnostic in Ridge Regression," International Statistical Review, International Statistical Institute, vol. 84(2), pages 245-266, August.
    9. Santiago Velilla, 2018. "A Note on Collinearity Diagnostics and Centering," The American Statistician, Taylor & Francis Journals, vol. 72(2), pages 140-146, April.
    10. Lamprinakou, Stamatina & Barahona, Mauricio & Flaxman, Seth & Filippi, Sarah & Gandy, Axel & McCoy, Emma J., 2023. "BART-based inference for Poisson processes," Computational Statistics & Data Analysis, Elsevier, vol. 180(C).
    11. David Neto & Sylvain Sardy, 2012. "Moments structure of ℓ 1 -stochastic volatility models," Quality & Quantity: International Journal of Methodology, Springer, vol. 46(6), pages 1947-1952, October.
    12. Jan Gertheiss & Gerhard Tutz, 2009. "Penalized Regression with Ordinal Predictors," International Statistical Review, International Statistical Institute, vol. 77(3), pages 345-365, December.
