IDEAS home Printed from https://ideas.repec.org/a/bla/stanee/v76y2022i3p347-368.html
   My bibliography  Save this article

Average ordinary least squares‐centered penalized regression: A more efficient way to address multicollinearity than ridge regression

Author

Listed:
  • Wei Wang
  • Linjian Li
  • Sheng Li
  • Fei Yin
  • Fang Liao
  • Tao Zhang
  • Xiaosong Li
  • Xiong Xiao
  • Yue Ma

Abstract

We developed a novel method to address multicollinearity in linear models called average ordinary least squares (OLS)‐centered penalized regression (AOPR). AOPR penalizes the cost function to shrink the estimators toward the weighted‐average OLS estimator. The commonly used ridge regression (RR) shrinks the estimators toward zero, that is, employs penalization prior β∼N(0,1/k) in the Bayesian view, which contradicts the common real prior β≠0. Therefore, RR selects small penalization coefficients to relieve such a contradiction and thus makes the penalizations inadequate. Mathematical derivations remind us that AOPR could increase the performance of RR and OLS regression. A simulation study shows that AOPR obtains more accurate estimators than OLS regression in most situations and more accurate estimators than RR when the signs of the true βs are identical and is slightly less accurate than RR when the signs of the true βs are different. Additionally, a case study shows that AOPR obtains more stable estimators and stronger statistical power and predictive ability than RR and OLS regression. Through these results, we recommend using AOPR to address multicollinearity more efficiently than RR and OLS regression, especially when the true βs have identical signs.

Suggested Citation

  • Wei Wang & Linjian Li & Sheng Li & Fei Yin & Fang Liao & Tao Zhang & Xiaosong Li & Xiong Xiao & Yue Ma, 2022. "Average ordinary least squares‐centered penalized regression: A more efficient way to address multicollinearity than ridge regression," Statistica Neerlandica, Netherlands Society for Statistics and Operations Research, vol. 76(3), pages 347-368, August.
  • Handle: RePEc:bla:stanee:v:76:y:2022:i:3:p:347-368
    DOI: 10.1111/stan.12263
    as

    Download full text from publisher

    File URL: https://doi.org/10.1111/stan.12263
    Download Restriction: no

    File URL: https://libkey.io/10.1111/stan.12263?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item
    ---><---

    References listed on IDEAS

    as
    1. Robert Tibshirani & Michael Saunders & Saharon Rosset & Ji Zhu & Keith Knight, 2005. "Sparsity and smoothness via the fused lasso," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(1), pages 91-108, February.
    2. Reid Priedhorsky & Ashlynn R Daughton & Martha Barnard & Fiona O’Connell & Dave Osthus, 2019. "Estimating influenza incidence using search query deceptiveness and generalized ridge regression," PLOS Computational Biology, Public Library of Science, vol. 15(10), pages 1-23, October.
    3. Xiaohong, Dai & Huajiang, Chen & Bagherzadeh, Seyed Amin & Shayan, Masoud & Akbari, Mohammad, 2020. "Statistical estimation the thermal conductivity of MWCNTs-SiO2/Water-EG nanofluid using the ridge regression method," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 537(C).
    4. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    5. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.
    Full references (including those not matched with items on IDEAS)

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    2. Mkhadri, Abdallah & Ouhourane, Mohamed, 2013. "An extended variable inclusion and shrinkage algorithm for correlated variables," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 631-644.
    3. Yize Zhao & Matthias Chung & Brent A. Johnson & Carlos S. Moreno & Qi Long, 2016. "Hierarchical Feature Selection Incorporating Known and Novel Biological Information: Identifying Genomic Features Related to Prostate Cancer Recurrence," Journal of the American Statistical Association, Taylor & Francis Journals, vol. 111(516), pages 1427-1439, October.
    4. Jie Jian & Peijun Sang & Mu Zhu, 2024. "Two Gaussian Regularization Methods for Time-Varying Networks," Journal of Agricultural, Biological and Environmental Statistics, Springer;The International Biometric Society;American Statistical Association, vol. 29(4), pages 853-873, December.
    5. Tomáš Plíhal, 2021. "Scheduled macroeconomic news announcements and Forex volatility forecasting," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 40(8), pages 1379-1397, December.
    6. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    7. Murat Genç & M. Revan Özkale, 2021. "Usage of the GO estimator in high dimensional linear models," Computational Statistics, Springer, vol. 36(1), pages 217-239, March.
    8. Victor Chernozhukov & Christian Hansen & Yuan Liao, 2015. "A lava attack on the recovery of sums of dense and sparse signals," CeMMAP working papers CWP56/15, Centre for Microdata Methods and Practice, Institute for Fiscal Studies.
    9. Wang, Shixuan & Syntetos, Aris A. & Liu, Ying & Di Cairano-Gilfedder, Carla & Naim, Mohamed M., 2023. "Improving automotive garage operations by categorical forecasts using a large number of variables," European Journal of Operational Research, Elsevier, vol. 306(2), pages 893-908.
    10. Takumi Saegusa & Tianzhou Ma & Gang Li & Ying Qing Chen & Mei-Ling Ting Lee, 2020. "Variable Selection in Threshold Regression Model with Applications to HIV Drug Adherence Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 12(3), pages 376-398, December.
    11. Lee Kyu Ha & Chakraborty Sounak & Sun Jianguo, 2011. "Bayesian Variable Selection in Semiparametric Proportional Hazards Model for High Dimensional Survival Data," The International Journal of Biostatistics, De Gruyter, vol. 7(1), pages 1-32, April.
    12. Devriendt, Sander & Antonio, Katrien & Reynkens, Tom & Verbelen, Roel, 2021. "Sparse regression with Multi-type Regularized Feature modeling," Insurance: Mathematics and Economics, Elsevier, vol. 96(C), pages 248-261.
    13. Korobilis, Dimitris, 2013. "Hierarchical shrinkage priors for dynamic regressions with many predictors," International Journal of Forecasting, Elsevier, vol. 29(1), pages 43-59.
    14. Florian Ziel, 2015. "Iteratively reweighted adaptive lasso for conditional heteroscedastic time series with applications to AR-ARCH type processes," Papers 1502.06557, arXiv.org, revised Dec 2015.
    15. Feng Hong & Lu Tian & Viswanath Devanarayan, 2023. "Improving the Robustness of Variable Selection and Predictive Performance of Regularized Generalized Linear Models and Cox Proportional Hazard Models," Mathematics, MDPI, vol. 11(3), pages 1-13, January.
    16. Wei Tang & Steven L Bressler & Chad M Sylvester & Gordon L Shulman & Maurizio Corbetta, 2012. "Measuring Granger Causality between Cortical Regions from Voxelwise fMRI BOLD Signals with LASSO," PLOS Computational Biology, Public Library of Science, vol. 8(5), pages 1-14, May.
    17. Massimiliano Caporin & Francesco Poli, 2017. "Building News Measures from Textual Data and an Application to Volatility Forecasting," Econometrics, MDPI, vol. 5(3), pages 1-46, August.
    18. Pei Wang & Shunjie Chen & Sijia Yang, 2022. "Recent Advances on Penalized Regression Models for Biological Data," Mathematics, MDPI, vol. 10(19), pages 1-24, October.
    19. Justin B. Post & Howard D. Bondell, 2013. "Factor Selection and Structural Identification in the Interaction ANOVA Model," Biometrics, The International Biometric Society, vol. 69(1), pages 70-79, March.
    20. Jiang, Liewen & Bondell, Howard D. & Wang, Huixia Judy, 2014. "Interquantile shrinkage and variable selection in quantile regression," Computational Statistics & Data Analysis, Elsevier, vol. 69(C), pages 208-219.

    More about this item

    Statistics

    Access and download statistics

    Corrections

    All material on this site has been provided by the respective publishers and authors. You can help correct errors and omissions. When requesting a correction, please mention this item's handle: RePEc:bla:stanee:v:76:y:2022:i:3:p:347-368. See general information about how to correct material in RePEc.

    If you have authored this item and are not yet registered with RePEc, we encourage you to do it here. This allows to link your profile to this item. It also allows you to accept potential citations to this item that we are uncertain about.

    If CitEc recognized a bibliographic reference but did not link an item in RePEc to it, you can help with this form .

    If you know of missing items citing this one, you can help us creating those links by adding the relevant references in the same way as above, for each refering item. If you are a registered author of this item, you may also want to check the "citations" tab in your RePEc Author Service profile, as there may be some citations waiting for confirmation.

    For technical questions regarding this item, or to correct its authors, title, abstract, bibliographic or download information, contact: Wiley Content Delivery (email available below). General contact details of provider: http://www.blackwellpublishing.com/journal.asp?ref=0039-0402 .

    Please note that corrections may take a couple of weeks to filter through the various RePEc services.

    IDEAS is a RePEc service. RePEc uses bibliographic data supplied by the respective publishers.