
Sparse regression with Multi-type Regularized Feature modeling

Author

Listed:
  • Devriendt, Sander
  • Antonio, Katrien
  • Reynkens, Tom
  • Verbelen, Roel

Abstract

Within the statistical and machine learning literature, regularization techniques are often used to construct sparse (predictive) models. Most regularization strategies only work for data where all predictors are treated identically, such as Lasso regression for (continuous) predictors treated as linear effects. However, many predictive problems involve different types of predictors and require a tailored regularization term. We propose a multi-type Lasso penalty that acts on the objective function as a sum of subpenalties, one for each type of predictor. As such, we allow for predictor selection and level fusion within a predictor in a data-driven way, simultaneously with the parameter estimation process. We develop a new estimation strategy for convex predictive models with this multi-type penalty. Using the theory of proximal operators, our estimation procedure is computationally efficient, partitioning the overall optimization problem into easier-to-solve subproblems, specific for each predictor type and its associated penalty. Earlier research applies approximations to non-differentiable penalties to solve the optimization problem. The proposed SMuRF algorithm removes the need for approximations and achieves higher accuracy and computational efficiency. This is demonstrated with an extensive simulation study and the analysis of a case study on insurance pricing analytics.
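To make the idea of partitioning by proximal operators more concrete, the following is a minimal sketch and not the authors' SMuRF implementation. It assumes a least-squares loss (the paper covers convex predictive models more generally) and only two penalty types whose proximal operators have well-known closed forms: the Lasso and the group Lasso. The function names `prox_lasso`, `prox_group_lasso`, and `proximal_gradient`, and the `blocks` argument, are hypothetical and chosen for illustration only. Because the penalty is a sum of subpenalties over disjoint coefficient blocks, the proximal step can be applied block by block.

```python
import math


def prox_lasso(v, t):
    """Soft-thresholding: proximal operator of t * ||v||_1 (elementwise)."""
    out = []
    for x in v:
        if x > t:
            out.append(x - t)
        elif x < -t:
            out.append(x + t)
        else:
            out.append(0.0)
    return out


def prox_group_lasso(v, t):
    """Block soft-thresholding: proximal operator of t * ||v||_2."""
    norm = math.sqrt(sum(x * x for x in v))
    if norm <= t:
        return [0.0] * len(v)
    scale = 1.0 - t / norm
    return [scale * x for x in v]


def proximal_gradient(X, y, blocks, lam, step, n_iter=500):
    """Minimize (1/(2n)) * ||y - X b||^2 + lam * sum_j P_j(b_j).

    blocks is a list of (start, stop, prox) tuples (an assumption of this
    sketch): each tuple assigns one proximal operator to the coefficient
    slice b[start:stop]. Coefficients outside every block are unpenalized.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    for _ in range(n_iter):
        # Gradient of the smooth least-squares part.
        resid = [sum(X[i][k] * beta[k] for k in range(p)) - y[i] for i in range(n)]
        grad = [sum(X[i][k] * resid[i] for i in range(n)) / n for k in range(p)]
        # Gradient step, then one proximal step per predictor block.
        z = [beta[k] - step * grad[k] for k in range(p)]
        for start, stop, prox in blocks:
            z[start:stop] = prox(z[start:stop], step * lam)
        beta = z
    return beta


# Toy usage: two predictors, each in its own Lasso block.
X_demo = [[1.0, 0.0], [0.0, 1.0]]
y_demo = [3.0, 0.2]
blocks_demo = [(0, 1, prox_lasso), (1, 2, prox_lasso)]
beta_demo = proximal_gradient(X_demo, y_demo, blocks_demo, lam=0.5, step=1.0)
```

In this sketch the step size must not exceed the reciprocal of the largest eigenvalue of X'X/n for the iteration to converge. Penalties such as the fused Lasso have no elementwise closed form and would need a dedicated subproblem solver, which is outside the scope of this illustration.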

Suggested Citation

  • Devriendt, Sander & Antonio, Katrien & Reynkens, Tom & Verbelen, Roel, 2021. "Sparse regression with Multi-type Regularized Feature modeling," Insurance: Mathematics and Economics, Elsevier, vol. 96(C), pages 248-261.
  • Handle: RePEc:eee:insuma:v:96:y:2021:i:c:p:248-261
    DOI: 10.1016/j.insmatheco.2020.11.010

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0167668720301608
    Download Restriction: Full text for ScienceDirect subscribers only

    File URL: https://libkey.io/10.1016/j.insmatheco.2020.11.010?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item

    As the access to this document is restricted, you may want to search for a different version of it.

    References listed on IDEAS

    1. Wang, Hansheng & Leng, Chenlei, 2008. "A note on adaptive group lasso," Computational Statistics & Data Analysis, Elsevier, vol. 52(12), pages 5277-5286, August.
    2. Klein, Nadja & Denuit, Michel & Lang, Stefan & Kneib, Thomas, 2014. "Nonlife ratemaking and risk management with Bayesian generalized additive models for location, scale, and shape," Insurance: Mathematics and Economics, Elsevier, vol. 55(C), pages 225-249.
    3. Zou, Hui, 2006. "The Adaptive Lasso and Its Oracle Properties," Journal of the American Statistical Association, American Statistical Association, vol. 101, pages 1418-1429, December.
    4. Robert Tibshirani & Michael Saunders & Saharon Rosset & Ji Zhu & Keith Knight, 2005. "Sparsity and smoothness via the fused lasso," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(1), pages 91-108, February.
    5. Edward W. (Jed) Frees & Glenn Meyers & A. David Cummings, 2014. "Insurance Ratemaking and a Gini Index," Journal of Risk & Insurance, The American Risk and Insurance Association, vol. 81(2), pages 335-366, June.
    6. Denuit, Michel & Lang, Stefan, 2004. "Non-life rate-making with Bayesian GAMs," Insurance: Mathematics and Economics, Elsevier, vol. 35(3), pages 627-647, December.
    7. Howard D. Bondell & Brian J. Reich, 2009. "Simultaneous Factor Selection and Collapsing Levels in ANOVA," Biometrics, The International Biometric Society, vol. 65(1), pages 169-177, March.
    8. Klein, Nadja & Denuit, Michel & Lang, Stefan & Kneib, Thomas, 2014. "Nonlife ratemaking and risk management with Bayesian generalized additive models for location, scale, and shape," LIDAM Reprints ISBA 2014006, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
    9. Hui Zou & Trevor Hastie, 2005. "Addendum: Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(5), pages 768-768, November.
    10. Hui Zou & Trevor Hastie, 2005. "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society Series B, Royal Statistical Society, vol. 67(2), pages 301-320, April.
    11. Hans Nyquist, 1991. "Restricted Estimation of Generalized Linear Models," Journal of the Royal Statistical Society Series C, Royal Statistical Society, vol. 40(1), pages 133-141, March.

    Citations


    Cited by:

    1. Evangelos Liaras & Michail Nerantzidis & Antonios Alexandridis, 2024. "Machine learning in accounting and finance research: a literature review," Review of Quantitative Finance and Accounting, Springer, vol. 63(4), pages 1431-1471, November.
    2. Emer Owens & Barry Sheehan & Martin Mullins & Martin Cunneen & Juliane Ressel & German Castignani, 2022. "Explainable Artificial Intelligence (XAI) in Insurance," Risks, MDPI, vol. 10(12), pages 1-50, December.
    3. Yang Qiao & Chou-Wen Wang & Wenjun Zhu, 2024. "Machine learning in long-term mortality forecasting," The Geneva Papers on Risk and Insurance - Issues and Practice, Palgrave Macmillan;The Geneva Association, vol. 49(2), pages 340-362, April.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Justin B. Post & Howard D. Bondell, 2013. "Factor Selection and Structural Identification in the Interaction ANOVA Model," Biometrics, The International Biometric Society, vol. 69(1), pages 70-79, March.
    2. Tutz, Gerhard & Pößnecker, Wolfgang & Uhlmann, Lorenz, 2015. "Variable selection in general multinomial logit models," Computational Statistics & Data Analysis, Elsevier, vol. 82(C), pages 207-222.
    3. Diego Vidaurre & Concha Bielza & Pedro Larrañaga, 2013. "A Survey of L1 Regression," International Statistical Review, International Statistical Institute, vol. 81(3), pages 361-387, December.
    4. Mkhadri, Abdallah & Ouhourane, Mohamed, 2013. "An extended variable inclusion and shrinkage algorithm for correlated variables," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 631-644.
    5. Tomáš Plíhal, 2021. "Scheduled macroeconomic news announcements and Forex volatility forecasting," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 40(8), pages 1379-1397, December.
    6. Loann David Denis Desboulets, 2018. "A Review on Variable Selection in Regression Analysis," Econometrics, MDPI, vol. 6(4), pages 1-27, November.
    7. Takumi Saegusa & Tianzhou Ma & Gang Li & Ying Qing Chen & Mei-Ling Ting Lee, 2020. "Variable Selection in Threshold Regression Model with Applications to HIV Drug Adherence Data," Statistics in Biosciences, Springer;International Chinese Statistical Association, vol. 12(3), pages 376-398, December.
    8. Pei Wang & Shunjie Chen & Sijia Yang, 2022. "Recent Advances on Penalized Regression Models for Biological Data," Mathematics, MDPI, vol. 10(19), pages 1-24, October.
    9. Ricardo P. Masini & Marcelo C. Medeiros & Eduardo F. Mendes, 2023. "Machine learning advances for time series forecasting," Journal of Economic Surveys, Wiley Blackwell, vol. 37(1), pages 76-111, February.
    10. Korobilis, Dimitris, 2013. "Hierarchical shrinkage priors for dynamic regressions with many predictors," International Journal of Forecasting, Elsevier, vol. 29(1), pages 43-59.
    11. Kenneth Lange & Eric C. Chi & Hua Zhou, 2014. "A Brief Survey of Modern Optimization for Statisticians," International Statistical Review, International Statistical Institute, vol. 82(1), pages 46-70, April.
    12. Xiaoping Liu & Xiao-Bai Li & Sumit Sarkar, 2023. "Cost-Restricted Feature Selection for Data Acquisition," Management Science, INFORMS, vol. 69(7), pages 3976-3992, July.
    13. Xiaofei Wu & Rongmei Liang & Hu Yang, 2022. "Penalized and constrained LAD estimation in fixed and high dimension," Statistical Papers, Springer, vol. 63(1), pages 53-95, February.
    14. Mogliani, Matteo & Simoni, Anna, 2021. "Bayesian MIDAS penalized regressions: Estimation, selection, and prediction," Journal of Econometrics, Elsevier, vol. 222(1), pages 833-860.
    15. Kwon, Sunghoon & Oh, Seungyoung & Lee, Youngjo, 2016. "The use of random-effect models for high-dimensional variable selection problems," Computational Statistics & Data Analysis, Elsevier, vol. 103(C), pages 401-412.
    16. Ismail Shah & Hina Naz & Sajid Ali & Amani Almohaimeed & Showkat Ahmad Lone, 2023. "A New Quantile-Based Approach for LASSO Estimation," Mathematics, MDPI, vol. 11(6), pages 1-13, March.
    17. Deprez, Laurens & Antonio, Katrien & Boute, Robert, 2023. "Empirical risk assessment of maintenance costs under full-service contracts," European Journal of Operational Research, Elsevier, vol. 304(2), pages 476-493.
    18. Howard D. Bondell & Brian J. Reich, 2009. "Simultaneous Factor Selection and Collapsing Levels in ANOVA," Biometrics, The International Biometric Society, vol. 65(1), pages 169-177, March.
    19. Yang, Hu & Yi, Danhui, 2015. "Studies of the adaptive network-constrained linear regression and its application," Computational Statistics & Data Analysis, Elsevier, vol. 92(C), pages 40-52.
    20. Margherita Giuzio & Sandra Paterlini, 2019. "Un-diversifying during crises: Is it a good idea?," Computational Management Science, Springer, vol. 16(3), pages 401-432, July.
