Printed from https://ideas.repec.org/a/gam/jrisks/v11y2023i9p163-d1238092.html

Modelling Motor Insurance Claim Frequency and Severity Using Gradient Boosting

Author

Listed:
  • Carina Clemente

    (NOVA IMS—Information Management School, Universidade Nova de Lisboa, 1070-312 Lisbon, Portugal)

  • Gracinda R. Guerreiro

    (FCT NOVA, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal
    CMA-FCT-UNL, Universidade Nova de Lisboa, 2829-516 Caparica, Portugal)

  • Jorge M. Bravo

    (NOVA IMS—Information Management School, Universidade Nova de Lisboa, MagIC, 1070-312 Lisbon, Portugal
    Department of Economics, University Paris-Dauphine PSL, 75016 Paris, France
    CEFAGE-UE, 7000-809 Évora, Portugal
    BRU-ISCTE-IUL, 1649-026 Lisbon, Portugal)

Abstract

Modelling claim frequency and claim severity are topics of great interest in property-casualty insurance, supporting actuarial decisions in underwriting, ratemaking, and reserving. Standard Generalized Linear Model (GLM) frequency–severity approaches assume a linear relationship between a function of the response variable and the predictors, assume independence between claim frequency and claim severity, and assign full credibility to the data. To overcome some of these restrictions, this paper investigates the predictive performance of Gradient Boosting with decision trees as base learners for modelling the claim frequency and claim severity distributions of a large auto insurance dataset, and compares it with that of a standard GLM. The out-of-sample performance measures show that the Gradient Boosting Model (GBM) outperforms the standard GLM in the Poisson claim frequency model. By contrast, in the claim severity model, the classical GLM outperformed the GBM. The findings suggest that gradient boosting models can capture the non-linear relationship between the response variable and the feature variables, as well as their complex interactions, and are thus a valuable tool for the insurer in feature engineering and in developing a data-driven approach to risk management and insurance.
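The frequency comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not use their data: the portfolio is synthetic, the single "age band" feature and its U-shaped claim rate are assumptions made for illustration, and every function name (simulate, fit_glm, fit_gbm, and so on) is hypothetical. The boosting routine is a simplified one-feature variant that uses decision stumps with Poisson leaf updates, standing in for a full GBM.

```python
import math
import random

def poisson_draw(rng, lam):
    """One Poisson draw via Knuth's method; adequate for small rates."""
    limit, k, p = math.exp(-lam), 0, rng.random()
    while p > limit:
        k += 1
        p *= rng.random()
    return k

def simulate(rng, n):
    """Synthetic portfolio (assumption): age band 0..9, U-shaped claim rate."""
    x = [rng.randrange(10) for _ in range(n)]
    y = [poisson_draw(rng, 0.30 if (xi <= 1 or xi >= 8) else 0.05) for xi in x]
    return x, y

def mean_poisson_deviance(y, mu):
    """Mean Poisson deviance; the y*log(y/mu) term is zero when y == 0."""
    total = 0.0
    for yi, mi in zip(y, mu):
        total += 2.0 * ((yi * math.log(yi / mi) if yi > 0 else 0.0) - (yi - mi))
    return total / len(y)

def fit_glm(x, y, iters=25):
    """Poisson GLM with log link, log(mu) = b0 + b1*x, via Newton-Raphson."""
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iters):
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system H * delta = gradient.
        b0, b1 = (b0 + (h11 * g0 - h01 * g1) / det,
                  b1 + (h00 * g1 - h01 * g0) / det)
    return lambda xi: math.exp(b0 + b1 * xi)

def fit_gbm(x, y, rounds=50, lr=0.3):
    """Simplified boosting: stumps on the Poisson gradient, log-scale updates."""
    f0 = math.log(sum(y) / len(y))
    score = [f0] * len(x)  # current log-mean for each policy
    stumps = []
    for _ in range(rounds):
        mu = [math.exp(s) for s in score]
        grad = [yi - mi for yi, mi in zip(y, mu)]  # negative gradient w.r.t. score
        best = None
        for t in range(9):  # candidate splits of the form x <= t
            left = [g for xi, g in zip(x, grad) if xi <= t]
            right = [g for xi, g in zip(x, grad) if xi > t]
            if not left or not right:
                continue
            # Maximising this gain is equivalent to minimising squared error.
            gain = sum(left) ** 2 / len(left) + sum(right) ** 2 / len(right)
            if best is None or gain > best[0]:
                best = (gain, t)
        t = best[1]
        sum_y, sum_mu = [0.0, 0.0], [0.0, 0.0]
        for xi, yi, mi in zip(x, y, mu):
            side = int(xi > t)
            sum_y[side] += yi
            sum_mu[side] += mi
        # Leaf value log(sum y / sum mu), shrunk by lr; the floor avoids log(0).
        leaf = [lr * math.log(max(sy, 0.5) / sm) for sy, sm in zip(sum_y, sum_mu)]
        stumps.append((t, leaf))
        score = [s + leaf[int(xi > t)] for s, xi in zip(score, x)]
    return lambda xi: math.exp(f0 + sum(leaf[int(xi > t)] for t, leaf in stumps))

rng = random.Random(0)
x_train, y_train = simulate(rng, 4000)
x_test, y_test = simulate(rng, 4000)
glm = fit_glm(x_train, y_train)
gbm = fit_gbm(x_train, y_train)
dev_glm = mean_poisson_deviance(y_test, [glm(v) for v in x_test])
dev_gbm = mean_poisson_deviance(y_test, [gbm(v) for v in x_test])
print(f"out-of-sample mean Poisson deviance  GLM: {dev_glm:.4f}  GBM: {dev_gbm:.4f}")
```

Because the simulated rate is U-shaped in the feature, a log-linear GLM cannot represent it, whereas a sum of stumps can. The sketch therefore only illustrates the kind of non-linearity the abstract refers to. It says nothing about the paper's actual results, including the severity model, where the GLM performed better.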

Suggested Citation

  • Carina Clemente & Gracinda R. Guerreiro & Jorge M. Bravo, 2023. "Modelling Motor Insurance Claim Frequency and Severity Using Gradient Boosting," Risks, MDPI, vol. 11(9), pages 1-20, September.
  • Handle: RePEc:gam:jrisks:v:11:y:2023:i:9:p:163-:d:1238092

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-9091/11/9/163/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-9091/11/9/163/
    Download Restriction: no


