Predicting Motor Insurance Claims Using Telematics Data—XGBoost versus Logistic Regression

Author

Jessica Pesantez-Narvaez
(Department of Econometrics, Riskcenter-IREA, Universitat de Barcelona, 08034 Barcelona, Spain)
Montserrat Guillen
(Department of Econometrics, Riskcenter-IREA, Universitat de Barcelona, 08034 Barcelona, Spain)
Manuela Alcañiz
(Department of Econometrics, Riskcenter-IREA, Universitat de Barcelona, 08034 Barcelona, Spain)

Abstract

XGBoost is recognized as an algorithm with exceptional predictive capacity. Models for a binary response indicating the existence of accident claims versus no claims can be used to identify the determinants of traffic accidents. This study compared the relative performance of logistic regression and XGBoost for predicting the existence of accident claims using telematics data. The dataset contained information from an insurance company about individuals’ driving patterns, including total annual distance driven and percentage of total distance driven in urban areas. Our findings showed that logistic regression is a suitable model given its interpretability and good predictive capacity. XGBoost requires numerous model-tuning procedures to match the predictive performance of the logistic regression model, and it demands greater effort with regard to interpretation.
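The kind of comparison the abstract describes can be sketched in a few lines. The block below is only an illustration under loud assumptions: the data are simulated (the two covariates mimic standardized annual distance and urban driving share, with made-up coefficients), the logistic regression is fitted by plain gradient descent, and the boosting model is a bare-bones gradient-boosted ensemble of decision stumps standing in for XGBoost (no second-order terms, no regularization, no tuning). It is not the authors' code or data, and every function name is hypothetical.

```python
import math
import random

def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def log_loss(y, p):
    # Mean negative Bernoulli log-likelihood.
    eps = 1e-12
    return -sum(yi * math.log(max(pi, eps)) + (1 - yi) * math.log(max(1 - pi, eps))
                for yi, pi in zip(y, p)) / len(y)

def simulate(n, rng):
    # ASSUMPTION: synthetic drivers; coefficients are invented for illustration.
    X, y = [], []
    for _ in range(n):
        dist = rng.gauss(0.0, 1.0)   # standardized annual distance
        urban = rng.random()         # share of distance driven in urban areas
        p = sigmoid(-1.8 + 0.7 * dist + 1.0 * urban)
        X.append((dist, urban))
        y.append(1 if rng.random() < p else 0)
    return X, y

def fit_logistic(X, y, lr=0.5, steps=500):
    # Batch gradient descent on the log loss; w = [intercept, distance, urban].
    w, n = [0.0, 0.0, 0.0], len(y)
    for _ in range(steps):
        grad = [0.0, 0.0, 0.0]
        for (x1, x2), yi in zip(X, y):
            err = sigmoid(w[0] + w[1] * x1 + w[2] * x2) - yi
            grad[0] += err
            grad[1] += err * x1
            grad[2] += err * x2
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

def fit_stump(X, residuals):
    # Least-squares stump: best single split over decile thresholds of each feature.
    best = None
    for j in range(2):
        values = sorted(x[j] for x in X)
        for q in range(1, 10):
            t = values[q * len(values) // 10]
            left = [r for x, r in zip(X, residuals) if x[j] <= t]
            right = [r for x, r in zip(X, residuals) if x[j] > t]
            if not left or not right:
                continue
            ml, mr = sum(left) / len(left), sum(right) / len(right)
            sse = sum((r - ml) ** 2 for r in left) + sum((r - mr) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, t, ml, mr)
    return best[1:]

def fit_boosting(X, y, rounds=40, lr=0.5):
    # Start from the log-odds of the base rate, then add stumps fitted to
    # the negative gradient of the log loss (y - p).
    base = sum(y) / len(y)
    f0 = math.log(base / (1 - base))
    F, stumps = [f0] * len(y), []
    for _ in range(rounds):
        residuals = [yi - sigmoid(fi) for yi, fi in zip(y, F)]
        j, t, ml, mr = fit_stump(X, residuals)
        stumps.append((j, t, ml, mr))
        F = [fi + lr * (ml if x[j] <= t else mr) for fi, x in zip(F, X)]
    return f0, stumps, lr

def predict_boosting(model, X):
    f0, stumps, lr = model
    return [sigmoid(f0 + lr * sum(ml if x[j] <= t else mr for j, t, ml, mr in stumps))
            for x in X]

rng = random.Random(0)
X_train, y_train = simulate(600, rng)
X_test, y_test = simulate(400, rng)

w = fit_logistic(X_train, y_train)
p_logit = [sigmoid(w[0] + w[1] * x1 + w[2] * x2) for x1, x2 in X_test]
p_boost = predict_boosting(fit_boosting(X_train, y_train), X_test)

loss_logit = log_loss(y_test, p_logit)
loss_boost = log_loss(y_test, p_boost)
print("logistic coefficients:", [round(v, 3) for v in w])
print("test log loss, logistic:", round(loss_logit, 4))
print("test log loss, boosted stumps:", round(loss_boost, 4))
```

The logistic fit returns three coefficients that can be read directly as effects on the log-odds of a claim, whereas the boosted model is a sum of many small trees whose behaviour has to be examined indirectly. That contrast is the interpretability trade-off the abstract refers to. Which model scores better here depends entirely on the simulated data and says nothing about the paper's results.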

Suggested Citation

Jessica Pesantez-Narvaez & Montserrat Guillen & Manuela Alcañiz, 2019. "Predicting Motor Insurance Claims Using Telematics Data—XGBoost versus Logistic Regression," Risks, MDPI, vol. 7(2), pages 1-16, June.

Handle: RePEc:gam:jrisks:v:7:y:2019:i:2:p:70-:d:241617

Citations

Cited by:

Jessica Pesantez-Narvaez & Montserrat Guillen & Manuela Alcañiz, 2021. "A Synthetic Penalized Logitboost to Model Mortgage Lending with Imbalanced Data," Computational Economics, Springer;Society for Computational Economics, vol. 57(1), pages 281-309, January.
Trufin, Julien & Denuit, Michel, 2021. "Boosting cost-complexity pruned trees On Tweedie responses: the ABT machine," LIDAM Discussion Papers ISBA 2021015, Université catholique de Louvain, Institute of Statistics, Biostatistics and Actuarial Sciences (ISBA).
Meng, Shengwang & Gao, Yaqian & Huang, Yifan, 2022. "Actuarial intelligence in auto insurance: Claim frequency modeling with driving behavior features and improved boosted trees," Insurance: Mathematics and Economics, Elsevier, vol. 106(C), pages 115-127.
Zuleyka Díaz Martínez & José Fernández Menéndez & Luis Javier García Villalba, 2023. "Tariff Analysis in Automobile Insurance: Is It Time to Switch from Generalized Linear Models to Generalized Additive Models?," Mathematics, MDPI, vol. 11(18), pages 1-16, September.
Zhiyu Quan & Changyue Hu & Panyi Dong & Emiliano A. Valdez, 2024. "Improving Business Insurance Loss Models by Leveraging InsurTech Innovation," Papers 2401.16723, arXiv.org.
Tao, Hai & Alawi, Omer A. & Kamar, Haslinda Mohamed & Nafea, Ahmed Adil & AL-Ani, Mohammed M. & Abba, Sani I. & Salami, Babatunde Abiodun & Oudah, Atheer Y. & Mohammed, Mustafa K.A., 2024. "Development of integrative data intelligence models for thermo-economic performances prediction of hybrid organic rankine plants," Energy, Elsevier, vol. 292(C).
Viktor Stojkoski & Petar Jolakoski & Igor Ivanovski, 2021. "The short‐run impact of COVID‐19 on the activity in the insurance industry in the Republic of North Macedonia," Risk Management and Insurance Review, American Risk and Insurance Association, vol. 24(3), pages 221-242, September.
- Viktor Stojkoski & Petar Jolakoski & Igor Ivanovski, 2020. "The short-run impact of COVID-19 on the activity in the insurance industry in the Republic of North Macedonia," Papers 2011.10826, arXiv.org.
Nemanja Milanović & Miloš Milosavljević & Slađana Benković & Dušan Starčević & Željko Spasenić, 2020. "An Acceptance Approach for Novel Technologies in Car Insurance," Sustainability, MDPI, vol. 12(24), pages 1-15, December.
Jessica Pesantez-Narvaez & Montserrat Guillen & Manuela Alcañiz, 2021. "RiskLogitboost Regression for Rare Events in Binary Response: An Econometric Approach," Mathematics, MDPI, vol. 9(5), pages 1-21, March.
Francis Duval & Jean‐Philippe Boucher & Mathieu Pigeon, 2023. "Enhancing claim classification with feature extraction from anomaly‐detection‐derived routine and peculiarity profiles," Journal of Risk & Insurance, The American Risk and Insurance Association, vol. 90(2), pages 421-458, June.
Nelson Kemboi Yego & Juma Kasozi & Joseph Nkurunziza, 2021. "A Comparative Analysis of Machine Learning Models for the Prediction of Insurance Uptake in Kenya," Data, MDPI, vol. 6(11), pages 1-17, November.
Thomas Poufinas & Periklis Gogas & Theophilos Papadimitriou & Emmanouil Zaganidis, 2023. "Machine Learning in Forecasting Motor Insurance Claims," Risks, MDPI, vol. 11(9), pages 1-19, September.

Most related items

These are the items that most often cite the same works as this one and are cited by the same works as this one.

Christopher Blier-Wong & Hélène Cossette & Luc Lamontagne & Etienne Marceau, 2020. "Machine Learning in P&C Insurance: A Review for Pricing and Reserving," Risks, MDPI, vol. 9(1), pages 1-26, December.
Jaiswal, Rachana & Gupta, Shashank & Tiwari, Aviral Kumar, 2024. "Big data and machine learning-based decision support system to reshape the vaticination of insurance claims," Technological Forecasting and Social Change, Elsevier, vol. 209(C).
Donatella Porrini & Giulio Fusco & Cosimo Magazzino, 2020. "Black boxes and market efficiency: the effect on premiums in the Italian motor-vehicle insurance market," European Journal of Law and Economics, Springer, vol. 49(3), pages 455-472, June.
Zhiyu Quan & Changyue Hu & Panyi Dong & Emiliano A. Valdez, 2024. "Improving Business Insurance Loss Models by Leveraging InsurTech Innovation," Papers 2401.16723, arXiv.org.
Shengkun Xie, 2021. "Improving Explainability of Major Risk Factors in Artificial Neural Networks for Auto Insurance Rate Regulation," Risks, MDPI, vol. 9(7), pages 1-21, July.
Yujiao Yang & Qiongxia Song, 2014. "Jump detection in time series nonparametric regression models: a polynomial spline approach," Annals of the Institute of Statistical Mathematics, Springer;The Institute of Statistical Mathematics, vol. 66(2), pages 325-344, April.
Alfiero, Simona & Battisti, Enrico & Hadjielias, Elias, 2022. "Black box technology, usage-based insurance, and prediction of purchase behavior: Evidence from the auto insurance sector," Technological Forecasting and Social Change, Elsevier, vol. 183(C).
Xi Chen & Enlu Zhou, 2015. "Population model-based optimization," Journal of Global Optimization, Springer, vol. 63(1), pages 125-148, September.
Germà Coenders & Núria Arimany Serrat, 2023. "Accounting statement analysis at industry level. A gentle introduction to the compositional approach," Papers 2305.16842, arXiv.org, revised Mar 2025.
Lvyang Qiu & Shuyu Li & Yunsick Sung, 2021. "3D-DCDAE: Unsupervised Music Latent Representations Learning Method Based on a Deep 3D Convolutional Denoising Autoencoder for Music Genre Classification," Mathematics, MDPI, vol. 9(18), pages 1-17, September.
Kevin Kuo & Daniel Lupton, 2020. "Towards Explainability of Machine Learning Models in Insurance Pricing," Papers 2003.10674, arXiv.org.
Akimoto, Youhei & Auger, Anne & Hansen, Nikolaus, 2022. "An ODE method to prove the geometric convergence of adaptive stochastic algorithms," Stochastic Processes and their Applications, Elsevier, vol. 145(C), pages 269-307.
Anastasia Spiliopoulou & Ioannis Papamichail & Markos Papageorgiou & Yannis Tyrinopoulos & John Chrysoulakis, 2017. "Macroscopic traffic flow model calibration using different optimization algorithms," Operational Research, Springer, vol. 17(1), pages 145-164, April.
Deng, Xiangtian & Zhang, Yi & Jiang, Yi & Zhang, Yi & Qi, He, 2024. "A novel operation method for renewable building by combining distributed DC energy system and deep reinforcement learning," Applied Energy, Elsevier, vol. 353(PB).
Zhang, Yali & Shang, Pengjian, 2019. "Multivariate multiscale distribution entropy of financial time series," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 515(C), pages 72-80.
Chu, Ba, 2023. "A distance-based test of independence between two multivariate time series," Journal of Multivariate Analysis, Elsevier, vol. 195(C).
Chen, Yan & Zhang, Lei & Zhao, Yulu & Xu, Bing, 2022. "Implementation of penalized survival models in churn prediction of vehicle insurance," Journal of Business Research, Elsevier, vol. 153(C), pages 162-171.
Montserrat Guillen & Jens Perch Nielsen & Ana M. Pérez‐Marín, 2021. "Near‐miss telematics in motor insurance," Journal of Risk & Insurance, The American Risk and Insurance Association, vol. 88(3), pages 569-589, September.
A. Gouda & T. Szántai, 2008. "Rare event probabilities in stochastic networks," Central European Journal of Operations Research, Springer;Slovak Society for Operations Research;Hungarian Operational Research Society;Czech Society for Operations Research;Österr. Gesellschaft für Operations Research (ÖGOR);Slovenian Society Informatika - Section for Operational Research;Croatian Operational Research Society, vol. 16(4), pages 441-461, December.
Liang Huang & Juanjuan Zhu & Mulan Qiu & Xiaoxiang Li & Shasha Zhu, 2022. "CA-BASNet: A Building Extraction Network in High Spatial Resolution Remote Sensing Images," Sustainability, MDPI, vol. 14(18), pages 1-15, September.

More about this item

Keywords

dichotomous response; predictive model; tree boosting; GLM; machine learning