Explaining predictive models using Shapley values and non-parametric vine copulas

Author

Listed:
  • Aas Kjersti

    (Norwegian Computing Center)

  • Nagler Thomas

    (Leiden University)

  • Jullum Martin

    (Norwegian Computing Center)

  • Løland Anders

    (Norwegian Computing Center)

Abstract

The goal of this paper is to explain predictions from complex machine learning models. One method that has become very popular in recent years is Shapley values. The original development of Shapley values for prediction explanation relied on the assumption that the features being described were independent. If the features are in reality dependent, this may lead to incorrect explanations. Hence, there have recently been attempts at appropriately modelling or estimating the dependence between the features. Although the previously proposed methods clearly outperform the traditional approach that assumes independence, they have their weaknesses. In this paper we propose two new approaches for modelling the dependence between the features. Both approaches are based on vine copulas, which are flexible tools for modelling multivariate non-Gaussian distributions and are able to characterise a wide range of complex dependencies. The performance of the proposed methods is evaluated on simulated data sets and a real data set. The experiments demonstrate that the vine copula approaches give more accurate approximations to the true Shapley values than their competitors.
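
To illustrate why the independence assumption matters, the following minimal Python sketch computes exact Shapley values for a toy model under two value functions: one that fills in unknown features with their marginal means (independence), and one that uses the conditional mean given the known features. Everything here is an assumption made for illustration and not taken from the paper: the linear model, the coefficients `BETA`, the correlation `RHO`, the explained point `X`, and the use of a standard bivariate Gaussian in place of a vine copula.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating every subset of the other features."""
    phi = [0.0] * n_features
    for j in range(n_features):
        others = [k for k in range(n_features) if k != j]
        for size in range(len(others) + 1):
            # Standard Shapley weight |S|! (M - |S| - 1)! / M!
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                phi[j] += weight * (value_fn(s | {j}) - value_fn(s))
    return phi


# Hypothetical toy setup (not from the paper): f(x) = 1*x0 + 2*x1, with
# features following a standard bivariate Gaussian with correlation RHO.
BETA = (1.0, 2.0)
RHO = 0.8
X = (1.0, -0.5)  # the observation whose prediction is explained


def predict(x):
    return BETA[0] * x[0] + BETA[1] * x[1]


def v_independent(s):
    # Unknown features are replaced by their marginal mean (0),
    # which ignores any dependence on the known features.
    return predict([X[k] if k in s else 0.0 for k in range(2)])


def v_conditional(s):
    # Unknown features are replaced by their conditional mean given the
    # known ones; for a standard bivariate Gaussian that is RHO * x_known.
    # For a linear model, E[f(x) | x_S] equals f evaluated at E[x | x_S].
    if len(s) == 1:
        (known,) = s
        return predict([X[k] if k in s else RHO * X[known] for k in range(2)])
    return predict([X[k] if k in s else 0.0 for k in range(2)])


phi_independent = shapley_values(v_independent, 2)
phi_conditional = shapley_values(v_conditional, 2)
print("independence:", phi_independent)
print("conditional: ", phi_conditional)
```

In both cases the Shapley values sum to the prediction minus the mean prediction (here 0), but the individual attributions differ: roughly (1, -1) under independence versus (2, -2) when the correlation is taken into account. The vine copula approaches described in the abstract address the harder problem of estimating such conditional distributions when they are not Gaussian.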

Suggested Citation

  • Aas Kjersti & Nagler Thomas & Jullum Martin & Løland Anders, 2021. "Explaining predictive models using Shapley values and non-parametric vine copulas," Dependence Modeling, De Gruyter, vol. 9(1), pages 62-81, January.
  • Handle: RePEc:vrs:demode:v:9:y:2021:i:1:p:62-81:n:1
    DOI: 10.1515/demo-2021-0103

    Download full text from publisher

    File URL: https://doi.org/10.1515/demo-2021-0103
    Download Restriction: no



    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Roger M. Cooke & Harry Joe & Bo Chang, 2020. "Vine copula regression for observational studies," AStA Advances in Statistical Analysis, Springer;German Statistical Society, vol. 104(2), pages 141-167, June.
    2. Genest Christian & Scherer Matthias, 2019. "The world of vines: An interview with Claudia Czado," Dependence Modeling, De Gruyter, vol. 7(1), pages 169-180, January.
    3. Chang, Bo & Joe, Harry, 2019. "Prediction based on conditional distributions of vine copulas," Computational Statistics & Data Analysis, Elsevier, vol. 139(C), pages 45-63.
    4. Wang, Fan & Li, Heng & Dong, Chao, 2021. "Understanding near-miss count data on construction sites using greedy D-vine copula marginal regression," Reliability Engineering and System Safety, Elsevier, vol. 213(C).
    5. Hobæk Haff, Ingrid & Aas, Kjersti & Frigessi, Arnoldo & Lacal, Virginia, 2016. "Structure learning in Bayesian Networks using regular vines," Computational Statistics & Data Analysis, Elsevier, vol. 101(C), pages 186-208.
    6. Shi, Peng & Zhao, Zifeng, 2024. "Enhanced pricing and management of bundled insurance risks with dependence-aware prediction using pair copula construction," Journal of Econometrics, Elsevier, vol. 240(1).
    7. Zilko, Aurelius A. & Kurowicka, Dorota, 2016. "Copula in a multivariate mixed discrete–continuous model," Computational Statistics & Data Analysis, Elsevier, vol. 103(C), pages 28-55.
    8. Brida Juan Gabriel & Moreno Leonardo & Scaglione Miriam, 2024. "Modeling multivariate tourism expenditure using vine copula: empirical findings from of Fribourg-Switzerland," Quality & Quantity: International Journal of Methodology, Springer, vol. 58(5), pages 4093-4116, October.
    9. Wattanawongwan, Suttisak & Mues, Christophe & Okhrati, Ramin & Choudhry, Taufiq & So, Mee Chi, 2023. "Modelling credit card exposure at default using vine copula quantile regression," European Journal of Operational Research, Elsevier, vol. 311(1), pages 387-399.
    10. Saeide Sefidi & Mojtaba Ganjali & Taban Baghfalaki, 2022. "Analysis of ordinal and continuous longitudinal responses using pair copula construction," METRON, Springer;Sapienza Università di Roma, vol. 80(2), pages 255-280, August.
    11. Zhu, Kailun & Kurowicka, Dorota, 2022. "Regular vines with strongly chordal pattern of (conditional) independence," Computational Statistics & Data Analysis, Elsevier, vol. 172(C).
    12. Ozonder, Gozde & Miller, Eric J., 2021. "Longitudinal investigation of skeletal activity episode timing decisions – A copula approach," Journal of choice modelling, Elsevier, vol. 40(C).
    13. Aristidis Nikoloulopoulos & Harry Joe, 2015. "Factor Copula Models for Item Response Data," Psychometrika, Springer;The Psychometric Society, vol. 80(1), pages 126-150, March.
    14. Nagler Thomas & Schellhase Christian & Czado Claudia, 2017. "Nonparametric estimation of simplified vine copula models: comparison of methods," Dependence Modeling, De Gruyter, vol. 5(1), pages 99-120, January.
    15. Brechmann, Eike & Czado, Claudia & Paterlini, Sandra, 2014. "Flexible dependence modeling of operational risk losses and its impact on total capital requirements," Journal of Banking & Finance, Elsevier, vol. 40(C), pages 271-285.
    16. Hirofumi Michimae & Takeshi Emura, 2022. "Bayesian ridge estimators based on copula-based joint prior distributions for regression coefficients," Computational Statistics, Springer, vol. 37(5), pages 2741-2769, November.
    17. Weiping Zhang & MengMeng Zhang & Yu Chen, 2020. "A Copula-Based GLMM Model for Multivariate Longitudinal Data with Mixed-Types of Responses," Sankhya B: The Indian Journal of Statistics, Springer;Indian Statistical Institute, vol. 82(2), pages 353-379, November.
    18. Kraus, Daniel & Czado, Claudia, 2017. "D-vine copula based quantile regression," Computational Statistics & Data Analysis, Elsevier, vol. 110(C), pages 1-18.
    19. Sahin, Özge & Czado, Claudia, 2022. "Vine copula mixture models and clustering for non-Gaussian data," Econometrics and Statistics, Elsevier, vol. 22(C), pages 136-158.
    20. Calabrese, Raffaella & Degl’Innocenti, Marta & Osmetti, Silvia Angela, 2017. "The effectiveness of TARP-CPP on the US banking industry: A new copula-based approach," European Journal of Operational Research, Elsevier, vol. 256(3), pages 1029-1037.
