Explaining a series of models by propagating Shapley values

Authors

  • Hugh Chen (University of Washington)
  • Scott M. Lundberg (Microsoft Research)
  • Su-In Lee (University of Washington)

Abstract

Local feature attribution methods are increasingly used to explain complex machine learning models. However, current methods are limited because they are extremely expensive to compute or are not capable of explaining a distributed series of models where each model is owned by a separate institution. The latter is particularly important because it often arises in finance where explanations are mandated. Here, we present Generalized DeepSHAP (G-DeepSHAP), a tractable method to propagate local feature attributions through complex series of models based on a connection to the Shapley value. We evaluate G-DeepSHAP across biological, health, and financial datasets to show that it provides equally salient explanations an order of magnitude faster than existing model-agnostic attribution techniques and demonstrate its use in an important distributed series of models setting.
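
As a rough illustration of the quantity the abstract refers to, the sketch below computes exact Shapley values for a small two-stage series of models by enumerating every feature coalition. It is not G-DeepSHAP and is not taken from the paper: the functions `stage_one`, `stage_two`, and `series`, the single all-zeros baseline, and the input values are all hypothetical. The brute-force loop evaluates the model on every subset of features, so its cost grows exponentially with the number of features, which is the kind of expense that motivates a tractable alternative.

```python
from itertools import combinations
from math import factorial


def shapley_values(model, x, baseline):
    """Exact Shapley values of `model` at `x` relative to one baseline.

    Features outside a coalition are set to their baseline value.
    Brute force: evaluates the model on every subset of features.
    """
    n = len(x)

    def value(coalition):
        z = [x[i] if i in coalition else baseline[i] for i in range(n)]
        return model(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


# Hypothetical two-stage series: the output of stage one feeds stage two.
def stage_one(z):
    return [z[0] + z[1], z[1] * z[2]]


def stage_two(h):
    return 2.0 * h[0] + h[1]


def series(z):
    return stage_two(stage_one(z))


x = [1.0, 2.0, 3.0]          # assumed input to explain
baseline = [0.0, 0.0, 0.0]   # assumed reference input
phi = shapley_values(series, x, baseline)
print(phi)
```

The attributions sum to the difference between the series output at `x` and at the baseline, which is the efficiency property of the Shapley value. Here the end-to-end function is treated as a single black box; the paper's setting, where each model in the series belongs to a separate institution, is exactly the case where such end-to-end access may not be available.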

Suggested Citation

  • Hugh Chen & Scott M. Lundberg & Su-In Lee, 2022. "Explaining a series of models by propagating Shapley values," Nature Communications, Nature, vol. 13(1), pages 1-15, December.
  • Handle: RePEc:nat:natcom:v:13:y:2022:i:1:d:10.1038_s41467-022-31384-3
    DOI: 10.1038/s41467-022-31384-3

    Download full text from publisher

    File URL: https://www.nature.com/articles/s41467-022-31384-3
    File Function: Abstract
    Download Restriction: no

    References listed on IDEAS

    1. Mukund Sundararajan & Amir Najmi, 2019. "The many Shapley values for model explanation," Papers 1908.08474, arXiv.org, revised Feb 2020.
    2. Christina Curtis & Sohrab P. Shah & Suet-Feung Chin & Gulisa Turashvili & Oscar M. Rueda & Mark J. Dunning & Doug Speed & Andy G. Lynch & Shamith Samarajiwa & Yinyin Yuan & Stefan Gräf & Gavin Ha & Gh, 2012. "The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups," Nature, Nature, vol. 486(7403), pages 346-352, June.
    3. Michael Doumpos & Constantin Zopounidis, 2007. "Model combination for credit risk assessment: A stacked generalization approach," Annals of Operations Research, Springer, vol. 151(1), pages 289-306, April.
    4. Stan Lipovetsky & Michael Conklin, 2001. "Analysis of regression in game theory approach," Applied Stochastic Models in Business and Industry, John Wiley & Sons, vol. 17(4), pages 319-330, October.

    Citations

    Cited by:

    1. Winn-Nuñez, Emily T. & Griffin, Maryclare & Crawford, Lorin, 2024. "A simple approach for local and global variable importance in nonlinear regression models," Computational Statistics & Data Analysis, Elsevier, vol. 194(C).
    2. van Zyl, Corne & Ye, Xianming & Naidoo, Raj, 2024. "Harnessing eXplainable artificial intelligence for feature selection in time series energy forecasting: A comparative analysis of Grad-CAM and SHAP," Applied Energy, Elsevier, vol. 353(PA).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Buckmann, Marcus & Joseph, Andreas, 2022. "An interpretable machine learning workflow with an application to economic forecasting," Bank of England working papers 984, Bank of England.
    2. Aleix Prat & Fara Brasó-Maristany & Olga Martínez-Sáez & Esther Sanfeliu & Youli Xia & Meritxell Bellet & Patricia Galván & Débora Martínez & Tomás Pascual & Mercedes Marín-Aguilera & Anna Rodríguez &, 2023. "Circulating tumor DNA reveals complex biological features with clinical relevance in metastatic breast cancer," Nature Communications, Nature, vol. 14(1), pages 1-16, December.
    3. Liang, Weijuan & Zhang, Qingzhao & Ma, Shuangge, 2024. "Hierarchical false discovery rate control for high-dimensional survival analysis with interactions," Computational Statistics & Data Analysis, Elsevier, vol. 192(C).
    4. Borgonovo, Emanuele & Plischke, Elmar & Rabitti, Giovanni, 2024. "The many Shapley values for explainable artificial intelligence: A sensitivity analysis perspective," European Journal of Operational Research, Elsevier, vol. 318(3), pages 911-926.
    5. Alireza Rezazadeh & Yasamin Jafarian & Ali Kord, 2022. "Explainable Ensemble Machine Learning for Breast Cancer Diagnosis Based on Ultrasound Image Texture Features," Forecasting, MDPI, vol. 4(1), pages 1-13, February.
    6. Pera, Rebecca & Viglia, Giampaolo & Furlan, Roberto, 2016. "Who Am I? How Compelling Self-storytelling Builds Digital Personal Reputation," Journal of Interactive Marketing, Elsevier, vol. 35(C), pages 44-55.
    7. Alina Mihaela Dima & Simona Vasilache, 2016. "Credit Risk modeling for Companies Default Prediction using Neural Networks," Journal for Economic Forecasting, Institute for Economic Forecasting, vol. 0(3), pages 127-143, September.
    8. Hué Sullivan & Hurlin Christophe & Pérignon Christophe & Saurin Sébastien, 2022. "Measuring the Driving Forces of Predictive Performance: Application to Credit Scoring," Papers 2212.05866, arXiv.org, revised Jun 2023.
    9. Stan Lipovetsky, 2021. "Predictor Analysis in Group Decision Making," Stats, MDPI, vol. 4(1), pages 1-14, February.
    10. Emrah Arbak, 2017. "Identifying the provisioning policies of Belgian banks," Working Paper Research 326, National Bank of Belgium.
    11. Cao Son Tran & Dan Nicolau & Richi Nayak & Peter Verhoeven, 2021. "Modeling Credit Risk: A Category Theory Perspective," JRFM, MDPI, vol. 14(7), pages 1-21, July.
    12. Francisco Salas-Molina & Juan A. Rodriguez-Aguilar & Pablo Díaz-García, 2018. "Selecting cash management models from a multiobjective perspective," Annals of Operations Research, Springer, vol. 261(1), pages 275-288, February.
    13. Masayoshi Mase & Art B. Owen & Benjamin B. Seiler, 2021. "Cohort Shapley value for algorithmic fairness," Papers 2105.07168, arXiv.org.
    14. Ilyes Abid & Farid Mkaouar & Olfa Kaabia, 2018. "Dynamic analysis of the forecasting bankruptcy under presence of unobserved heterogeneity," Annals of Operations Research, Springer, vol. 262(2), pages 241-256, March.
    15. Viglia, Giampaolo & Abrate, Graziano, 2017. "When distinction does not pay off - Investigating the determinants of European agritourism prices," Journal of Business Research, Elsevier, vol. 80(C), pages 45-52.
    16. Jones, Stewart & Johnstone, David & Wilson, Roy, 2015. "An empirical evaluation of the performance of binary classifiers in the prediction of credit ratings changes," Journal of Banking & Finance, Elsevier, vol. 56(C), pages 72-85.
    17. Adam C. Weiner & Marc J. Williams & Hongyu Shi & Ignacio Vázquez-García & Sohrab Salehi & Nicole Rusk & Samuel Aparicio & Sohrab P. Shah & Andrew McPherson, 2024. "Inferring replication timing and proliferation dynamics from single-cell DNA sequencing data," Nature Communications, Nature, vol. 15(1), pages 1-19, December.
    18. Marco, Nicholas & Şentürk, Damla & Jeste, Shafali & DiStefano, Charlotte C. & Dickinson, Abigail & Telesca, Donatello, 2024. "Flexible regularized estimation in high-dimensional mixed membership models," Computational Statistics & Data Analysis, Elsevier, vol. 194(C).
    19. Camilla Tombari & Alessandro Zannini & Rebecca Bertolio & Silvia Pedretti & Matteo Audano & Luca Triboli & Valeria Cancila & Davide Vacca & Manuel Caputo & Sara Donzelli & Ilenia Segatto & Simone Vodr, 2023. "Mutant p53 sustains serine-glycine synthesis and essential amino acids intake promoting breast cancer growth," Nature Communications, Nature, vol. 14(1), pages 1-21, December.
    20. Wenguang Zhang & Ting Lei & Yu Gong & Jun Zhang & Yirong Wu, 2022. "Using Explainable Artificial Intelligence to Identify Key Characteristics of Deep Poverty for Each Household," Sustainability, MDPI, vol. 14(16), pages 1-21, August.
