
Explaining Support Vector Machines: A Color Based Nomogram

Authors:
  • Vanya Van Belle
  • Ben Van Calster
  • Sabine Van Huffel
  • Johan A K Suykens
  • Paulo Lisboa

Abstract

Problem setting: Support vector machines (SVMs) are very popular tools for classification, regression and other problems. Because they can be applied with a wide choice of kernels, they can be used to analyse a large variety of data. Machine learning owes its popularity to the good performance of the resulting models. However, interpreting the models is far from obvious, especially when non-linear kernels are used. Hence, the methods are used as black boxes. As a consequence, the use of SVMs is less supported in areas where interpretability is important and where people are held responsible for the decisions made by models.

Objective: In this work, we investigate whether SVMs using linear, polynomial and RBF kernels can be explained such that interpretations for model-based decisions can be provided. We further indicate when SVMs can be explained and in which situations interpretation of SVMs is (hitherto) not possible. Here, explainability is defined as the ability to produce the final decision based on a sum of contributions, each of which depends on a single input variable or at most two input variables.

Results: Our experiments on simulated and real-life data show that the explainability of an SVM depends on the chosen parameter values (degree of the polynomial kernel, width of the RBF kernel and regularization constant). When several combinations of parameter values yield the same cross-validation performance, combinations with a lower polynomial degree or a larger kernel width have a higher chance of being explainable.

Conclusions: This work summarizes SVM classifiers obtained with linear, polynomial and RBF kernels in a single plot. Linear and polynomial kernels up to the second degree are represented exactly. For other kernels, an indication of the reliability of the approximation is presented. The complete methodology is available as an R package, and two apps and a movie are provided to illustrate the possibilities offered by the method.
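The abstract's definition of explainability, and its statement that polynomial kernels up to the second degree are represented exactly, can be illustrated with a small sketch. It is not the authors' R package or their method as published. It is a minimal pure-Python illustration that assumes a kernel of the form (a·⟨x, z⟩ + c)² and uses made-up support vectors, weights and bias. All function and variable names (`poly2_kernel`, `decision_value`, `contributions`, `a`, `c`) are hypothetical. Expanding the squared inner product splits the decision value into a constant, one term per input variable, and one term per pair of variables.

```python
from itertools import combinations


def poly2_kernel(x, z, a=1.0, c=1.0):
    """Second-degree polynomial kernel (a * <x, z> + c)^2 (assumed form)."""
    dot = sum(xp * zp for xp, zp in zip(x, z))
    return (a * dot + c) ** 2


def decision_value(x, support_vectors, weights, bias, a=1.0, c=1.0):
    """Usual SVM decision value: sum_i w_i * K(x, s_i) + bias.

    Each weight w_i stands for the product alpha_i * y_i.
    """
    return bias + sum(
        w * poly2_kernel(x, s, a, c) for w, s in zip(weights, support_vectors)
    )


def contributions(x, support_vectors, weights, bias, a=1.0, c=1.0):
    """Split the decision value into constant, single-variable and pairwise terms."""
    d = len(x)
    sv = list(zip(weights, support_vectors))
    # Constant part: bias plus the c^2 term of every expanded kernel.
    constant = bias + c ** 2 * sum(weights)
    # Main effect of variable p: depends on x[p] only.
    main = [
        sum(w * (a ** 2 * x[p] ** 2 * s[p] ** 2 + 2 * a * c * x[p] * s[p])
            for w, s in sv)
        for p in range(d)
    ]
    # Interaction of variables p < q: depends on x[p] and x[q] only.
    pair = {
        (p, q): 2 * a ** 2 * x[p] * x[q] * sum(w * s[p] * s[q] for w, s in sv)
        for p, q in combinations(range(d), 2)
    }
    return constant, main, pair


# Toy values, invented purely for illustration.
support_vectors = [(1.0, 0.5, -1.0), (-0.5, 2.0, 0.0), (0.0, -1.0, 1.5)]
weights = [0.7, -1.2, 0.5]
bias = 0.1
x = (0.3, -0.8, 1.1)

constant, main, pair = contributions(x, support_vectors, weights, bias)
total = constant + sum(main) + sum(pair.values())
direct = decision_value(x, support_vectors, weights, bias)
print(f"sum of contributions = {total:.6f}, direct decision value = {direct:.6f}")
```

The two printed numbers agree up to floating-point rounding. This is the sense in which a second-degree polynomial kernel admits an exact representation as a sum of one- and two-variable contributions. Higher degrees and RBF kernels generally add terms involving three or more variables, which is why they can only be approximated in this form.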

Suggested Citation

  • Vanya Van Belle & Ben Van Calster & Sabine Van Huffel & Johan A K Suykens & Paulo Lisboa, 2016. "Explaining Support Vector Machines: A Color Based Nomogram," PLOS ONE, Public Library of Science, vol. 11(10), pages 1-33, October.
  • Handle: RePEc:plo:pone00:0164568
    DOI: 10.1371/journal.pone.0164568

    Download full text from publisher

    File URL: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0164568
    Download Restriction: no

    File URL: https://journals.plos.org/plosone/article/file?id=10.1371/journal.pone.0164568&type=printable
    Download Restriction: no

    Citations

    Cited by:

    1. Yousefzadeh Barri, Elnaz & Farber, Steven & Jahanshahi, Hadi & Beyazit, Eda, 2022. "Understanding transit ridership in an equity context through a comparison of statistical and machine learning algorithms," Journal of Transport Geography, Elsevier, vol. 105(C).
    2. Salman Khalid & Hyunho Hwang & Heung Soo Kim, 2021. "Real-World Data-Driven Machine-Learning-Based Optimal Sensor Selection Approach for Equipment Fault Detection in a Thermal Power Plant," Mathematics, MDPI, vol. 9(21), pages 1-27, November.
    3. Roberson Andrea, 2021. "Applying Machine Learning for Automatic Product Categorization," Journal of Official Statistics, Sciendo, vol. 37(2), pages 395-410, June.
    4. Dario Sansone & Anna Zhu, 2023. "Using Machine Learning to Create an Early Warning System for Welfare Recipients," Oxford Bulletin of Economics and Statistics, Department of Economics, University of Oxford, vol. 85(5), pages 959-992, October.
    5. Yu, Baojun & Li, Changming & Mirza, Nawazish & Umar, Muhammad, 2022. "Forecasting credit ratings of decarbonized firms: Comparative assessment of machine learning models," Technological Forecasting and Social Change, Elsevier, vol. 174(C).
    6. Wenninger, Simon & Kaymakci, Can & Wiethe, Christian, 2022. "Explainable long-term building energy consumption prediction using QLattice," Applied Energy, Elsevier, vol. 308(C).
    7. Hazlee Azil Illias & Wee Zhao Liang, 2018. "Identification of transformer fault based on dissolved gas analysis using hybrid support vector machine-modified evolutionary particle swarm optimisation," PLOS ONE, Public Library of Science, vol. 13(1), pages 1-15, January.
    8. Chris Reimann, 2024. "Predicting financial crises: an evaluation of machine learning algorithms and model explainability for early warning systems," Review of Evolutionary Political Economy, Springer, vol. 5(1), pages 51-83, June.
    9. Ma, Dingyuan & Li, Xiaodong & Lin, Borong & Zhu, Yimin, 2023. "An intelligent retrofit decision-making model for building program planning considering tacit knowledge and multiple objectives," Energy, Elsevier, vol. 263(PB).
    10. Li, Jing-Ping & Mirza, Nawazish & Rahat, Birjees & Xiong, Deping, 2020. "Machine learning and credit ratings prediction in the age of fourth industrial revolution," Technological Forecasting and Social Change, Elsevier, vol. 161(C).
    11. Arthur C. Santos & Wesley A. Souza & Gustavo V. Barbara & Marcelo F. Castoldi & Alessandro Goedtel, 2023. "Diagnostics of Early Faults in Wind Generator Bearings Using Hjorth Parameters," Sustainability, MDPI, vol. 15(20), pages 1-17, October.
    12. Alkhaleel, Basem A., 2024. "Machine learning applications in the resilience of interdependent critical infrastructure systems—A systematic literature review," International Journal of Critical Infrastructure Protection, Elsevier, vol. 44(C).
    13. Gründler, Klaus & Krieger, Tommy, 2021. "Using Machine Learning for measuring democracy: A practitioners guide and a new updated dataset for 186 countries from 1919 to 2019," European Journal of Political Economy, Elsevier, vol. 70(C).
    14. Achim Ahrens & Christian B. Hansen & Mark E. Schaffer, 2023. "pystacked: Stacking generalization and machine learning in Stata," Stata Journal, StataCorp LP, vol. 23(4), pages 909-931, December.
    15. Yi Yang & Yuting Bai & Xiaoyi Wang & Li Wang & Xuebo Jin & Qian Sun, 2020. "Group Decision-Making Support for Sustainable Governance of Algal Bloom in Urban Lakes," Sustainability, MDPI, vol. 12(4), pages 1-16, February.
