IDEAS home Printed from https://ideas.repec.org/a/eee/ejores/v213y2011i1p260-269.html

Detecting relevant variables and interactions in supervised classification

Author

Listed:
  • Carrizosa, Emilio
  • Martín-Barragán, Belén
  • Morales, Dolores Romero

Abstract

The widely used Support Vector Machine (SVM) method has been shown to yield good results in Supervised Classification problems. When interpretability is important, classification methods such as Classification and Regression Trees (CART) may be more attractive, since they are designed to detect the important predictor variables and, for each predictor variable, the critical values that are most relevant for classification. However, when interactions between variables strongly affect class membership, CART may yield misleading information. Extending the authors' previous work, this paper introduces an SVM-based method. The reported numerical experiments show that the method is competitive with SVM and CART in terms of misclassification rates and, at the same time, is able to detect the critical values and variable interactions that are relevant for classification.
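
The point about interactions can be illustrated with a small toy example. The sketch below is not the method from the paper; the data set, the function names, and the product feature `x1 * x2` are assumptions made up for illustration. It shows a case where no single-variable threshold (the kind of split a tree evaluates one variable at a time) beats chance, while a single threshold on an interaction term separates the classes.

```python
# Illustrative sketch only: hypothetical toy data, not the paper's method.
from itertools import product

# Assumed toy data: class is +1 when x1 and x2 share the same sign, else -1.
grid = [-2.0, -1.0, 1.0, 2.0]
points = list(product(grid, grid))
labels = [1 if x1 * x2 > 0 else -1 for x1, x2 in points]


def accuracy(predictions, truth):
    """Fraction of predictions that match the true labels."""
    return sum(p == t for p, t in zip(predictions, truth)) / len(truth)


def best_stump_accuracy(values, truth):
    """Best accuracy of any single-threshold rule on one feature.

    Both orientations of the rule are considered, so an accuracy `a`
    also implies that `1 - a` is attainable by flipping the prediction.
    """
    best = 0.0
    for threshold in sorted(set(values)):
        preds = [1 if v >= threshold else -1 for v in values]
        acc = accuracy(preds, truth)
        best = max(best, acc, 1.0 - acc)
    return best


# Thresholds on each original variable alone.
stump_x1 = best_stump_accuracy([p[0] for p in points], labels)
stump_x2 = best_stump_accuracy([p[1] for p in points], labels)

# Threshold on the (assumed) interaction feature x1 * x2.
interaction = best_stump_accuracy([p[0] * p[1] for p in points], labels)

print(f"best threshold on x1 alone:  {stump_x1:.2f}")
print(f"best threshold on x2 alone:  {stump_x2:.2f}")
print(f"best threshold on x1 * x2:   {interaction:.2f}")
```

On this toy data each variable looks irrelevant when inspected on its own, even though the two variables jointly determine the class, which is the kind of situation in which variable-by-variable critical values can be misleading.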

Suggested Citation

  • Carrizosa, Emilio & Martín-Barragán, Belén & Morales, Dolores Romero, 2011. "Detecting relevant variables and interactions in supervised classification," European Journal of Operational Research, Elsevier, vol. 213(1), pages 260-269, August.
  • Handle: RePEc:eee:ejores:v:213:y:2011:i:1:p:260-269

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377-2217(10)00219-5
    Download Restriction: Full text for ScienceDirect subscribers only
    Citations

    Cited by:

    1. Blanquero, Rafael & Carrizosa, Emilio & Molero-Río, Cristina & Romero Morales, Dolores, 2020. "Sparsity in optimal randomized classification trees," European Journal of Operational Research, Elsevier, vol. 284(1), pages 255-272.
    2. Blanquero, Rafael & Carrizosa, Emilio & Molero-Río, Cristina & Morales, Dolores Romero, 2022. "On sparse optimal regression trees," European Journal of Operational Research, Elsevier, vol. 299(3), pages 1045-1054.
    3. He Jiang, 2023. "Robust forecasting in spatial autoregressive model with total variation regularization," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 42(2), pages 195-211, March.
    4. Maldonado, Sebastián & Pérez, Juan & Bravo, Cristián, 2017. "Cost-based feature selection for Support Vector Machines: An application in credit scoring," European Journal of Operational Research, Elsevier, vol. 261(2), pages 656-665.
    5. Benítez-Peña, Sandra & Carrizosa, Emilio & Guerrero, Vanesa & Jiménez-Gamero, M. Dolores & Martín-Barragán, Belén & Molero-Río, Cristina & Ramírez-Cobo, Pepa & Romero Morales, Dolores & Sillero-Denami, 2021. "On sparse ensemble methods: An application to short-term predictions of the evolution of COVID-19," European Journal of Operational Research, Elsevier, vol. 295(2), pages 648-663.
    6. Baumann, P. & Hochbaum, D.S. & Yang, Y.T., 2019. "A comparative study of the leading machine learning techniques and two new optimization algorithms," European Journal of Operational Research, Elsevier, vol. 272(3), pages 1041-1057.
    7. Martin-Barragan, Belen & Lillo, Rosa & Romo, Juan, 2014. "Interpretable support vector machines for functional data," European Journal of Operational Research, Elsevier, vol. 232(1), pages 146-155.
    8. He Jiang, 2022. "A novel robust structural quadratic forecasting model and applications," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 41(6), pages 1156-1180, September.
    9. Maurizio Maravalle & Federica Ricca & Bruno Simeone & Vincenzo Spinelli, 2015. "Carpal Tunnel Syndrome automatic classification: electromyography vs. ultrasound imaging," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 23(1), pages 100-123, April.
    10. Kaiquan Xu & Stephen Shaoyi Liao & Raymond Y. K. Lau & J. Leon Zhao, 2014. "Effective Active Learning Strategies for the Use of Large-Margin Classifiers in Semantic Annotation: An Optimal Parameter Discovery Perspective," INFORMS Journal on Computing, INFORMS, vol. 26(3), pages 461-483, August.
    11. Pedro Duarte Silva, A., 2017. "Optimization approaches to Supervised Classification," European Journal of Operational Research, Elsevier, vol. 261(2), pages 772-788.
    12. Gambella, Claudio & Ghaddar, Bissan & Naoum-Sawaya, Joe, 2021. "Optimization problems for machine learning: A survey," European Journal of Operational Research, Elsevier, vol. 290(3), pages 807-828.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Martin-Barragan, Belen & Lillo, Rosa & Romo, Juan, 2014. "Interpretable support vector machines for functional data," European Journal of Operational Research, Elsevier, vol. 232(1), pages 146-155.
    2. Carrizosa, Emilio & Nogales-Gómez, Amaya & Romero Morales, Dolores, 2017. "Clustering categories in support vector machines," Omega, Elsevier, vol. 66(PA), pages 28-37.
    3. Doumpos, Michael & Zopounidis, Constantin, 2011. "Preference disaggregation and statistical learning for multicriteria decision support: A review," European Journal of Operational Research, Elsevier, vol. 209(3), pages 203-214, March.
    4. Emilio Carrizosa & Belen Martin-Barragan & Dolores Romero Morales, 2010. "Binarized Support Vector Machines," INFORMS Journal on Computing, INFORMS, vol. 22(1), pages 154-167, February.
    5. Loterman, Gert & Brown, Iain & Martens, David & Mues, Christophe & Baesens, Bart, 2012. "Benchmarking regression algorithms for loss given default modeling," International Journal of Forecasting, Elsevier, vol. 28(1), pages 161-170.
    6. Dejaeger, Karel & Goethals, Frank & Giangreco, Antonio & Mola, Lapo & Baesens, Bart, 2012. "Gaining insight into student satisfaction using comprehensible data mining techniques," European Journal of Operational Research, Elsevier, vol. 218(2), pages 548-562.
    7. B Baesens & C Mues & D Martens & J Vanthienen, 2009. "50 years of data mining and OR: upcoming trends and challenges," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 60(1), pages 16-23, May.
    8. Blanquero, Rafael & Carrizosa, Emilio & Molero-Río, Cristina & Morales, Dolores Romero, 2022. "On sparse optimal regression trees," European Journal of Operational Research, Elsevier, vol. 299(3), pages 1045-1054.
    9. Dimitris Andriosopoulos & Michalis Doumpos & Panos M. Pardalos & Constantin Zopounidis, 2019. "Computational approaches and data analytics in financial services: A literature review," Journal of the Operational Research Society, Taylor & Francis Journals, vol. 70(10), pages 1581-1599, October.
    10. TOBBACK, Ellen & MARTENS, David & VAN GESTEL, Tony & BAESENS, Bart, 2012. "Forecasting loss given default models: Impact of account characteristics and the macroeconomic state," Working Papers 2012019, University of Antwerp, Faculty of Business and Economics.
    11. Emilio Carrizosa & Cristina Molero-Río & Dolores Romero Morales, 2021. "Mathematical optimization in classification and regression trees," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 29(1), pages 5-33, April.
    12. Lara Marie Demajo & Vince Vella & Alexiei Dingli, 2020. "Explainable AI for Interpretable Credit Scoring," Papers 2012.03749, arXiv.org.
    13. Morteza Mashayekhi & Robin Gras, 2017. "Rule Extraction from Decision Trees Ensembles: New Algorithms Based on Heuristic Search and Sparse Group Lasso Methods," International Journal of Information Technology & Decision Making (IJITDM), World Scientific Publishing Co. Pte. Ltd., vol. 16(06), pages 1707-1727, November.
    14. Lessmann, Stefan & Voß, Stefan, 2009. "A reference model for customer-centric data mining with support vector machines," European Journal of Operational Research, Elsevier, vol. 199(2), pages 520-530, December.
    15. Finlay, Steven, 2011. "Multiple classifier architectures and their application to credit risk assessment," European Journal of Operational Research, Elsevier, vol. 210(2), pages 368-378, April.
    16. De Caigny, Arno & Coussement, Kristof & De Bock, Koen W., 2018. "A new hybrid classification algorithm for customer churn prediction based on logistic regression and decision trees," European Journal of Operational Research, Elsevier, vol. 269(2), pages 760-772.
    17. Berkin, Anil & Aerts, Walter & Van Caneghem, Tom, 2023. "Feasibility analysis of machine learning for performance-related attributional statements," International Journal of Accounting Information Systems, Elsevier, vol. 48(C).
    18. Blanquero, Rafael & Carrizosa, Emilio & Molero-Río, Cristina & Romero Morales, Dolores, 2020. "Sparsity in optimal randomized classification trees," European Journal of Operational Research, Elsevier, vol. 284(1), pages 255-272.
    19. E Lima & C Mues & B Baesens, 2009. "Domain knowledge integration in data mining using decision tables: case studies in churn prediction," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 60(8), pages 1096-1106, August.
    20. K. Coussement & K. W. Bock & S. Geuens, 2022. "A decision-analytic framework for interpretable recommendation systems with multiple input data sources: a case study for a European e-tailer," Annals of Operations Research, Springer, vol. 315(2), pages 671-694, August.
