On the binary classification problem in discriminant analysis using linear programming methods

Author

Listed:
  • Michael O. Olusola
  • Sydney I. Onyeagu

Abstract

This paper addresses a binary classification problem: assigning a new object with multivariate features to one of two distinct populations, based on historical samples from both populations. A linear discriminant analysis framework, called the minimised sum of deviations by proportion (MSDP), is proposed to model the problem. In the MSDP formulation, the sum of the proportions of exterior deviations is minimised subject to the group separation constraints, the normalisation constraint, upper bound constraints on the proportions of exterior deviations, and the sign-unrestriction and non-negativity constraints on the variables. The two-phase method of linear programming is used as the solution technique to generate the discriminant function. The decision rule for group-membership prediction is constructed using the apparent error rate. The performance of the MSDP is compared with that of some existing linear discriminant models on a previously published dataset on road casualties. The MSDP model was more promising and well suited to the imbalanced road-casualty dataset.
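The abstract does not give the MSDP constraints in full, so the sketch below is not the authors' model. It illustrates the general family the abstract describes with a classical minimise-sum-of-deviations (MSD) linear programme: group separation constraints, a normalisation constraint on the difference of group means (an assumption chosen here to exclude the trivial zero solution), sign-unrestricted weights and cutoff, and non-negative deviations. The proportion weighting and the upper bounds of the MSDP are left out, and SciPy's HiGHS solver stands in for the two-phase method. The function names `fit_msd`, `classify` and `apparent_error_rate` are hypothetical.

```python
import numpy as np
from scipy.optimize import linprog


def fit_msd(X1, X2):
    """Fit a linear rule w.x = c with a classical MSD linear programme.

    Assumption: this is a generic MSD model, not the paper's MSDP.
    """
    X1 = np.asarray(X1, dtype=float)
    X2 = np.asarray(X2, dtype=float)
    n1, p = X1.shape
    n2 = X2.shape[0]
    n = n1 + n2

    # Decision vector: [w (p values), c (1 value), d (n deviations)].
    cost = np.concatenate([np.zeros(p + 1), np.ones(n)])

    # Group 1 separation: w.x_i - c - d_i <= 0
    A1 = np.hstack([X1, -np.ones((n1, 1)), -np.eye(n1), np.zeros((n1, n2))])
    # Group 2 separation: -w.x_i + c - d_i <= 0
    A2 = np.hstack([-X2, np.ones((n2, 1)), np.zeros((n2, n1)), -np.eye(n2)])
    A_ub = np.vstack([A1, A2])
    b_ub = np.zeros(n)

    # Normalisation (assumed form): w.(mean of group 2 - mean of group 1) = 1
    mean_diff = X2.mean(axis=0) - X1.mean(axis=0)
    A_eq = np.concatenate([mean_diff, np.zeros(1 + n)]).reshape(1, -1)
    b_eq = np.array([1.0])

    # w and c are unrestricted in sign; deviations are non-negative.
    bounds = [(None, None)] * (p + 1) + [(0, None)] * n

    res = linprog(cost, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
                  bounds=bounds, method="highs")
    if not res.success:
        raise RuntimeError(res.message)
    return res.x[:p], res.x[p], res.x[p + 1:]


def classify(X, w, c):
    """Assign group 2 when the score exceeds the cutoff, else group 1."""
    return np.where(np.asarray(X, dtype=float) @ w > c, 2, 1)


def apparent_error_rate(y_true, y_pred):
    """Share of training observations that the fitted rule misclassifies."""
    return float(np.mean(np.asarray(y_true) != np.asarray(y_pred)))


# Toy data, invented for illustration only (not the road-casualty dataset).
X1_toy = [[1, 1], [2, 1], [1, 2]]
X2_toy = [[4, 4], [5, 4], [4, 5]]
w, c, d = fit_msd(X1_toy, X2_toy)
labels = [1, 1, 1, 2, 2, 2]
predicted = classify(X1_toy + X2_toy, w, c)
print("weights:", w, "cutoff:", c)
print("apparent error rate:", apparent_error_rate(labels, predicted))
```

The apparent error rate is computed on the same observations used to fit the rule, so it tends to be optimistic. In an MSD model a training point can also lie exactly on the cutoff, which is why the tie-handling in `classify` is a choice rather than part of the model.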

Suggested Citation

  • Michael O. Olusola & Sydney I. Onyeagu, 2020. "On the binary classification problem in discriminant analysis using linear programming methods," Operations Research and Decisions, Wroclaw University of Science and Technology, Faculty of Management, vol. 30(1), pages 119-130.
  • Handle: RePEc:wut:journl:v:1:y:2020:p:119-130:id:1436
    DOI: 10.37190/ord200107

    Download full text from publisher

    File URL: https://ord.pwr.edu.pl/assets/papers_archive/1436%20-%20published.pdf
    Download Restriction: no



