
A discrete particle swarm optimization method for feature selection in binary classification problems

Author

Listed:
  • Unler, Alper
  • Murat, Alper

Abstract

This paper investigates the feature subset selection problem for binary classification using a logistic regression model. We develop a modified discrete particle swarm optimization (PSO) algorithm for the feature subset selection problem. This approach embodies an adaptive feature selection procedure that dynamically accounts for the relevance and dependence of the features included in the feature subset. We compare the proposed methodology with tabu search and scatter search algorithms using publicly available datasets. The results show that the proposed discrete PSO algorithm is competitive in terms of both classification accuracy and computational performance.
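
As a rough illustration of the general idea only, the sketch below runs a plain sigmoid-based binary PSO over 0/1 feature masks. It is not the authors' modified algorithm and does not include their adaptive selection procedure. The function names, parameter values, and the toy fitness function are all assumptions made for this example; in the setting of the abstract, the fitness would instead come from a logistic regression model fitted on the selected features.

```python
import math
import random


def binary_pso(fitness, n_features, n_particles=20, n_iters=50,
               w=0.7, c1=1.5, c2=1.5, v_max=4.0, seed=0):
    """Maximise `fitness` over 0/1 feature masks with a basic binary PSO.

    All parameter defaults are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    # Each position is a 0/1 mask: 1 means the feature is in the subset.
    pos = [[rng.randint(0, 1) for _ in range(n_features)]
           for _ in range(n_particles)]
    vel = [[0.0] * n_features for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(n_features):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp the velocity so the sigmoid does not saturate.
                v = max(-v_max, min(v_max, v))
                vel[i][d] = v
                # The sigmoid maps velocity to the probability of selecting d.
                prob = 1.0 / (1.0 + math.exp(-v))
                pos[i][d] = 1 if rng.random() < prob else 0
            fit = fitness(pos[i])
            if fit > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], fit
                if fit > gbest_fit:
                    gbest, gbest_fit = pos[i][:], fit
    return gbest, gbest_fit


# Toy stand-in for a classifier score (hypothetical): reward features
# 0, 2 and 5 and charge a small cost for every selected feature.
RELEVANT = {0, 2, 5}


def toy_fitness(mask):
    hits = sum(1 for d in RELEVANT if mask[d])
    return hits - 0.1 * sum(mask)


best_mask, best_score = binary_pso(toy_fitness, n_features=8)
print(best_mask, round(best_score, 2))
```

The fitness is passed in as a callable so that a cross-validated classifier score could replace the toy function without touching the search loop.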

Suggested Citation

  • Unler, Alper & Murat, Alper, 2010. "A discrete particle swarm optimization method for feature selection in binary classification problems," European Journal of Operational Research, Elsevier, vol. 206(3), pages 528-539, November.
  • Handle: RePEc:eee:ejores:v:206:y:2010:i:3:p:528-539

    Download full text from publisher

    File URL: http://www.sciencedirect.com/science/article/pii/S0377-2217(10)00163-3
    Download Restriction: Full text for ScienceDirect subscribers only


    References listed on IDEAS

    1. Sigurdur Ólafsson & Jaekyung Yang, 2005. "Intelligent Partitioning for Feature Selection," INFORMS Journal on Computing, INFORMS, vol. 17(3), pages 339-355, August.
    2. V. Robles & C. Bielza & P. Larrañaga & S. González & L. Ohno-Machado, 2008. "Optimizing logistic regression coefficients for discrimination and calibration using estimation of distribution algorithms," TOP: An Official Journal of the Spanish Society of Statistics and Operations Research, Springer;Sociedad de Estadística e Investigación Operativa, vol. 16(2), pages 345-366, December.
    3. Olafsson, Sigurdur & Li, Xiaonan & Wu, Shuning, 2008. "Operations research and data mining," European Journal of Operational Research, Elsevier, vol. 187(3), pages 1429-1448, June.
    4. Sikora, Riyaz & Piramuthu, Selwyn, 2007. "Framework for efficient feature selection in genetic algorithm based data mining," European Journal of Operational Research, Elsevier, vol. 180(2), pages 723-737, July.
    5. Meiri, Ronen & Zahavi, Jacob, 2006. "Using simulated annealing to optimize the feature selection problem in marketing applications," European Journal of Operational Research, Elsevier, vol. 171(3), pages 842-858, June.
    6. Winker, Peter & Gilli, Manfred, 2004. "Applications of optimization heuristics to estimation and modelling problems," Computational Statistics & Data Analysis, Elsevier, vol. 47(2), pages 211-223, September.
    7. Pacheco, Joaquín & Casado, Silvia & Núñez, Laura, 2009. "A variable selection method based on Tabu search for logistic regression models," European Journal of Operational Research, Elsevier, vol. 199(2), pages 506-511, December.
    8. Pacheco, Joaquin & Casado, Silvia & Nunez, Laura & Gomez, Olga, 2006. "Analysis of new variable selection methods for discriminant analysis," Computational Statistics & Data Analysis, Elsevier, vol. 51(3), pages 1463-1478, December.
    9. P. S. Bradley & O. L. Mangasarian & W. N. Street, 1998. "Feature Selection via Mathematical Programming," INFORMS Journal on Computing, INFORMS, vol. 10(2), pages 209-217, May.

    Citations



    Cited by:

    1. Zouache, Djaafar & Moussaoui, Abdelouahab & Ben Abdelaziz, Fouad, 2018. "A cooperative swarm intelligence algorithm for multi-objective discrete optimization with application to the knapsack problem," European Journal of Operational Research, Elsevier, vol. 264(1), pages 74-88.
    2. Chen Zhang & Zhiwei Ni & Liping Ni & Na Tang, 2016. "Feature selection method based on multi-fractal dimension and harmony search algorithm and its application," International Journal of Systems Science, Taylor & Francis Journals, vol. 47(14), pages 3476-3486, October.
    3. Lee, In Gyu & Yoon, Sang Won & Won, Daehan, 2022. "A Mixed Integer Linear Programming Support Vector Machine for Cost-Effective Group Feature Selection: Branch-Cut-and-Price Approach," European Journal of Operational Research, Elsevier, vol. 299(3), pages 1055-1068.
    4. Yousif A. Alhaj & Abdelghani Dahou & Mohammed A. A. Al-qaness & Laith Abualigah & Aaqif Afzaal Abbasi & Nasser Ahmed Obad Almaweri & Mohamed Abd Elaziz & Robertas Damaševičius, 2022. "A Novel Text Classification Technique Using Improved Particle Swarm Optimization: A Case Study of Arabic Language," Future Internet, MDPI, vol. 14(7), pages 1-18, June.
    5. Wang, Xin & Liu, Xiaodong & Pedrycz, Witold & Zhu, Xiaolei & Hu, Guangfei, 2012. "Mining axiomatic fuzzy set association rules for classification problems," European Journal of Operational Research, Elsevier, vol. 218(1), pages 202-210.
    6. Alireza Pourdaryaei & Mohammad Mohammadi & Mazaher Karimi & Hazlie Mokhlis & Hazlee A. Illias & Seyed Hamidreza Aghay Kaboli & Shameem Ahmad, 2021. "Recent Development in Electricity Price Forecasting Based on Computational Intelligence Techniques in Deregulated Power Market," Energies, MDPI, vol. 14(19), pages 1, September.
    7. Bertolazzi, P. & Felici, G. & Festa, P. & Fiscon, G. & Weitschek, E., 2016. "Integer programming models for feature selection: New extensions and a randomized solution algorithm," European Journal of Operational Research, Elsevier, vol. 250(2), pages 389-399.
    8. Yi, Tao & Cheng, Xiaobin & Peng, Peng, 2022. "Two-stage optimal allocation of charging stations based on spatiotemporal complementarity and demand response: A framework based on MCS and DBPSO," Energy, Elsevier, vol. 239(PC).
    9. Moraes, Marcelo Botelho da Costa & Nagano, Marcelo Seido, 2014. "Evolutionary models in cash management policies with multiple assets," Economic Modelling, Elsevier, vol. 39(C), pages 1-7.
    10. Li, An-Da & He, Zhen & Wang, Qing & Zhang, Yang, 2019. "Key quality characteristics selection for imbalanced production data using a two-phase bi-objective feature selection method," European Journal of Operational Research, Elsevier, vol. 274(3), pages 978-989.
    11. Panagopoulos, Orestis P. & Pappu, Vijay & Xanthopoulos, Petros & Pardalos, Panos M., 2016. "Constrained subspace classifier for high dimensional datasets," Omega, Elsevier, vol. 59(PA), pages 40-46.
    12. Huang, Yuming & Ge, Bingfeng & Hipel, Keith W. & Fang, Liping & Zhao, Bin & Yang, Kewei, 2023. "Solving the inverse graph model for conflict resolution using a hybrid metaheuristic algorithm," European Journal of Operational Research, Elsevier, vol. 305(2), pages 806-819.
    13. Yu, Shiwei & Wei, Yi-Ming & Fan, Jingli & Zhang, Xian & Wang, Ke, 2012. "Exploring the regional characteristics of inter-provincial CO2 emissions in China: An improved fuzzy clustering analysis based on particle swarm optimization," Applied Energy, Elsevier, vol. 92(C), pages 552-562.
    14. Wang, Lizhi & Nikouei Mehr, Maryam, 2019. "An optimization approach to epistasis detection," European Journal of Operational Research, Elsevier, vol. 274(3), pages 1069-1076.
    15. Lin Xu & Maoliang Ling & Yujie Lu & Meng Shen, 2017. "Understanding Household Waste Separation Behaviour: Testing the Roles of Moral, Past Experience, and Perceived Policy Effectiveness within the Theory of Planned Behaviour," Sustainability, MDPI, vol. 9(4), pages 1-27, April.
    16. Toshiki Sato & Yuichi Takano & Ryuhei Miyashiro & Akiko Yoshise, 2016. "Feature subset selection for logistic regression via mixed integer optimization," Computational Optimization and Applications, Springer, vol. 64(3), pages 865-880, July.
    17. Bin, Wei & Qinke, Peng & Jing, Zhao & Xiao, Chen, 2012. "A binary particle swarm optimization algorithm inspired by multi-level organizational learning behavior," European Journal of Operational Research, Elsevier, vol. 219(2), pages 224-233.
    18. Wen, Hanguan & Liu, Xiufeng & Yang, Ming & Lei, Bo & Xu, Cheng & Chen, Zhe, 2024. "A novel approach for identifying customer groups for personalized demand-side management services using household socio-demographic data," Energy, Elsevier, vol. 286(C).
    19. Pendharkar, Parag C. & Troutt, Marvin D., 2011. "DEA based dimensionality reduction for classification problems satisfying strict non-satiety assumption," European Journal of Operational Research, Elsevier, vol. 212(1), pages 155-163, July.
    20. Mohammad Mahdi Mousavi & Jamal Ouenniche & Kaoru Tone, 2023. "A dynamic performance evaluation of distress prediction models," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 42(4), pages 756-784, July.
    21. Lin, Qiuzhen & Li, Jianqiang & Du, Zhihua & Chen, Jianyong & Ming, Zhong, 2015. "A novel multi-objective particle swarm optimization with multiple search strategies," European Journal of Operational Research, Elsevier, vol. 247(3), pages 732-744.
    22. Fouskakis, D., 2012. "Bayesian variable selection in generalized linear models using a combination of stochastic optimization methods," European Journal of Operational Research, Elsevier, vol. 220(2), pages 414-422.
    23. Aytug, Haldun, 2015. "Feature selection for support vector machines using Generalized Benders Decomposition," European Journal of Operational Research, Elsevier, vol. 244(1), pages 210-218.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Casado Yusta, Silvia & Núñez Letamendía, Laura & Pacheco Bonrostro, Joaquín Antonio, 2018. "Predicting Corporate Failure: The GRASP-LOGIT Model || Predicción de la quiebra empresarial: el modelo GRASP-LOGIT," Revista de Métodos Cuantitativos para la Economía y la Empresa = Journal of Quantitative Methods for Economics and Business Administration, Universidad Pablo de Olavide, Department of Quantitative Methods for Economics and Business Administration, vol. 26(1), pages 294-314, Diciembre.
    2. Meisel, Stephan & Mattfeld, Dirk, 2010. "Synergies of Operations Research and Data Mining," European Journal of Operational Research, Elsevier, vol. 206(1), pages 1-10, October.
    3. Fouskakis, D., 2012. "Bayesian variable selection in generalized linear models using a combination of stochastic optimization methods," European Journal of Operational Research, Elsevier, vol. 220(2), pages 414-422.
    4. Pacheco, Joaquín & Casado, Silvia & Núñez, Laura, 2009. "A variable selection method based on Tabu search for logistic regression models," European Journal of Operational Research, Elsevier, vol. 199(2), pages 506-511, December.
    5. Brusco, Michael J., 2014. "A comparison of simulated annealing algorithms for variable selection in principal component analysis and discriminant analysis," Computational Statistics & Data Analysis, Elsevier, vol. 77(C), pages 38-53.
    6. Pacheco, Joaquín & Casado, Silvia & Porras, Santiago, 2013. "Exact methods for variable selection in principal component analysis: Guide functions and pre-selection," Computational Statistics & Data Analysis, Elsevier, vol. 57(1), pages 95-111.
    7. R Fildes & K Nikolopoulos & S F Crone & A A Syntetos, 2008. "Forecasting and operational research: a review," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 59(9), pages 1150-1172, September.
    8. Anzanello, Michel J. & Albin, Susan L. & Chaovalitwongse, Wanpracha A., 2012. "Multicriteria variable selection for classification of production batches," European Journal of Operational Research, Elsevier, vol. 218(1), pages 97-105.
    9. Ding‐Wen Tan & William Yeoh & Yee Ling Boo & Soung‐Yue Liew, 2013. "The Impact Of Feature Selection: A Data‐Mining Application In Direct Marketing," Intelligent Systems in Accounting, Finance and Management, John Wiley & Sons, Ltd., vol. 20(1), pages 23-38, January.
    10. J Yang & S Ólafsson, 2009. "Near-optimal feature selection for large databases," Journal of the Operational Research Society, Palgrave Macmillan;The OR Society, vol. 60(8), pages 1045-1055, August.
    11. Brusco, Michael J. & Steinley, Douglas, 2011. "Exact and approximate algorithms for variable selection in linear discriminant analysis," Computational Statistics & Data Analysis, Elsevier, vol. 55(1), pages 123-131, January.
    12. García-Alonso, Carlos R. & Torres-Jiménez, Mercedes & Hervás-Martínez, César, 2010. "Income prediction in the agrarian sector using product unit neural networks," European Journal of Operational Research, Elsevier, vol. 204(2), pages 355-365, July.
    13. Huaijun Wang & Ruomeng Ke & Junhuai Li & Yang An & Kan Wang & Lei Yu, 2018. "A correlation-based binary particle swarm optimization method for feature selection in human activity recognition," International Journal of Distributed Sensor Networks, vol. 14(4), pages 15501477187, April.
    14. Paz, Alexander & Arteaga, Cristian & Cobos, Carlos, 2019. "Specification of mixed logit models assisted by an optimization framework," Journal of choice modelling, Elsevier, vol. 30(C), pages 50-60.
    15. Ye, Ya-Fen & Shao, Yuan-Hai & Deng, Nai-Yang & Li, Chun-Na & Hua, Xiang-Yu, 2017. "Robust Lp-norm least squares support vector regression with feature selection," Applied Mathematics and Computation, Elsevier, vol. 305(C), pages 32-52.
    16. Blueschke-Nikolaeva, V. & Blueschke, D. & Neck, R., 2012. "Optimal control of nonlinear dynamic econometric models: An algorithm and an application," Computational Statistics & Data Analysis, Elsevier, vol. 56(11), pages 3230-3240.
    17. Michael Fop & Pierre-Alexandre Mattei & Charles Bouveyron & Thomas Brendan Murphy, 2022. "Unobserved classes and extra variables in high-dimensional discriminant analysis," Advances in Data Analysis and Classification, Springer;German Classification Society - Gesellschaft für Klassifikation (GfKl);Japanese Classification Society (JCS);Classification and Data Analysis Group of the Italian Statistical Society (CLADAG);International Federation of Classification Societies (IFCS), vol. 16(1), pages 55-92, March.
    18. Schlereth, Christian & Stepanchuk, Tanja & Skiera, Bernd, 2010. "Optimization and analysis of the profitability of tariff structures with two-part tariffs," European Journal of Operational Research, Elsevier, vol. 206(3), pages 691-701, November.
    19. Asunur Cezar & Srinivasan Raghunathan & Sumit Sarkar, 2020. "Adversarial Classification: Impact of Agents’ Faking Cost on Firms and Agents," Production and Operations Management, Production and Operations Management Society, vol. 29(12), pages 2789-2807, December.
    20. Maurizio Boccia & Antonio Sforza & Claudio Sterle, 2020. "Simple Pattern Minimality Problems: Integer Linear Programming Formulations and Covering-Based Heuristic Solving Approaches," INFORMS Journal on Computing, INFORMS, vol. 32(4), pages 1049-1060, October.
