IDEAS home Printed from https://ideas.repec.org/a/spr/annopr/v328y2023i1d10.1007_s10479-022-04933-8.html

Two new feature selection methods based on learn-heuristic techniques for breast cancer prediction: a comprehensive analysis

Author

Listed:
  • Kamyab Karimi

    (Kharazmi University)

  • Ali Ghodratnama

    (Kharazmi University)

  • Reza Tavakkoli-Moghaddam

    (University of Tehran)

Abstract

In recent decades, breast cancer has become one of the leading causes of mortality among women. The disease is not preventable because its causes are unknown; however, early diagnosis increases patients’ chances of recovery. Machine learning (ML) can be used to improve treatment outcomes in healthcare operations while reducing cost and time. In this research, we propose two novel feature selection (FS) methods, one based on an imperialist competitive algorithm (ICA) and one on a bat algorithm (BA), and combine them with ML algorithms. This study aims to enhance the efficiency of diagnostic models and to present a comprehensive analysis that helps clinical physicians make more precise and reliable decisions. The methods employed include K-nearest neighbors (KNN), support vector machine (SVM), decision tree (DT), Naive Bayes, AdaBoost (AB), linear discriminant analysis (LDA), random forest (RF), logistic regression (LR), and artificial neural network (ANN). The performance of the methods was evaluated using sensitivity, accuracy, precision, mean absolute error, F-score, root mean square error, Kappa, and relative absolute error. This paper applied a distinctive integration of evaluation measures and ML algorithms using the wrapper feature selection based on ICA (WFSIC) and on BA (WFSB) separately. We compared the two proposed approaches in terms of classifier performance. We also compared our best diagnostic model with previous works reported in the literature survey. Experiments were performed on the Wisconsin diagnostic breast cancer (WDBC) dataset. Results reveal that the proposed BA-based framework, with an accuracy of 99.12%, surpasses the ICA-based framework and most previous works. Additionally, the RF classifier combined with BA-based FS emerges as the best model and outperforms the others on the evaluation criteria.
In addition, the results illustrate the role of our techniques in reducing the dataset dimensions by up to 90% and raising the performance of diagnostic models to over 99%. Moreover, the results demonstrate that the optimal feature subsets obtained by the proposed FS approaches contain several critical features that were selected by most ML models: the standard error of area, concavity, smoothness, and perimeter; the worst values of texture, compactness, radius, symmetry, smoothness, and concavity; and the mean of concave points, fractal dimension, compactness, and concavity. These features can remarkably affect the efficiency of breast cancer prediction. This study illustrates the role of our approaches in enhancing treatment outcomes in healthcare operations.
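
The wrapper idea described in the abstract (a metaheuristic proposes feature subsets and a classifier scores them) can be illustrated with a minimal sketch. This is not the authors' WFSB implementation: it is a simplified, bat-algorithm-inspired binary search, a small synthetic dataset stands in for WDBC, a 1-nearest-neighbour classifier with leave-one-out accuracy stands in for the nine classifiers, and every function name and parameter value below is an assumption made for illustration.

```python
import math
import random


def make_data(n=60, n_features=8, n_informative=2, seed=0):
    """Synthetic two-class data (assumption): only the first columns carry signal."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = i % 2
        X.append([rng.gauss(3.0 * label, 1.0) if j < n_informative
                  else rng.gauss(0.0, 3.0) for j in range(n_features)])
        y.append(label)
    return X, y


def loo_accuracy(X, y, mask):
    """Leave-one-out accuracy of a 1-NN classifier restricted to the masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    correct = 0
    for i in range(len(X)):
        best_d, best_label = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in cols)
            if d < best_d:
                best_d, best_label = d, y[k]
        correct += best_label == y[i]
    return correct / len(X)


def fitness(X, y, mask, penalty=0.01):
    """Wrapper objective: accuracy minus a small penalty on the share of kept features."""
    return loo_accuracy(X, y, mask) - penalty * sum(mask) / len(mask)


def binary_bat_select(X, y, n_bats=8, n_iter=15, seed=1,
                      f_min=0.0, f_max=2.0, alpha=0.9, gamma=0.9, pulse0=0.5):
    """Simplified bat-inspired search over 0/1 feature masks (illustrative only)."""
    rng = random.Random(seed)
    d = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(n_bats)]
    vel = [[0.0] * d for _ in range(n_bats)]
    fit = [fitness(X, y, p) for p in pos]
    loud = [1.0] * n_bats        # loudness: probability of accepting a move
    pulse = [pulse0] * n_bats    # pulse rate: probability of skipping local search
    b = max(range(n_bats), key=fit.__getitem__)
    best, best_fit = pos[b][:], fit[b]
    for t in range(1, n_iter + 1):
        for i in range(n_bats):
            freq = f_min + (f_max - f_min) * rng.random()
            cand = pos[i][:]
            for j in range(d):
                # Velocity pulls each bit toward the current best mask; clamp to
                # keep the sigmoid numerically safe.
                vel[i][j] += (best[j] - pos[i][j]) * freq
                vel[i][j] = max(-6.0, min(6.0, vel[i][j]))
                prob_one = 1.0 / (1.0 + math.exp(-vel[i][j]))
                cand[j] = 1 if rng.random() < prob_one else 0
            if rng.random() > pulse[i]:
                # Local search: flip one random bit of the best mask found so far.
                cand = best[:]
                j = rng.randrange(d)
                cand[j] = 1 - cand[j]
            cand_fit = fitness(X, y, cand)
            if cand_fit >= fit[i] and rng.random() < loud[i]:
                pos[i], fit[i] = cand, cand_fit
                loud[i] *= alpha
                pulse[i] = pulse0 * (1.0 - math.exp(-gamma * t))
            if cand_fit > best_fit:
                best, best_fit = cand[:], cand_fit
    return best, best_fit


if __name__ == "__main__":
    X, y = make_data()
    mask, score = binary_bat_select(X, y)
    print("selected features:", [j for j, keep in enumerate(mask) if keep])
    print("penalised leave-one-out accuracy:", round(score, 4))
```

The penalty term in `fitness` is one simple way to reward smaller subsets; a faithful replication would replace the synthetic data and 1-NN scorer with the WDBC dataset and the classifiers named in the abstract, and would follow the parameter settings reported in the full paper.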

Suggested Citation

  • Kamyab Karimi & Ali Ghodratnama & Reza Tavakkoli-Moghaddam, 2023. "Two new feature selection methods based on learn-heuristic techniques for breast cancer prediction: a comprehensive analysis," Annals of Operations Research, Springer, vol. 328(1), pages 665-700, September.
  • Handle: RePEc:spr:annopr:v:328:y:2023:i:1:d:10.1007_s10479-022-04933-8
    DOI: 10.1007/s10479-022-04933-8

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10479-022-04933-8
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.

    File URL: https://libkey.io/10.1007/s10479-022-04933-8?utm_source=ideas
    LibKey link: if access is restricted and if your library uses this service, LibKey will redirect you to where you can use your library subscription to access this item


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Deac Dan Stelian & Schebesch Klaus Bruno, 2018. "Market Forecasts and Client Behavioral Data: Towards Finding Adequate Model Complexity," Studia Universitatis „Vasile Goldis” Arad – Economics Series, Sciendo, vol. 28(3), pages 50-75, September.
    2. Astorino, Annabella & Avolio, Matteo & Fuduli, Antonio, 2022. "A maximum-margin multisphere approach for binary Multiple Instance Learning," European Journal of Operational Research, Elsevier, vol. 299(2), pages 642-652.
    3. Meshwa Rameshbhai Savalia & Jaiprakash Vinodkumar Verma, 2023. "Classifying Malignant and Benign Tumors of Breast Cancer: A Comparative Investigation Using Machine Learning Techniques," International Journal of Reliable and Quality E-Healthcare (IJRQEH), IGI Global, vol. 12(1), pages 1-19, January.
    4. Kazim Topuz & Behrooz Davazdahemami & Dursun Delen, 2024. "A Bayesian belief network-based analytics methodology for early-stage risk detection of novel diseases," Annals of Operations Research, Springer, vol. 341(1), pages 673-697, October.
    5. Onur Demiray & Evrim D. Gunes & Ercan Kulak & Emrah Dogan & Seyma Gorcin Karaketir & Serap Cifcili & Mehmet Akman & Sibel Sakarya, 2023. "Classification of patients with chronic disease by activation level using machine learning methods," Health Care Management Science, Springer, vol. 26(4), pages 626-650, December.
    6. Yikai Liu & Ruozheng Wu & Aimin Yang, 2023. "Research on Medical Problems Based on Mathematical Models," Mathematics, MDPI, vol. 11(13), pages 1-26, June.
    7. Blanquero, R. & Carrizosa, E. & Jiménez-Cordero, A. & Martín-Barragán, B., 2019. "Functional-bandwidth kernel for Support Vector Machine with Functional Data: An alternating optimization algorithm," European Journal of Operational Research, Elsevier, vol. 275(1), pages 195-207.
    8. Fabrizio De Caro & Amedeo Andreotti & Rodolfo Araneo & Massimo Panella & Antonello Rosato & Alfredo Vaccaro & Domenico Villacci, 2020. "A Review of the Enabling Methodologies for Knowledge Discovery from Smart Grids Data," Energies, MDPI, vol. 13(24), pages 1-25, December.
    9. Qian Zhang & Tianhao Li & Dengfeng Li & Wei Lu, 2024. "A goal-oriented reinforcement learning for optimal drug dosage control," Annals of Operations Research, Springer, vol. 338(2), pages 1403-1423, July.
    10. Yan Gu & Yanrong Chen & Lai Wei & Shuang Wu & Kaicheng Shen & Chengxiang Liu & Yan Dong & Yang Zhao & Yue Zhang & Chi Zhang & Wenling Zheng & Jiangyi He & Yunlong Wang & Yifei Li & Xiaoxin Zhao & Hong, 2021. "ABHD5 inhibits YAP-induced c-Met overexpression and colon cancer cell stemness via suppressing YAP methylation," Nature Communications, Nature, vol. 12(1), pages 1-15, December.
    11. Jouan, Gabriel & Arnardottir, Erna Sif & Islind, Anna Sigridur & Óskarsdóttir, María, 2024. "An algorithmic approach to identification of gray areas: Analysis of sleep scoring expert ensemble non agreement areas using a multinomial mixture model," European Journal of Operational Research, Elsevier, vol. 317(2), pages 352-365.
    12. Liang, Xijun & Zhang, Zhipeng & Song, Yunquan & Jian, Ling, 2022. "Kernel-based online regression with canal loss," European Journal of Operational Research, Elsevier, vol. 297(1), pages 268-279.
    13. Tomasz Hachaj & Marek R. Ogiela & Katarzyna Koptyra, 2018. "Human actions recognition from motion capture recordings using signal resampling and pattern recognition methods," Annals of Operations Research, Springer, vol. 265(2), pages 223-239, June.
    14. Talayeh Razzaghi & Ilya Safro & Joseph Ewing & Ehsan Sadrfaridpour & John D. Scott, 2019. "Predictive models for bariatric surgery risks with imbalanced medical datasets," Annals of Operations Research, Springer, vol. 280(1), pages 1-18, September.
    15. Dimitar Haralampiev Popov, 2022. "SME Viability Assessment Methodology: Combining Altman's Z-Score with Big Data," Bulgarian Economic Papers bep-2022-04, Faculty of Economics and Business Administration, Sofia University St Kliment Ohridski - Bulgaria // Center for Economic Theories and Policies at Sofia University St Kliment Ohridski, revised Jun 2022.
    16. Che Xu & Wenjun Chang & Weiyong Liu, 2023. "Data-driven decision model based on local two-stage weighted ensemble learning," Annals of Operations Research, Springer, vol. 325(2), pages 995-1028, June.
    17. Akampurira Paul & Mutebi Joe & Mugisha Brian & Muhaise Hussein & Kyomuhangi Rosette, 2024. "Exploring Dimensionality Reduction Techniques for Improved Breast Cancer Diagnosis," International Journal of Research and Scientific Innovation, International Journal of Research and Scientific Innovation (IJRSI), vol. 11(5), pages 808-824, May.
    18. Li, Yanying & Che, Jinxing & Yang, Youlong, 2018. "Subsampled support vector regression ensemble for short term electric load forecasting," Energy, Elsevier, vol. 164(C), pages 160-170.
    19. Erfan Mehmanchi & Andrés Gómez & Oleg A. Prokopyev, 2021. "Solving a class of feature selection problems via fractional 0–1 programming," Annals of Operations Research, Springer, vol. 303(1), pages 265-295, August.
    20. Chen, Weiyi & Zhang, Limao, 2022. "An automated machine learning approach for earthquake casualty rate and economic loss prediction," Reliability Engineering and System Safety, Elsevier, vol. 225(C).
