
Exploring Dimensionality Reduction Techniques for Improved Breast Cancer Diagnosis

Author

Listed:
  • Akampurira Paul

    (Kampala International University, Uganda)

  • Mutebi Joe

    (Kampala International University, Uganda)

  • Mugisha Brian

    (Kampala International University, Uganda)

  • Muhaise Hussein

    (Kampala International University, Uganda)

  • Kyomuhangi Rosette

    (Kampala International University, Uganda)

Abstract

Breast cancer diagnosis is a crucial area of medical research, in which the complexity of high-dimensional data poses a challenge alongside accurate identification. This study investigates dimensionality reduction techniques as a means of improving the accuracy and interpretability of breast cancer diagnosis. It examines how preprocessing methods affect a high-dimensional dataset, with the aim of finding significant patterns for useful diagnostic models. The dataset contains 569 observations and 30 attributes, and examination reveals a class imbalance (63% benign, 37% malignant). To address multicollinearity, we used Pearson correlation coefficients to detect and eliminate highly correlated features. The data were then rescaled with min-max normalization so that features are weighted consistently, and Principal Component Analysis (PCA) was applied for further dimensionality reduction. Scree plots and biplots visualize the data and highlight how well the early principal components separate benign from malignant instances. Our findings show a 24% decrease in data dimensionality, which supports the effectiveness of the procedure. This work highlights the role that dimensionality reduction plays in building more precise, efficient, and understandable models for breast cancer diagnosis, and it calls for further investigation of the specific findings.
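
The preprocessing chain named in the abstract (Pearson correlation filtering, min-max normalization, then PCA) can be illustrated with a minimal sketch. This is not the authors' code. The function names, the 0.9 correlation threshold, the choice of two components, and the synthetic data are all assumptions made for illustration, since the abstract does not state them. PCA is computed here through a singular value decomposition in NumPy, which is one common way to do it.

```python
import numpy as np


def drop_correlated(X, threshold=0.9):
    """Return indices of columns kept after dropping one column from
    each pair whose absolute Pearson correlation exceeds `threshold`.
    The threshold value is an assumption, not taken from the paper."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    keep = []
    for j in range(X.shape[1]):
        # Keep column j only if it is not highly correlated
        # with any column that has already been kept.
        if all(corr[j, k] <= threshold for k in keep):
            keep.append(j)
    return keep


def min_max_scale(X):
    """Rescale every column to the range [0, 1]."""
    lo = X.min(axis=0)
    span = X.max(axis=0) - lo
    span[span == 0] = 1.0  # avoid division by zero for constant columns
    return (X - lo) / span


def pca(X, n_components):
    """Project X onto its first principal components via SVD.
    Returns the scores and the explained-variance ratio of every
    component (the quantity a scree plot displays)."""
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    scores = Xc @ Vt[:n_components].T
    return scores, explained


# Synthetic stand-in data (hypothetical): 569 rows, four independent
# columns plus a fifth column that nearly duplicates column 0.
rng = np.random.default_rng(0)
base = rng.normal(size=(569, 4))
near_copy = 2.0 * base[:, [0]] + rng.normal(scale=0.01, size=(569, 1))
X = np.hstack([base, near_copy])

kept = drop_correlated(X, threshold=0.9)
X_scaled = min_max_scale(X[:, kept])
scores, explained = pca(X_scaled, n_components=2)

print("columns kept:", kept)
print("explained-variance ratios:", explained)
```

Plotting `explained` against the component index would give a scree plot, and a scatter of the two columns of `scores`, colored by class label, would correspond to the score part of a biplot.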

Suggested Citation

  • Akampurira Paul & Mutebi Joe & Mugisha Brian & Muhaise Hussein & Kyomuhangi Rosette, 2024. "Exploring Dimensionality Reduction Techniques for Improved Breast Cancer Diagnosis," International Journal of Research and Scientific Innovation, International Journal of Research and Scientific Innovation (IJRSI), vol. 11(5), pages 808-824, May.
  • Handle: RePEc:bjc:journl:v:11:y:2024:i:5:p:808-824

    Download full text from publisher

    File URL: https://www.rsisinternational.org/journals/ijrsi/digital-library/volume-11-issue-5/808-824.pdf
    Download Restriction: no

    File URL: https://rsisinternational.org/journals/ijrsi/articles/exploring-dimensionality-reduction-techniques-for-improved-breast-cancer-diagnosis/
    Download Restriction: no


