IDEAS home Printed from https://ideas.repec.org/a/gam/jijerp/v19y2022i14p8445-d859855.html

How Is the Lung Cancer Incidence Rate Associated with Environmental Risks? Machine-Learning-Based Modeling and Benchmarking

Author

Listed:
  • Kung-Min Wang

    (Department of Industrial Management, National Taiwan University of Science and Technology, Taipei 106, Taiwan)

  • Kun-Huang Chen

    (College of Management and Design, Ming-Chi University of Technology, Taipei 243, Taiwan)

  • Chrestella Ayu Hernanda

    (Department of Industrial Management, National Taiwan University of Science and Technology, Taipei 106, Taiwan)

  • Shih-Hsien Tseng

    (Department of Industrial Management, National Taiwan University of Science and Technology, Taipei 106, Taiwan)

  • Kung-Jeng Wang

    (Department of Industrial Management, National Taiwan University of Science and Technology, Taipei 106, Taiwan)

Abstract

Lung cancer has become a critical public health threat. Much research has been devoted to its clinical study, but only a few studies have addressed the issue from a holistic perspective that includes social, economic, and environmental dimensions. Therefore, in this study, risk factors (features) such as air pollution, tobacco use, socioeconomic status, employment status, marital status, and environment were comprehensively considered when constructing a predictive model. These risk factors were analyzed and selected using stepwise regression and the variance inflation factor to eliminate the possibility of multicollinearity. To build efficient and informative prediction models of lung cancer incidence rates, several machine learning algorithms were adopted with cross-validation, namely linear regression, support vector regression, random forest, K-nearest neighbor, and cubist model tree. A case study in Taiwan showed that the cubist model tree with feature selection was the best model, with an RMSE of 3.310 and an R-squared of 0.960. Through these predictive models, we also found that, apart from smoking, the average NO2 concentration, employment percentage, and number of factories were important factors with significant impacts on the incidence of lung cancer. In addition, the random forest model, both with and without feature selection, could support the interpretation of the variables that contributed most. The predictive model proposed in the present study can help to precisely analyze and estimate lung cancer incidence rates so that effective preventative measures can be developed. Furthermore, the risk factors involved in the predictive model can help with the future analysis of lung cancer incidence rates from a holistic perspective.
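
As a rough illustration of the benchmarking idea described in the abstract (cross-validated models scored by RMSE and R-squared), the sketch below evaluates a small K-nearest-neighbor regressor with k-fold cross-validation in plain Python. It is not the authors' code: the function names, the fold scheme, the choice of k, and the synthetic data are all assumptions made for illustration, and the printed scores have no relation to the values reported in the study.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def knn_predict(train_X, train_y, x, k=3):
    """Predict by averaging the targets of the k nearest training rows."""
    nearest = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], x))[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


def cross_validate(X, y, n_folds=5, k=3, seed=0):
    """Shuffle once, split into n_folds, and pool the out-of-fold predictions."""
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    y_true, y_pred = [], []
    for fold in range(n_folds):
        test_idx = order[fold::n_folds]  # every n_folds-th shuffled row
        held_out = set(test_idx)
        train_idx = [i for i in order if i not in held_out]
        train_X = [X[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        for i in test_idx:
            y_true.append(y[i])
            y_pred.append(knn_predict(train_X, train_y, X[i], k))
    return rmse(y_true, y_pred), r_squared(y_true, y_pred)


# Synthetic stand-in data (hypothetical): two features in [0, 1] and a noisy
# linear response. These are NOT the risk factors or rates used in the study.
rng = random.Random(42)
X = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(100)]
y = [30 + 20 * a + 10 * b + rng.gauss(0, 1) for a, b in X]

cv_rmse, cv_r2 = cross_validate(X, y)
print(f"CV RMSE: {cv_rmse:.3f}, CV R-squared: {cv_r2:.3f}")
```

Pooling the out-of-fold predictions before scoring gives one RMSE and one R-squared per model, which makes different algorithms directly comparable on the same folds; averaging per-fold scores instead would be an equally reasonable choice.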

Suggested Citation

  • Kung-Min Wang & Kun-Huang Chen & Chrestella Ayu Hernanda & Shih-Hsien Tseng & Kung-Jeng Wang, 2022. "How Is the Lung Cancer Incidence Rate Associated with Environmental Risks? Machine-Learning-Based Modeling and Benchmarking," IJERPH, MDPI, vol. 19(14), pages 1-19, July.
  • Handle: RePEc:gam:jijerp:v:19:y:2022:i:14:p:8445-:d:859855

    Download full text from publisher

    File URL: https://www.mdpi.com/1660-4601/19/14/8445/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1660-4601/19/14/8445/
    Download Restriction: no


    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Vanessa Santos-Sánchez & Juan Antonio Córdoba-Doña & Javier García-Pérez & Antonio Escolar-Pujolar & Lucia Pozzi & Rebeca Ramis, 2020. "Cancer Mortality and Deprivation in the Proximity of Polluting Industrial Facilities in an Industrial Region of Spain," IJERPH, MDPI, vol. 17(6), pages 1-15, March.
    2. Yue Wang & Yi Huang & Chen Li, 2023. "The Effects of Air Pollutants on Mortality in the Elderly at Different Ages: A Case of the Prefecture with Most Serious Aging in China," Sustainability, MDPI, vol. 15(22), pages 1-14, November.
    3. David Rojas-Rueda & Emily Morales-Zamora & Wael Abdullah Alsufyani & Christopher H. Herbst & Salem M. AlBalawi & Reem Alsukait & Mashael Alomran, 2021. "Environmental Risk Factors and Health: An Umbrella Review of Meta-Analyses," IJERPH, MDPI, vol. 18(2), pages 1-38, January.
    4. Miyoun Shin & Ok-Jin Kim & Seongwoo Yang & Seung-Ah Choe & Sun-Young Kim, 2022. "Different Mortality Risks of Long-Term Exposure to Particulate Matter across Different Cancer Sites," IJERPH, MDPI, vol. 19(6), pages 1-13, March.
    5. Jing Sui & Hui Xia & Qun Zhao & Guiju Sun & Yinyin Cai, 2022. "Long-Term Exposure to Fine Particulate Matter and the Risk of Chronic Liver Diseases: A Meta-Analysis of Observational Studies," IJERPH, MDPI, vol. 19(16), pages 1-13, August.
    6. Jan Gawełko & Marek Cierpiał-Wolan & Second Bwanakare & Michalina Czarnota, 2022. "Association between Air Pollution and Squamous Cell Lung Cancer in South-Eastern Poland," IJERPH, MDPI, vol. 19(18), pages 1-14, September.
    7. Shih-Chiang Hung & Hsiao-Yuan Cheng & Chen-Cheng Yang & Chia-I Lin & Chi-Kung Ho & Wen-Huei Lee & Fu-Jen Cheng & Chao-Jui Li & Hung-Yi Chuang, 2021. "The Association of White Blood Cells and Air Pollutants—A Population-Based Study," IJERPH, MDPI, vol. 18(5), pages 1-13, March.
    8. Xue Ni & Ning Xu & Qiang Wang, 2018. "Meta-Analysis and Systematic Review in Environmental Tobacco Smoke Risk of Female Lung Cancer by Research Type," IJERPH, MDPI, vol. 15(7), pages 1-19, June.
    9. Manuela Chiavarini & Patrizia Rosignoli & Beatrice Sorbara & Irene Giacchetta & Roberto Fabiani, 2024. "Benzene Exposure and Lung Cancer Risk: A Systematic Review and Meta-Analysis of Human Studies," IJERPH, MDPI, vol. 21(2), pages 1-19, February.
    10. Chien-Lung Chan & Chi-Chang Chang, 2022. "Big Data, Decision Models, and Public Health," IJERPH, MDPI, vol. 19(14), pages 1-9, July.
    11. Roberto Cazzolla Gatti, 2021. "Why We Will Continue to Lose Our Battle with Cancers If We Do Not Stop Their Triggers from Environmental Pollution," IJERPH, MDPI, vol. 18(11), pages 1-19, June.
    12. Ling Pan & Jing Sui & Ying Xu & Qun Zhao & Yinyin Cai & Guiju Sun & Hui Xia, 2023. "Effect of Fine Particulate Matter Exposure on Liver Enzymes: A Systematic Review and Meta-Analysis," IJERPH, MDPI, vol. 20(4), pages 1-11, February.
    13. Noémie Letellier & Sam E. Wing & Jiue-An Yang & Stacy W. Gray & Tarik Benmarhnia & Loretta Erhunmwunsee & Marta M. Jankowska, 2022. "The Role of Neighborhood Air Pollution Exposure on Somatic Non-Small Cell Lung Cancer Mutations in the Los Angeles Basin (2013–2018)," IJERPH, MDPI, vol. 19(17), pages 1-10, September.
    14. Bingkui Qiu & Min Zhou & Yang Qiu & Yuxiang Ma & Chaonan Ma & Jiating Tu & Siqi Li, 2021. "An Integration Method for Regional PM2.5 Pollution Control Optimization Based on Meta-Analysis and Systematic Review," IJERPH, MDPI, vol. 19(1), pages 1-22, December.
    15. Omolola Okunromade & Jingjing Yin & Clara Ray & Atin Adhikari, 2022. "Air Quality and Cancer Prevalence Trends across the Sub-Saharan African Regions during 2005–2020," IJERPH, MDPI, vol. 19(18), pages 1-18, September.
    16. Grace Lordan, 2011. "Older but Not Wiser- Smokers and Passive Smoking Belief," Discussion Papers Series 431, School of Economics, University of Queensland, Australia.
    17. Katarzyna Milcarz & Leokadia Bak-Romaniszyn & Dorota Kaleta, 2017. "Environmental Tobacco Smoke Exposure and Smoke-Free Rules in Homes among Socially-Disadvantaged Populations in Poland," IJERPH, MDPI, vol. 14(4), pages 1-17, April.
