Printed from https://ideas.repec.org/a/gam/jmathe/v10y2022i19p3566-d929588.html

A Wavelet PM2.5 Prediction System Using Optimized Kernel Extreme Learning with Boruta-XGBoost Feature Selection

Author

Listed:
  • Ali Asghar Heidari

    (School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, Tehran 1439957131, Iran)

  • Mehdi Akhoondzadeh

    (Photogrammetry and Remote Sensing Department, School of Surveying and Geospatial Engineering, College of Engineering, University of Tehran, North Amirabad Ave., Tehran 1439957131, Iran)

  • Huiling Chen

    (Department of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou 325035, China)

Abstract

The fine particulate matter (PM2.5) concentration is a vital source of information and an essential indicator for measuring and studying the concentrations of other air pollutants. Because of its social impact and cross-field applications in geospatial engineering, it is crucial to build a high-accuracy PM2.5 prediction model. To further improve the accuracy of PM2.5 prediction, this paper proposes a new wavelet PM2.5 prediction system (the WD-OSMSSA-KELM model) based on an improved variant of the salp swarm algorithm (OSMSSA), the kernel extreme learning machine (KELM), wavelet decomposition, and Boruta-XGBoost (B-XGB) feature selection. First, we applied B-XGB feature selection to identify the best features for predicting hourly PM2.5 concentrations. Then, we applied the wavelet decomposition (WD) algorithm to obtain a multi-scale decomposition and single-branch reconstruction of the PM2.5 concentrations, which mitigates the prediction error produced by time series data. In the next stage, we optimized the parameters of the KELM model for each reconstructed component. The improved SSA is proposed to raise the performance of the basic SSA optimizer and to avoid local stagnation. We propose new operators based on opposition-based learning and simplex-based search to mitigate the core problems of the conventional SSA. In addition, we used a time-varying parameter in place of the main parameter of the SSA. To further strengthen the exploration of the SSA, we propose using random leaders, under a conditional structure, to guide the swarm towards new regions of the feature space. The optimized model was then used to predict the PM2.5 concentrations, and several error metrics were applied to evaluate its performance and accuracy.
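As a rough illustration of the model being tuned, the sketch below implements the commonly cited closed-form KELM with an RBF kernel, where the output weights solve (K + I/C) β = y, together with the usual opposition-based learning rule that mirrors a candidate solution inside its bounds. This is a minimal sketch under stated assumptions, not the authors' implementation: the names `rbf_kernel`, `KELM`, and `opposite_point`, the choice of an RBF kernel, and the parameters `C` and `gamma` are all illustrative. An optimizer such as the improved SSA would search over values like `C` and `gamma`.

```python
import numpy as np


def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    # Squared Euclidean distance between every row of A and every row of B.
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-gamma * np.maximum(sq, 0.0))


class KELM:
    """Kernel extreme learning machine for regression (illustrative sketch).

    C is the regularization coefficient and gamma the RBF kernel width;
    both are hypothetical defaults that an optimizer would tune.
    """

    def __init__(self, C=1.0, gamma=1.0):
        self.C = C
        self.gamma = gamma

    def fit(self, X, y):
        self.X_train = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        K = rbf_kernel(self.X_train, self.X_train, self.gamma)
        n = K.shape[0]
        # Closed-form output weights: (K + I / C) beta = y.
        self.beta = np.linalg.solve(K + np.eye(n) / self.C, y)
        return self

    def predict(self, X):
        K = rbf_kernel(np.asarray(X, dtype=float), self.X_train, self.gamma)
        return K @ self.beta


def opposite_point(x, lower, upper):
    """Opposition-based learning: reflect x within the bounds [lower, upper]."""
    return lower + upper - np.asarray(x, dtype=float)
```

In a swarm optimizer, each candidate would typically be scored by fitting `KELM` with that candidate's parameters and measuring the validation error, with `opposite_point` supplying extra candidates on the other side of the search range.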
The proposed model was evaluated on an hourly database with six air pollutants and six meteorological features collected from the Beijing Municipal Environmental Monitoring Center. The experimental results show that the proposed WD-OSMSSA-KELM model can predict the PM2.5 concentration with superior performance (R: 0.995, RMSE: 11.906, MdAE: 2.424, MAPE: 9.768, KGE: 0.963, R²: 0.990) compared to the WD-CatBoost, WD-LightGBM, WD-XGBoost, and WD-Ridge methods.
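For readers unfamiliar with the reported metrics, the following sketch computes them with NumPy. It assumes the conventional definitions: Pearson correlation for R, MAPE in percent, the original three-term Kling-Gupta efficiency for KGE, and the coefficient of determination for R². The abstract does not state which variants the authors used, so the function name `evaluate` and these definitions are assumptions.

```python
import numpy as np


def evaluate(obs, sim):
    """Return common regression error metrics for observed vs. predicted values."""
    obs = np.asarray(obs, dtype=float)
    sim = np.asarray(sim, dtype=float)
    err = sim - obs

    r = np.corrcoef(obs, sim)[0, 1]      # Pearson correlation coefficient
    alpha = sim.std() / obs.std()        # variability ratio used by KGE
    beta = sim.mean() / obs.mean()       # bias ratio used by KGE

    return {
        "R": r,
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "MdAE": np.median(np.abs(err)),
        # MAPE in percent; assumes no observation is zero.
        "MAPE": 100.0 * np.mean(np.abs(err / obs)),
        "KGE": 1.0 - np.sqrt((r - 1.0) ** 2 + (alpha - 1.0) ** 2 + (beta - 1.0) ** 2),
        "R2": 1.0 - np.sum(err ** 2) / np.sum((obs - obs.mean()) ** 2),
    }
```

Higher is better for R, KGE, and R² (each has a maximum of 1), while lower is better for RMSE, MdAE, and MAPE.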

Suggested Citation

  • Ali Asghar Heidari & Mehdi Akhoondzadeh & Huiling Chen, 2022. "A Wavelet PM2.5 Prediction System Using Optimized Kernel Extreme Learning with Boruta-XGBoost Feature Selection," Mathematics, MDPI, vol. 10(19), pages 1-35, September.
  • Handle: RePEc:gam:jmathe:v:10:y:2022:i:19:p:3566-:d:929588

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/10/19/3566/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/10/19/3566/
    Download Restriction: no

    References listed on IDEAS

    1. Kursa, Miron B. & Rudnicki, Witold R., 2010. "Feature Selection with the Boruta Package," Journal of Statistical Software, Foundation for Open Access Statistics, vol. 36(i11).
    2. Jianzhou Wang & Tong Niu & Rui Wang, 2017. "Research and Application of an Air Quality Early Warning System Based on a Modified Least Squares Support Vector Machine and a Cloud Model," IJERPH, MDPI, vol. 14(3), pages 1-33, March.
    3. Zaher Mundher Yaseen & Hossam Faris & Nadhir Al-Ansari, 2020. "Hybridized Extreme Learning Machine Model with Salp Swarm Algorithm: A Novel Predictive Model for Hydrological Application," Complexity, Hindawi, vol. 2020, pages 1-14, February.
    4. Yaolin Lin & Jiale Zou & Wei Yang & Chun-Qing Li, 2018. "A Review of Recent Advances in Research on PM 2.5 in China," IJERPH, MDPI, vol. 15(3), pages 1-29, March.
    5. Ren, Hao & Li, Jun & Chen, Huiling & Li, ChenYang, 2021. "Adaptive levy-assisted salp swarm algorithm: Analysis and optimization case studies," Mathematics and Computers in Simulation (MATCOM), Elsevier, vol. 181(C), pages 380-409.
    6. Abbassi, Abdelkader & Abbassi, Rabeh & Heidari, Ali Asghar & Oliva, Diego & Chen, Huiling & Habib, Arslan & Jemli, Mohamed & Wang, Mingjing, 2020. "Parameters identification of photovoltaic cell models using enhanced exploratory salp chains-based approach," Energy, Elsevier, vol. 198(C).
    7. Guangyuan Xing & Er-long Zhao & Chengyuan Zhang & Jing Wu & Giancarlo Consolo, 2021. "A Decomposition-Ensemble Approach with Denoising Strategy for PM2.5 Concentration Forecasting," Discrete Dynamics in Nature and Society, Hindawi, vol. 2021, pages 1-13, April.
    8. Pei Du & Jianzhou Wang & Wendong Yang & Tong Niu, 2022. "A novel hybrid fine particulate matter (PM2.5) forecasting and its further application system: Case studies in China," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 41(1), pages 64-85, January.
    9. Fan, Junliang & Ma, Xin & Wu, Lifeng & Zhang, Fucang & Yu, Xiang & Zeng, Wenzhi, 2019. "Light Gradient Boosting Machine: An efficient soft computing model for estimating daily reference evapotranspiration with local and external meteorological data," Agricultural Water Management, Elsevier, vol. 225(C).

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Yamashiro, Hirochika & Nonaka, Hirofumi, 2021. "Estimation of processing time using machine learning and real factory data for optimization of parallel machine scheduling problem," Operations Research Perspectives, Elsevier, vol. 8(C).
    2. Yanzhao Wang & Jianfei Cao, 2023. "Examining the Effects of Socioeconomic Development on Fine Particulate Matter (PM2.5) in China’s Cities Based on Spatial Autocorrelation Analysis and MGWR Model," IJERPH, MDPI, vol. 20(4), pages 1-23, February.
    3. Tong, Jianfeng & Liu, Zhenxing & Zhang, Yong & Zheng, Xiujuan & Jin, Junyang, 2023. "Improved multi-gate mixture-of-experts framework for multi-step prediction of gas load," Energy, Elsevier, vol. 282(C).
    4. Asma Shaheen & Javed Iqbal, 2018. "Spatial Distribution and Mobility Assessment of Carcinogenic Heavy Metals in Soil Profiles Using Geostatistics and Random Forest, Boruta Algorithm," Sustainability, MDPI, vol. 10(3), pages 1-20, March.
    5. Gennadiy Stroykov & Alexey Y. Cherepovitsyn & Elizaveta A. Iamshchikova, 2020. "Powering Multiple Gas Condensate Wells in Russia’s Arctic: Power Supply Systems Based on Renewable Energy Sources," Resources, MDPI, vol. 9(11), pages 1-15, November.
    6. Ramón Ferri-García & María del Mar Rueda, 2022. "Variable selection in Propensity Score Adjustment to mitigate selection bias in online surveys," Statistical Papers, Springer, vol. 63(6), pages 1829-1881, December.
    7. Yvan Devaux & Lu Zhang & Andrew I. Lumley & Kanita Karaduzovic-Hadziabdic & Vincent Mooser & Simon Rousseau & Muhammad Shoaib & Venkata Satagopam & Muhamed Adilovic & Prashant Kumar Srivastava & Costa, 2024. "Development of a long noncoding RNA-based machine learning model to predict COVID-19 in-hospital mortality," Nature Communications, Nature, vol. 15(1), pages 1-12, December.
    8. Ghosh, Indranil & Chaudhuri, Tamal Datta & Alfaro-Cortés, Esteban & Gámez, Matías & García, Noelia, 2022. "A hybrid approach to forecasting futures prices with simultaneous consideration of optimality in ensemble feature selection and advanced artificial intelligence," Technological Forecasting and Social Change, Elsevier, vol. 181(C).
    9. Conor Waldock & Bernhard Wegscheider & Dario Josi & Bárbara Borges Calegari & Jakob Brodersen & Luiz Jardim de Queiroz & Ole Seehausen, 2024. "Deconstructing the geography of human impacts on species’ natural distribution," Nature Communications, Nature, vol. 15(1), pages 1-15, December.
    10. Nan Jia & Yinshuai Li & Ruishan Chen & Hongbo Yang, 2023. "A Review of Global PM 2.5 Exposure Research Trends from 1992 to 2022," Sustainability, MDPI, vol. 15(13), pages 1-15, July.
    11. Ook Lee & Hanseon Joo & Hayoung Choi & Minjong Cheon, 2022. "Proposing an Integrated Approach to Analyzing ESG Data via Machine Learning and Deep Learning Algorithms," Sustainability, MDPI, vol. 14(14), pages 1-14, July.
    12. Manuel J. García Rodríguez & Vicente Rodríguez Montequín & Francisco Ortega Fernández & Joaquín M. Villanueva Balsera, 2019. "Public Procurement Announcements in Spain: Regulations, Data Analysis, and Award Price Estimator Using Machine Learning," Complexity, Hindawi, vol. 2019, pages 1-20, November.
    13. Sangjin Kim & Jong-Min Kim, 2019. "Two-Stage Classification with SIS Using a New Filter Ranking Method in High Throughput Data," Mathematics, MDPI, vol. 7(6), pages 1-16, May.
    14. Arjan S. Gosal & Janine A. McMahon & Katharine M. Bowgen & Catherine H. Hoppe & Guy Ziv, 2021. "Identifying and Mapping Groups of Protected Area Visitors by Environmental Awareness," Land, MDPI, vol. 10(6), pages 1-14, May.
    15. Ahmed Ginidi & Sherif M. Ghoneim & Abdallah Elsayed & Ragab El-Sehiemy & Abdullah Shaheen & Attia El-Fergany, 2021. "Gorilla Troops Optimizer for Electrically Based Single and Double-Diode Models of Solar Photovoltaic Systems," Sustainability, MDPI, vol. 13(16), pages 1-28, August.
    16. Foutzopoulos, Giorgos & Pandis, Nikolaos & Tsagris, Michail, 2024. "Predicting full retirement attainment of NBA players," MPRA Paper 121540, University Library of Munich, Germany.
    17. Zhao-Yue Chen & Hervé Petetin & Raúl Fernando Méndez Turrubiates & Hicham Achebak & Carlos Pérez García-Pando & Joan Ballester, 2024. "Population exposure to multiple air pollutants and its compound episodes in Europe," Nature Communications, Nature, vol. 15(1), pages 1-11, December.
    18. Schrader, Silja & Graham, Sonia & Campbell, Rebecca & Height, Kaitlyn & Hawkes, Gina, 2024. "Grower attitudes and practices toward area-wide management of cropping weeds in Australia," Land Use Policy, Elsevier, vol. 137(C).
    19. Bram Janssens & Matthias Bogaert & Mathijs Maton, 2023. "Predicting the next Pogačar: a data analytical approach to detect young professional cycling talents," Annals of Operations Research, Springer, vol. 325(1), pages 557-588, June.
    20. Esangbedo, Moses Olabhele & Taiwo, Blessing Olamide & Abbas, Hawraa H. & Hosseini, Shahab & Sazid, Mohammed & Fissha, Yewuhalashet, 2024. "Enhancing the exploitation of natural resources for green energy: An application of LSTM-based meta-model for aluminum prices forecasting," Resources Policy, Elsevier, vol. 92(C).
