IDEAS home Printed from https://ideas.repec.org/a/gam/jsusta/v13y2021i21p12071-d669956.html

PM2.5 Concentration Prediction Based on Spatiotemporal Feature Selection Using XGBoost-MSCNN-GA-LSTM

Author

Listed:
  • Hongbin Dai

    (School of Management, Xi’an University of Architecture and Technology, Xi’an 710055, China)

  • Guangqiu Huang

    (School of Management, Xi’an University of Architecture and Technology, Xi’an 710055, China)

  • Huibin Zeng

    (School of Management, Xi’an University of Architecture and Technology, Xi’an 710055, China)

  • Fan Yang

    (School of Management, Xi’an University of Architecture and Technology, Xi’an 710055, China)

Abstract

With the rapid development of China’s industrialization, air pollution is becoming increasingly serious. Predicting air quality is essential for identifying further preventive measures to avoid negative impacts. Existing methods for predicting atmospheric pollutant concentrations ignore feature redundancy and spatio-temporal characteristics; their accuracy is limited and they transfer poorly to other settings. Therefore, extreme gradient boosting (XGBoost) is first applied to select features for PM2.5. A one-dimensional multi-scale convolutional network (MSCNN) is then used to extract local temporal and spatial feature relations from the air quality data, and these are linearly concatenated and fused to obtain the spatio-temporal relationships among the features. Finally, XGBoost and MSCNN are combined with a long short-term memory (LSTM) network to exploit its strength in handling time series, and a genetic algorithm (GA) is applied to optimize the LSTM network's parameter set. The spatio-temporal feature relationships are input into the LSTM network, which outputs the long-term dependencies of the selected features to predict PM2.5 concentration. In this way, an XGBoost-MSCGL model for PM2.5 concentration prediction based on spatio-temporal feature selection is established. The data set consists of hourly concentration data for six atmospheric pollutants and meteorological data from the Fen-Wei Plain in 2020. To verify the effectiveness of the model, XGBoost-MSCGL is compared with benchmark models, namely multilayer perceptron (MLP), CNN, LSTM, XGBoost, and CNN-LSTM, both before and after XGBoost feature selection. According to the forecast results for 12 cities, compared with the single models, the root mean square error (RMSE) decreased by about 39.07%, the average MAE decreased by about 42.18%, the average MAE decreased by about 49.33%, and R² increased by 23.7%. Compared with the models after feature selection, the RMSE decreased by about 15% on average, the MAPE decreased by 16%, the MAE decreased by 21%, and R² increased by 2.6%. The experimental results show that the XGBoost-MSCGL prediction model captures the data more comprehensively and at greater depth, achieves higher prediction accuracy, and generalizes better in the prediction of PM2.5 concentration.
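
The abstract compares models by RMSE, MAE, MAPE and R², and reports improvements as percentage decreases. A minimal sketch of how such figures are typically computed is given below. It uses the standard textbook definitions of the four metrics, which the paper presumably follows, although the abstract does not state them. The function names and the sample values are hypothetical and are not taken from the paper's data.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent. Assumes no observed value is zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def percent_decrease(baseline, improved):
    """Relative reduction of an error metric versus a baseline, in percent."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical hourly PM2.5 readings and two sets of predictions (illustrative only).
observed = [50.0, 80.0, 100.0, 40.0]
baseline_pred = [60.0, 70.0, 120.0, 30.0]   # stands in for a single benchmark model
hybrid_pred = [55.0, 76.0, 110.0, 36.0]     # stands in for a hybrid model

for name, metric in [("RMSE", rmse), ("MAE", mae), ("MAPE", mape)]:
    base = metric(observed, baseline_pred)
    hyb = metric(observed, hybrid_pred)
    print(f"{name}: baseline={base:.3f} hybrid={hyb:.3f} "
          f"decrease={percent_decrease(base, hyb):.2f}%")

print(f"R2: baseline={r2(observed, baseline_pred):.3f} hybrid={r2(observed, hybrid_pred):.3f}")
```

A percentage decrease for an error metric and a percentage increase for R² are not directly comparable, because R² is bounded above by 1 while the error metrics are not bounded.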

Suggested Citation

  • Hongbin Dai & Guangqiu Huang & Huibin Zeng & Fan Yang, 2021. "PM 2.5 Concentration Prediction Based on Spatiotemporal Feature Selection Using XGBoost-MSCNN-GA-LSTM," Sustainability, MDPI, vol. 13(21), pages 1-24, November.
  • Handle: RePEc:gam:jsusta:v:13:y:2021:i:21:p:12071-:d:669956

    Download full text from publisher

    File URL: https://www.mdpi.com/2071-1050/13/21/12071/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2071-1050/13/21/12071/
    Download Restriction: no


    Citations

    Citations are extracted by the CitEc Project.


    Cited by:

    1. Junfeng Kang & Xinyi Zou & Jianlin Tan & Jun Li & Hamed Karimian, 2023. "Short-Term PM 2.5 Concentration Changes Prediction: A Comparison of Meteorological and Historical Data," Sustainability, MDPI, vol. 15(14), pages 1-24, July.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Junfeng Kang & Xinyi Zou & Jianlin Tan & Jun Li & Hamed Karimian, 2023. "Short-Term PM 2.5 Concentration Changes Prediction: A Comparison of Meteorological and Historical Data," Sustainability, MDPI, vol. 15(14), pages 1-24, July.
    2. Li, Zhengtao & Hu, Bin, 2018. "Perceived health risk, environmental knowledge, and contingent valuation for improving air quality: New evidence from the Jinchuan mining area in China," Economics & Human Biology, Elsevier, vol. 31(C), pages 54-68.
    3. V. Y. Kondaiah & B. Saravanan, 2022. "Short-Term Load Forecasting with a Novel Wavelet-Based Ensemble Method," Energies, MDPI, vol. 15(14), pages 1-17, July.
    4. Bangzhu Zhu & Jingyi Zhang & Chunzhuo Wan & Julien Chevallier & Ping Wang, 2023. "An evolutionary cost‐sensitive support vector machine for carbon price trend forecasting," Journal of Forecasting, John Wiley & Sons, Ltd., vol. 42(4), pages 741-755, July.
    5. Liu, Cheng & Wang, Wei & Wang, Zhixia & Ding, Bei & Wu, Zhiqiang & Feng, Jingjing, 2024. "Data-driven modeling and fast adjustment for digital coded metasurfaces database: Application in adaptive electromagnetic energy harvesting," Applied Energy, Elsevier, vol. 365(C).
    6. He, Yan & Zhang, Hongli & Dong, Yingchao & Wang, Cong & Ma, Ping, 2024. "Residential net load interval prediction based on stacking ensemble learning," Energy, Elsevier, vol. 296(C).
    7. Siting Li & Huafeng Cai, 2024. "Short-Term Power Load Forecasting Using a VMD-Crossformer Model," Energies, MDPI, vol. 17(11), pages 1-18, June.
    8. Dai, Yeming & Yang, Xinyu & Leng, Mingming, 2022. "Forecasting power load: A hybrid forecasting method with intelligent data processing and optimized artificial intelligence," Technological Forecasting and Social Change, Elsevier, vol. 182(C).
    9. Adel H. Elmetwalli & Yasser S. A. Mazrou & Andrew N. Tyler & Peter D. Hunter & Osama Elsherbiny & Zaher Mundher Yaseen & Salah Elsayed, 2022. "Assessing the Efficiency of Remote Sensing and Machine Learning Algorithms to Quantify Wheat Characteristics in the Nile Delta Region of Egypt," Agriculture, MDPI, vol. 12(3), pages 1-21, February.
    10. Zhao, Zhenyu & Zhang, Yao & Yang, Yujia & Yuan, Shuguang, 2022. "Load forecasting via Grey Model-Least Squares Support Vector Machine model and spatial-temporal distribution of electric consumption intensity," Energy, Elsevier, vol. 255(C).
    11. Zhu, Jizhong & Dong, Hanjiang & Zheng, Weiye & Li, Shenglin & Huang, Yanting & Xi, Lei, 2022. "Review and prospect of data-driven techniques for load forecasting in integrated energy systems," Applied Energy, Elsevier, vol. 321(C).
    12. Chen, Xiaodong & Ge, Xinxin & Sun, Rongfu & Wang, Fei & Mi, Zengqiang, 2024. "A SVM based demand response capacity prediction model considering internal factors under composite program," Energy, Elsevier, vol. 300(C).
    13. Luis Sarmiento, 2020. "Waiting for My Sentence: Air Pollution and the Productivity of Court Rulings," Discussion Papers of DIW Berlin 1878, DIW Berlin, German Institute for Economic Research.
    14. Zulfiqar, M. & Kamran, M. & Rasheed, M.B. & Alquthami, T. & Milyani, A.H., 2023. "A hybrid framework for short term load forecasting with a navel feature engineering and adaptive grasshopper optimization in smart grid," Applied Energy, Elsevier, vol. 338(C).
    15. Rosato, Antonello & Panella, Massimo & Andreotti, Amedeo & Mohammed, Osama A. & Araneo, Rodolfo, 2021. "Two-stage dynamic management in energy communities using a decision system based on elastic net regularization," Applied Energy, Elsevier, vol. 291(C).
    16. Luis Sarmiento, 2022. "Air pollution and the productivity of high‐skill labor: evidence from court hearings," Scandinavian Journal of Economics, Wiley Blackwell, vol. 124(1), pages 301-332, January.
    17. Ahmad, Tanveer & Zhang, Dongdong & Huang, Chao, 2021. "Methodological framework for short-and medium-term energy, solar and wind power forecasting with stochastic-based machine learning approach to monetary and energy policy applications," Energy, Elsevier, vol. 231(C).
    18. Singh, Prachi & Dey, Sagnik, 2021. "Crop burning and forest fires: Long-term effect on adolescent height in India," Resource and Energy Economics, Elsevier, vol. 65(C).
    19. Banafshe Parizad & Hassan Ranjbarzadeh & Ali Jamali & Hamid Khayyam, 2024. "An Intelligent Hybrid Machine Learning Model for Sustainable Forecasting of Home Energy Demand and Electricity Price," Sustainability, MDPI, vol. 16(6), pages 1-17, March.
    20. Xiao Yu & Jianing Liang & Yanzhe Zhang, 2022. "Air Pollution and Settlement Intention: Evidence from the China Migrants Dynamic Survey," IJERPH, MDPI, vol. 19(8), pages 1-16, April.
