IDEAS home Printed from https://ideas.repec.org/a/gam/jmathe/v11y2023i7p1744-d1116595.html

A Comparative Evaluation of Self-Attention Mechanism with ConvLSTM Model for Global Aerosol Time Series Forecasting

Author

Listed:
  • Dušan S. Radivojević

    (Vinča Institute of Nuclear Sciences-National Institute of the Republic of Serbia, University of Belgrade, 11351 Belgrade, Serbia)

  • Ivan M. Lazović

    (Vinča Institute of Nuclear Sciences-National Institute of the Republic of Serbia, University of Belgrade, 11351 Belgrade, Serbia)

  • Nikola S. Mirkov

    (Vinča Institute of Nuclear Sciences-National Institute of the Republic of Serbia, University of Belgrade, 11351 Belgrade, Serbia)

  • Uzahir R. Ramadani

    (Vinča Institute of Nuclear Sciences-National Institute of the Republic of Serbia, University of Belgrade, 11351 Belgrade, Serbia)

  • Dušan P. Nikezić

    (Vinča Institute of Nuclear Sciences-National Institute of the Republic of Serbia, University of Belgrade, 11351 Belgrade, Serbia)

Abstract

The attention mechanism in natural language processing and the self-attention mechanism in vision transformers have improved many deep learning models. In this work, a self-attention mechanism was implemented with a previously developed ConvLSTM sequence-to-one model in order to carry out a comparative evaluation with statistical testing. First, a new ConvLSTM sequence-to-one model with a self-attention (SA) mechanism was developed; the self-attention layer was then removed to allow the comparison. Hyperparameter optimization was conducted by grid search for integer- and string-type parameters and by particle swarm optimization for float-type parameters. Cross-validation with a predefined ratio of train, validation, and test subsets was used to evaluate the models more reliably. Both models, with and without the self-attention layer, passed the defined evaluation criteria, which means that they are able to generate an image of global aerosol thickness and to find patterns of change in the time domain. The model obtained by the ablation study on the self-attention layer achieved better Root Mean Square Error and Euclidean Distance values than the developed ConvLSTM-SA model. For the statistical test, a Kruskal–Wallis H test was used because the data were found not to be normally distributed; the results showed that both models, with and without the SA layer, predict images whose pixel-level patterns are similar to those of the original dataset. However, the model without the SA layer was more similar to the original dataset, especially in the time domain at the pixel level. Based on the comparative evaluation with statistical testing, it was concluded that the developed ConvLSTM-SA model predicts better without the SA layer.
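
The evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the flat pixel lists, and the toy values are assumptions made for illustration, and the abstract does not specify how samples were grouped for the test. The Kruskal–Wallis H statistic is written out with the standard tie correction; in practice a library routine such as `scipy.stats.kruskal` would normally be used instead.

```python
import math


def rmse(pred, target):
    """Root mean square error between two equal-length pixel sequences."""
    if len(pred) != len(target):
        raise ValueError("inputs must have the same length")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def euclidean_distance(pred, target):
    """Euclidean (L2) distance between two equal-length pixel sequences."""
    if len(pred) != len(target):
        raise ValueError("inputs must have the same length")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)))


def kruskal_wallis_h(*groups):
    """Tie-corrected Kruskal-Wallis H statistic for two or more samples."""
    pooled = sorted(v for g in groups for v in g)
    n = len(pooled)
    rank_of = {}   # value -> average rank (tied values share the mean rank)
    tie_term = 0   # sum of t^3 - t over groups of tied values
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        # 0-based positions i..j-1 hold ranks i+1..j; their mean is (i+1+j)/2
        rank_of[pooled[i]] = (i + 1 + j) / 2
        t = j - i
        tie_term += t ** 3 - t
        i = j
    correction = 1 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all values are identical; H is undefined")
    h = 12 / (n * (n + 1)) * sum(
        sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups
    ) - 3 * (n + 1)
    return h / correction


def p_value_two_groups(h):
    """Chi-square upper tail for df = 1 only, i.e. exactly two groups."""
    return math.erfc(math.sqrt(h / 2))


# Hypothetical toy values standing in for observed and predicted pixels.
observed = [0.10, 0.20, 0.30, 0.40]
predicted = [0.12, 0.18, 0.33, 0.41]
print("RMSE:", rmse(predicted, observed))
print("Euclidean distance:", euclidean_distance(predicted, observed))

h = kruskal_wallis_h(observed, predicted)
print("H:", h, "p:", p_value_two_groups(h))
```

A large p-value here means the test finds no evidence that the two samples come from different distributions, which is the sense in which predicted and original pixel values can be called similar.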

Suggested Citation

  • Dušan S. Radivojević & Ivan M. Lazović & Nikola S. Mirkov & Uzahir R. Ramadani & Dušan P. Nikezić, 2023. "A Comparative Evaluation of Self-Attention Mechanism with ConvLSTM Model for Global Aerosol Time Series Forecasting," Mathematics, MDPI, vol. 11(7), pages 1-13, April.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:7:p:1744-:d:1116595

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/7/1744/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/7/1744/
    Download Restriction: no

    References listed on IDEAS

    1. Nebojsa Bacanin & Miodrag Zivkovic & Catalin Stoean & Milos Antonijevic & Stefana Janicijevic & Marko Sarac & Ivana Strumberger, 2022. "Application of Natural Language Processing and Machine Learning Boosted with Swarm Intelligence for Spam Email Filtering," Mathematics, MDPI, vol. 10(22), pages 1-31, November.
    2. Nebojsa Bacanin & Ruxandra Stoean & Miodrag Zivkovic & Aleksandar Petrovic & Tarik A. Rashid & Timea Bezdan, 2021. "Performance of a Novel Chaotic Firefly Algorithm with Enhanced Exploration for Tackling Global Optimization Problems: Application for Dropout Regularization," Mathematics, MDPI, vol. 9(21), pages 1-33, October.
    3. Dušan P. Nikezić & Uzahir R. Ramadani & Dušan S. Radivojević & Ivan M. Lazović & Nikola S. Mirkov, 2022. "Deep Learning Model for Global Spatio-Temporal Image Prediction," Mathematics, MDPI, vol. 10(18), pages 1-15, September.

    Citations

    Cited by:

    1. Dušan P. Nikezić & Dušan S. Radivojević & Ivan M. Lazović & Nikola S. Mirkov & Zoran J. Marković, 2024. "Transfer Learning with ResNet3D-101 for Global Prediction of High Aerosol Concentrations," Mathematics, MDPI, vol. 12(6), pages 1-11, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Jani Dugonik & Mirjam Sepesy Maučec & Domen Verber & Janez Brest, 2023. "Reduction of Neural Machine Translation Failures by Incorporating Statistical Machine Translation," Mathematics, MDPI, vol. 11(11), pages 1-22, May.
    2. Luka Jovanovic & Dejan Jovanovic & Nebojsa Bacanin & Ana Jovancai Stakic & Milos Antonijevic & Hesham Magd & Ravi Thirumalaisamy & Miodrag Zivkovic, 2022. "Multi-Step Crude Oil Price Prediction Based on LSTM Approach Tuned by Salp Swarm Algorithm with Disputation Operator," Sustainability, MDPI, vol. 14(21), pages 1-29, November.
    3. Devi Munandar & Budi Nurani Ruchjana & Atje Setiawan Abdullah & Hilman Ferdinandus Pardede, 2023. "Literature Review on Integrating Generalized Space-Time Autoregressive Integrated Moving Average (GSTARIMA) and Deep Neural Networks in Machine Learning for Climate Forecasting," Mathematics, MDPI, vol. 11(13), pages 1-25, July.
    4. Tao Xu & Zeng Gao & Yi Zhuang, 2023. "Fault Prediction of Control Clusters Based on an Improved Arithmetic Optimization Algorithm and BP Neural Network," Mathematics, MDPI, vol. 11(13), pages 1-28, June.
    5. Zaid Bin Mushtaq & Shoaib Mohd Nasti & Chaman Verma & Maria Simona Raboaca & Neerendra Kumar & Samiah Jan Nasti, 2022. "Super Resolution for Noisy Images Using Convolutional Neural Networks," Mathematics, MDPI, vol. 10(5), pages 1-18, February.
