
COVID-19 Genome Sequence Analysis for New Variant Prediction and Generation

Authors
  • Amin Ullah

    (CORIS Institute, Oregon State University, Corvallis, OR 97331, USA)

  • Khalid Mahmood Malik

    (Department of Computer Science and Engineering, Oakland University, Rochester, MI 48309, USA)

  • Abdul Khader Jilani Saudagar

    (Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11564, Saudi Arabia)

  • Muhammad Badruddin Khan

    (Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11564, Saudi Arabia)

  • Mozaherul Hoque Abul Hasanat

    (Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11564, Saudi Arabia)

  • Abdullah AlTameem

    (Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11564, Saudi Arabia)

  • Mohammed AlKhathami

    (Information Systems Department, College of Computer and Information Sciences, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11564, Saudi Arabia)

  • Muhammad Sajjad

    (Color and Visual Computing Lab, Department of Computer Science, Norwegian University of Science and Technology (NTNU), 2815 Gjøvik, Norway;
    Digital Image Processing Laboratory, Department of Computer Science, Islamia College Peshawar, Peshawar 25000, Pakistan)

Abstract

The new COVID-19 variants of concern cause more infections and spread much faster than their predecessors. Recent cases show that even vaccinated people are highly affected by these variants. Proactively predicting the nucleotide sequences of possible new COVID-19 variants, and developing better healthcare plans to address their spread, requires a unified framework for variant classification and early prediction. This paper addresses three research questions. First, can a convolutional neural network with self-attention classify COVID-19 variants by extracting discriminative features from nucleotide sequences? Second, can the uncertainty of the predicted probability distribution be used to predict new variants? Finally, can generative approaches such as variational autoencoder-decoder networks produce a synthetic new variant from random noise? Experimental results show that the generated sequence is significantly similar to the original coronavirus and its variants, indicating that the neural network can learn mutation patterns from the old variants. Moreover, to our knowledge, we are the first to collect data for all COVID-19 variants for computational analysis. The proposed framework is extensively evaluated on classification, new variant prediction, and new variant generation, and achieves better performance on all three tasks. Our code, data, and trained models are available on GitHub (https://github.com/Aminullah6264/COVID19, accessed on 16 September 2022).
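
The second research question, using uncertainty in the predicted probability distribution to flag a possible new variant, can be illustrated with a minimal sketch. The abstract does not say which uncertainty measure the authors use, so this example assumes Shannon entropy of the classifier's output probabilities. The function names, the `threshold_ratio` parameter, and the sample probability vectors are hypothetical and are not taken from the paper or its repository.

```python
import math


def predictive_entropy(probs):
    """Shannon entropy (in nats) of a probability distribution over known variants."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def flag_possible_new_variant(probs, threshold_ratio=0.5):
    """Return True when the entropy exceeds a fraction of its maximum, log(K).

    Assumption: high entropy means the classifier cannot commit to any known
    variant, which is treated here as a hint that the sequence may be novel.
    """
    max_entropy = math.log(len(probs))
    return predictive_entropy(probs) > threshold_ratio * max_entropy


# Hypothetical classifier outputs over four known variants.
confident = [0.97, 0.01, 0.01, 0.01]  # clearly matches one known variant
uncertain = [0.30, 0.25, 0.25, 0.20]  # spread almost evenly across variants

print(flag_possible_new_variant(confident))
print(flag_possible_new_variant(uncertain))
```

Normalizing the threshold by log(K) keeps the cut-off comparable when the number of known variants changes. The value 0.5 is arbitrary and would need tuning on held-out data.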

Suggested Citation

  • Amin Ullah & Khalid Mahmood Malik & Abdul Khader Jilani Saudagar & Muhammad Badruddin Khan & Mozaherul Hoque Abul Hasanat & Abdullah AlTameem & Mohammed AlKhathami & Muhammad Sajjad, 2022. "COVID-19 Genome Sequence Analysis for New Variant Prediction and Generation," Mathematics, MDPI, vol. 10(22), pages 1-16, November.
  • Handle: RePEc:gam:jmathe:v:10:y:2022:i:22:p:4267-:d:973219

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/10/22/4267/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/10/22/4267/
    Download Restriction: no


    Citations


    Cited by:

    1. Roland Bolboacă & Piroska Haller, 2023. "Performance Analysis of Long Short-Term Memory Predictive Neural Networks on Time Series Data," Mathematics, MDPI, vol. 11(6), pages 1-35, March.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Fernando Rojas & Peter Wanke & Víctor Leiva & Mauricio Huerta & Carlos Martin-Barreiro, 2022. "Modeling Inventory Cost Savings and Supply Chain Success Factors: A Hybrid Robust Compromise Multi-Criteria Approach," Mathematics, MDPI, vol. 10(16), pages 1-18, August.
    2. Raydonal Ospina & João A. M. Gondim & Víctor Leiva & Cecilia Castro, 2023. "An Overview of Forecast Analysis with ARIMA Models during the COVID-19 Pandemic: Methodology and Case Study in Brazil," Mathematics, MDPI, vol. 11(14), pages 1-18, July.
