Sentiment analysis and topic modeling of COVID-19 tweets of India

Author

Listed:

Manju Bhardwaj (Maitreyi College)
Priya Mishra (Maitreyi College)
Shikha Badhani (Maitreyi College)
Sunil K. Muttoo (University of Delhi)

Abstract

Social media platforms give users an opportunity to express their views and emotions on any topic. Researchers have successfully used the content posted on these platforms to capture people's emotions about a given event or topic. During the COVID-19 pandemic, Indians used Twitter extensively owing to an increased need for virtual interaction. In this work, we analyse the tweets posted in India during the COVID-19 outbreak to understand how individuals in India reacted to the pandemic. We identified the timelines of three major COVID-19 waves from May 2020 to March 2022 and retrieved 13,818 tweets from the COV19Tweets dataset available at IEEE DataPort for the duration of each of the three waves. Lexicon-based sentiment analysis of the tweets indicated a positive mindset of the Indian population during the pandemic. Further, visual analysis through word clouds revealed that a few words were common to all waves, whereas some words were wave-specific. It was observed that the words used in tweets cannot necessarily be associated with positive or negative emotions, as the context or the set of words taken together may be a better indicator. Hence, a machine learning approach was followed for the identification of sentiments by extracting BoW (Bag-of-Words) and TF–IDF (Term Frequency–Inverse Document Frequency) features from the tweet text. A comparative performance analysis of four classification algorithms, namely Decision Tree (DT), Logistic Regression (LR), Naive Bayes (NB), and Support Vector Machines (SVM), and two ensemble methods, AdaBoost and Random Forest, revealed that LR applied to the BoW feature set was the best performer. Finally, we performed Latent Dirichlet Allocation (LDA) based topic modeling on the COVID-19 tweets to identify topics of discussion in each of the waves. The topics evolved from informative messages related to the pandemic during the first wave, to wider discussions related to the impact of COVID-19 on the Nifty stock index, tourism, etc. in the second wave, and to the Omicron variant and the availability of beds and ventilators in the third wave. This study can be of great interest to governments, as they may undertake similar studies to understand human behavior when natural calamities or pandemics occur at the local or global level. The automated capture of public sentiment and identification of topics may expedite the execution of preventive measures taken by governments and help address the concerns of citizens almost instantly.
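To make the two feature representations named in the abstract concrete, the following is a minimal sketch of BoW counts and TF–IDF weights in plain Python. It is illustrative only and is not the authors' pipeline: the whitespace tokenizer, the function names (`tokenize`, `bag_of_words`, `tf_idf`), the example tweets, and the textbook weighting tf × log(N / df) are all assumptions, and library implementations often use smoothed or normalized variants.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple, assumed tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return (vocabulary, count vectors): one row of raw term counts per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    vectors = []
    for doc in tokenized:
        counts = Counter(doc)
        vectors.append([counts[term] for term in vocab])
    return vocab, vectors


def tf_idf(docs):
    """Return (vocabulary, weight vectors) using tf = count / doc length, idf = log(N / df)."""
    vocab, counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents in which each term appears.
    df = [sum(1 for row in counts if row[j] > 0) for j in range(len(vocab))]
    idf = [math.log(n_docs / d) for d in df]
    weights = []
    for row in counts:
        total = sum(row)
        # An empty document gets an all-zero vector instead of dividing by zero.
        weights.append([(c / total) * w if total else 0.0 for c, w in zip(row, idf)])
    return vocab, weights


# Hypothetical example tweets, not taken from the COV19Tweets dataset.
tweets = ["stay home stay safe", "vaccine drive starts", "stay safe get vaccine"]
vocab, counts = bag_of_words(tweets)
_, weights = tf_idf(tweets)
print(vocab)
print(counts[0])
print([round(w, 3) for w in weights[0]])
```

In this sketch a term that appears in every document receives an IDF of zero, which is one reason TF–IDF and raw BoW counts can rank the same words differently; either matrix could then be passed to a classifier such as logistic regression.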

Suggested Citation

Manju Bhardwaj & Priya Mishra & Shikha Badhani & Sunil K. Muttoo, 2024. "Sentiment analysis and topic modeling of COVID-19 tweets of India," International Journal of System Assurance Engineering and Management, Springer; The Society for Reliability, Engineering Quality and Operations Management (SREQOM), India, and Division of Operation and Maintenance, Lulea University of Technology, Sweden, vol. 15(5), pages 1756-1776, May.

Handle: RePEc:spr:ijsaem:v:15:y:2024:i:5:d:10.1007_s13198-023-02082-0
DOI: 10.1007/s13198-023-02082-0

More about this item

Keywords

COVID-19; Machine learning; Natural Language Processing (NLP); Sentiment analysis; Twitter data; Visualization; Topic modeling; Latent Dirichlet Allocation; Lexicon; Bag-of-Words
