In this project I use the VADER sentiment analysis model on two datasets, one consisting of Tweets and the other of article headlines, to analyze patterns related to the COVID-19 pandemic.
Included in this folder are 4 ipynb files, 1 py file and 3 subfolders.
This file includes the Tweets class and a demo of the class.
In this file I go over filtering the dataset to keep only the records needed for this project.
In this file I use the Tweets class to fetch, optionally pre-process, and translate the tweets from the filtered dataset.
In this file I use the VADER sentiment analyzer on the 2 datasets.
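The scoring step could look roughly like the sketch below. It assumes the `vaderSentiment` package is installed. The helper names `score_texts` and `label_from_compound` are hypothetical and not taken from the notebook, and the ±0.05 cut-offs are the thresholds commonly suggested for VADER, not necessarily the ones used in this project.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER compound score in [-1, 1] to a coarse label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def score_texts(texts, analyzer=None):
    """Return a (text, compound score, label) tuple for each input text."""
    if analyzer is None:
        # Imported lazily so the labelling helper works without the package.
        from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
        analyzer = SentimentIntensityAnalyzer()
    results = []
    for text in texts:
        compound = analyzer.polarity_scores(text)["compound"]
        results.append((text, compound, label_from_compound(compound)))
    return results
```

Calling `score_texts(list_of_headlines)` on either dataset would give one compound score per text, which is the value stored in the scored CSV files.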
I use the multiplex library to plot several time series graphs.
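Before a time series can be plotted, the per-text scores need to be reduced to one value per date. The sketch below shows one way to do that with the standard library only. It assumes daily averaging, which may differ from the aggregation used in the notebook, and the function name `daily_mean_compound` is hypothetical. No multiplex calls are shown here.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def daily_mean_compound(records):
    """Average compound scores per calendar day.

    `records` is an iterable of (date, compound) pairs. Returns two parallel
    lists, sorted dates and their mean scores, ready for a plotting library.
    """
    by_day = defaultdict(list)
    for day, compound in records:
        by_day[day].append(compound)
    days = sorted(by_day)
    return days, [mean(by_day[d]) for d in days]


# Example input: two scores on 1 March 2020 and one on 2 March 2020.
records = [
    (date(2020, 3, 2), 0.75),
    (date(2020, 3, 1), 0.5),
    (date(2020, 3, 1), -0.25),
]
days, scores = daily_mean_compound(records)
```

The resulting `days` and `scores` lists can be used as the x and y values of a time series graph.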
I use the wordcloud and matplotlib libraries to plot several word clouds.
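A masked word cloud can be sketched as follows, assuming `wordcloud`, `matplotlib`, `numpy` and `Pillow` are installed. The function names and the simple regex tokenizer are illustrative assumptions, not the notebook's actual code, and the file paths passed in are up to the caller.

```python
import re
from collections import Counter


def word_frequencies(texts, stopwords=frozenset()):
    """Count lower-cased word occurrences across texts, skipping stopwords."""
    counts = Counter()
    for text in texts:
        for word in re.findall(r"[a-z']+", text.lower()):
            if word not in stopwords:
                counts[word] += 1
    return counts


def render_masked_cloud(frequencies, mask_path, out_path):
    """Draw a word cloud shaped by the image at mask_path and save it."""
    # Third-party imports are kept local so word_frequencies stays stdlib-only.
    import numpy as np
    import matplotlib.pyplot as plt
    from PIL import Image
    from wordcloud import WordCloud

    # Pure-white pixels in the mask image are left empty by WordCloud.
    mask = np.array(Image.open(mask_path))
    cloud = WordCloud(background_color="white", mask=mask)
    cloud.generate_from_frequencies(frequencies)

    plt.imshow(cloud, interpolation="bilinear")
    plt.axis("off")
    plt.savefig(out_path, bbox_inches="tight")
    plt.close()
```

Passing one of the flag images as `mask_path` shapes the cloud to the flag's non-white areas.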
This folder contains the datasets used at each stage.
The article dataset with the compound sentiment score.
The Twitter dataset with the compound sentiment score.
The article dataset, as 6 separate per-country CSV files or 1 complete CSV file.
The files stored here are filtered versions of the panacealab files. These files are generated in the Collect Tweet ID.ipynb notebook.
6 flag images used as masks for the word clouds, taken from Wikipedia.
The files stored here were taken from the panacealab GitHub repo.
The two subfolders here, NotProcessed and Processed, contain the non-pre-processed and pre-processed Tweet JSON files respectively. These files are generated in the Get Tweets.ipynb notebook.
Visualizations are saved in this folder.
See readme in folder.