This is a project made for the university course TNM108 - Machine Learning for Social Media at Linköping University, 2022.
The project was made by Anna Jonsson and Amanda Bigelius, and the goal was to build a Twitter sentiment analysis algorithm.
In the end, the project resulted in two different solutions: one using TextBlob, a lexicon-based method, and one using Logistic Regression.
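For context, the lexicon-based solution boils down to reading a polarity score from TextBlob and turning it into a label. A minimal sketch (the thresholds and labels here are illustrative, not necessarily exactly what the project does):

```python
# Minimal sketch of lexicon-based sentiment with TextBlob.
# Thresholds and labels are illustrative assumptions.
from textblob import TextBlob

def classify(text: str) -> str:
    # polarity ranges from -1.0 (most negative) to 1.0 (most positive)
    polarity = TextBlob(text).sentiment.polarity
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"

print(classify("I love this course!"))        # positive
print(classify("This update is terrible."))   # negative
```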
The TextBlob solution is heavily based on Nikita Silaparasetty's code from this tutorial.
Her repository for the tutorial can be found here.
Our first modification was to move all the API_KEYS to a separate file, in order to be able to upload the code to GitHub without exposing them.
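The idea looks roughly like this (file and variable names are our own assumptions, not necessarily the ones in the repository):

```python
# api_keys.py -- kept out of version control (e.g. listed in .gitignore)
API_KEY = "your-api-key"
API_KEY_SECRET = "your-api-key-secret"
ACCESS_TOKEN = "your-access-token"
ACCESS_TOKEN_SECRET = "your-access-token-secret"
```

```python
# main script -- import the credentials instead of hard-coding them
import tweepy
from api_keys import API_KEY, API_KEY_SECRET, ACCESS_TOKEN, ACCESS_TOKEN_SECRET

auth = tweepy.OAuthHandler(API_KEY, API_KEY_SECRET)
auth.set_access_token(ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
api = tweepy.API(auth)
```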
We also added our own list of stopwords since the NLTK stopwords removed some words we found important for the classification.
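NLTK's English stopword list includes negations such as "not" and "no", which carry sentiment, which is why a custom list can be preferable. A minimal sketch (the word list below is only an example, not the project's actual list):

```python
# Sketch of cleaning a tweet against a custom stopword list.
# The list below is only an example, not the project's actual list.
CUSTOM_STOPWORDS = {"the", "a", "an", "and", "or", "to", "of", "in", "rt"}

def remove_stopwords(tweet: str) -> str:
    tokens = tweet.lower().split()
    return " ".join(t for t in tokens if t not in CUSTOM_STOPWORDS)

# Negations like "not" survive, unlike with NLTK's default list.
print(remove_stopwords("Not a fan of the new update"))  # "not fan new update"
```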
We also added a way to check the most frequent words in the tweets, excluding the query itself and only counting words longer than 2 characters. Later on, we filtered out the NLTK stopwords from these most common words, since the sentiment analysis was already done at that point and the stopwords were not relevant when looking at word frequency. The result is displayed as a bar plot.
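Roughly, that step looks like the sketch below (function and variable names are ours, not necessarily the project's):

```python
# Sketch: count words longer than 2 characters, drop the query word and
# NLTK stopwords, and plot the most frequent words as a bar chart.
from collections import Counter

import matplotlib.pyplot as plt
import nltk
from nltk.corpus import stopwords

nltk.download("stopwords", quiet=True)
NLTK_STOPWORDS = set(stopwords.words("english"))

def plot_common_words(tweets, query, top_n=10):
    counts = Counter(
        word
        for tweet in tweets
        for word in tweet.lower().split()
        if len(word) > 2 and word != query.lower() and word not in NLTK_STOPWORDS
    )
    words, freqs = zip(*counts.most_common(top_n))
    plt.bar(words, freqs)
    plt.xticks(rotation=45)
    plt.title(f"Most common words for '{query}'")
    plt.tight_layout()
    plt.show()
```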
Lastly, we added a simple GUI to make it more intuitive for the user where to enter the query.
Our assignment was to make an algorithm using machine learning, and although TextBlob is a good tool, it does not cover our needs for this assignment.
The GUI was made with the PySimpleGUI library, and this Stack Overflow answer was very helpful.
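A stripped-down version of such a query window with PySimpleGUI might look like this (the layout and keys are illustrative):

```python
# Minimal PySimpleGUI sketch of a query prompt; layout and keys are illustrative.
import PySimpleGUI as sg

layout = [
    [sg.Text("Enter a search query:")],
    [sg.InputText(key="-QUERY-")],
    [sg.Button("Analyze"), sg.Button("Exit")],
]

window = sg.Window("Twitter Sentiment Analysis", layout)
while True:
    event, values = window.read()
    if event in (sg.WIN_CLOSED, "Exit"):
        break
    if event == "Analyze":
        query = values["-QUERY-"]
        # ...run the sentiment analysis with `query` here...
window.close()
```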
In order for this algorithm to work, you need to have Python installed on your computer, as well as the following libraries:
To install the libraries using pip, write the following command lines one by one:
- Tweepy:
pip install tweepy
- Matplotlib:
pip install matplotlib
- Pandas:
pip install pandas
- TextBlob:
pip install -U textblob
as well as
python -m textblob.download_corpora
to download the necessary NLTK corpora.
- WordCloud:
pip install wordcloud
- Better Profanity:
pip install better_profanity
- PySimpleGUI:
pip install pysimplegui
- NLTK:
pip install nltk
- Collections:
The collections module is part of the Python standard library, so no separate installation is needed.
The Logistic Regression solution is heavily based on Kate Arbuzova's code from this tutorial.
The dataset used for this method can be found on Kaggle.
Our first modification to Kate's code was to only keep the Logistic Regression methods she used.
We also increased the number of features to 10,000; in hindsight this was probably a bad move, but we did it anyway.
Then we commented out a lot of code, simply to reduce how much the program prints.
The runtime for this was extremely long, so we would recommend scaling everything down.
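The core of this approach is roughly the following scikit-learn pipeline (a sketch, not Kate's exact code; the file path, column names and vectorizer settings are assumptions):

```python
# Sketch: TF-IDF features + Logistic Regression on the tweet dataset.
# File path, column names and vectorizer choice are assumptions.
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("tweets.csv")  # the Kaggle dataset

X_train, X_test, y_train, y_test = train_test_split(
    df["text"], df["sentiment"], test_size=0.2, random_state=42
)

# 10,000 features mirrors the change described above; a smaller value trains much faster.
vectorizer = TfidfVectorizer(max_features=10_000)
X_train_vec = vectorizer.fit_transform(X_train)
X_test_vec = vectorizer.transform(X_test)

model = LogisticRegression(max_iter=1000)
model.fit(X_train_vec, y_train)
print("Accuracy:", accuracy_score(y_test, model.predict(X_test_vec)))
```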
In order for this algorithm to work, you need to have Python installed on your computer, as well as the following libraries:
To install the libraries using pip, write the following command lines one by one:
- Scikit-learn:
pip install scikit-learn
- SciPy:
pip install scipy
- NLTK:
pip install nltk
- Statsmodels:
pip install statsmodels
- Emoji:
pip install emoji
- Regex:
pip install regex
- Spacy:
pip install spacy
- TQDM:
pip install tqdm
- Matplotlib:
pip install matplotlib
- Pandas:
pip install pandas
- Pickle:
The pickle module is part of the Python standard library, so no separate installation is needed.
- Seaborn:
pip install seaborn