🦠 This project performs sentiment analysis on COVID-19 tweets collected from Twitter, combining data mining techniques with machine learning algorithms. It was developed for the course "Data Mining Techniques", taught in 2022 by Professor Dimitris Gunopulos.
The project covers the following main topics:
- Data cleansing
- Data analysis, including examining the sentiment distribution, identifying the most common words, and recording observations
- Vectorization techniques such as Bag-of-Words, TF-IDF, and word embeddings
- Classification using algorithms such as SVM, Random Forest, and KNN
- Topic modeling with Latent Dirichlet Allocation (LDA)
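
As a rough illustration of the cleansing and vectorization steps listed above, here is a minimal sketch that uses only the Python standard library. The function names (`clean_tweet`, `bag_of_words`, `tf_idf`) and the example tweets are hypothetical and not taken from the project, which may well rely on a library such as scikit-learn instead. The TF-IDF weighting shown is the plain `tf * log(N / df)` variant, assumed here for simplicity.

```python
import math
import re
from collections import Counter


def clean_tweet(text):
    """Lowercase a tweet and strip URLs, @mentions, and non-letter characters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop links
    text = re.sub(r"@\w+", " ", text)          # drop user mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters only (also drops '#' and digits)
    return text.split()


def bag_of_words(tokens, vocabulary):
    """Count how often each vocabulary word occurs in one tokenized tweet."""
    counts = Counter(tokens)
    return [counts[word] for word in vocabulary]


def tf_idf(docs):
    """Return (vocabulary, vectors) using the plain tf * log(N / df) weighting."""
    vocabulary = sorted({word for doc in docs for word in doc})
    n_docs = len(docs)
    # Document frequency: in how many tweets each word appears at least once.
    df = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append([
            (counts[word] / len(doc)) * math.log(n_docs / df[word]) if doc else 0.0
            for word in vocabulary
        ])
    return vocabulary, vectors


# Made-up example tweets, not taken from the project's dataset.
tweets = [
    "Stay safe! #COVID19 cases rising https://example.com",
    "@user Vaccines work, stay positive",
    "Lockdown again... cases rising",
]
docs = [clean_tweet(t) for t in tweets]
vocabulary, vectors = tf_idf(docs)
```

Each row of `vectors` could then be passed to a classifier such as SVM, Random Forest, or KNN. Words that appear in every tweet get a weight of zero under this variant, which is why library implementations often apply a smoothed IDF instead.
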

This project is the result of a collaboration with Giorgos Nikolaou.