This repository implements sentiment analysis on the IMDB dataset using three recurrent neural network architectures: RNN, LSTM, and GRU. It covers data preprocessing, model training, evaluation, visualization, and logging with TensorBoard.
The goal of this project is to build and compare different neural network models for sentiment analysis on the IMDB dataset. The models are designed to classify movie reviews as either positive or negative.
- Multiple Model Architectures: Supports RNN, LSTM, and GRU models.
- Pretrained Embeddings: Utilizes GloVe pretrained embeddings.
- Data Preprocessing: Includes text cleaning, tokenization, and vocabulary building.
- Visualization: Generates word clouds for positive and negative reviews.
- TensorBoard Integration: Logs training and evaluation metrics for visualization in TensorBoard.
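The preprocessing steps listed above (text cleaning, tokenization, vocabulary building) can be sketched in plain Python. This is a minimal illustration, not the repository's code: the function names, the cleaning rules, and the `<pad>`/`<unk>` special tokens are all assumptions.

```python
import re
from collections import Counter


def clean_text(text):
    """Lowercase, replace HTML tags such as <br /> with spaces, keep letters, digits, apostrophes."""
    text = text.lower()
    text = re.sub(r"<[^>]+>", " ", text)
    text = re.sub(r"[^a-z0-9']+", " ", text)
    return text.strip()


def tokenize(text):
    """Split cleaned text on whitespace."""
    return clean_text(text).split()


def build_vocab(token_lists, min_freq=1, specials=("<pad>", "<unk>")):
    """Map tokens to integer ids, most frequent first (ties broken alphabetically)."""
    counts = Counter(tok for toks in token_lists for tok in toks)
    vocab = {tok: idx for idx, tok in enumerate(specials)}
    for tok, freq in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        if freq >= min_freq:
            vocab[tok] = len(vocab)
    return vocab


def encode(tokens, vocab):
    """Convert tokens to ids, falling back to <unk> for unseen words."""
    unk = vocab["<unk>"]
    return [vocab.get(tok, unk) for tok in tokens]


# Hypothetical sample reviews, for illustration only.
reviews = ["A great movie!<br />Great cast.", "Not a great plot."]
tokens = [tokenize(r) for r in reviews]
vocab = build_vocab(tokens)
ids = encode(tokenize("A terrible plot"), vocab)
```

Words that never appeared when the vocabulary was built, such as "terrible" here, map to the `<unk>` id.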
To set up the project, clone the repository and install the required packages:
git clone https://github.com/NimaVahdat/IMDB_Sentiment_Analysis.git
cd IMDB_Sentiment_Analysis
Ensure you have the following packages installed:
- torch
- torchtext
- pandas
- matplotlib
- wordcloud
- tqdm
- tensorboard
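One quick way to confirm the packages above are importable in the current environment is a small check like the following. It is only a convenience sketch; `missing_packages` is a hypothetical helper name, and the check reports whether a package can be found, not which version is installed.

```python
from importlib.util import find_spec

# Package names taken from the list above.
REQUIRED = ["torch", "torchtext", "pandas", "matplotlib", "wordcloud", "tqdm", "tensorboard"]


def missing_packages(names):
    """Return the names that cannot be located on the current import path."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
print("Missing packages:", ", ".join(missing) if missing else "none")
```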
Create configuration files for your models (RNN, LSTM, GRU). Example configuration files are provided in the config directory. You can modify these files according to your needs.
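The actual file format and keys are defined by the examples in the config directory. Purely as an illustration of what one configuration per model might hold, here is a sketch in which every field name and default value is hypothetical.

```python
from dataclasses import dataclass, replace

SUPPORTED_MODELS = ("rnn", "lstm", "gru")


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical settings; the real keys live in the repository's config files."""
    model_type: str
    embedding_dim: int = 100
    hidden_dim: int = 128
    num_layers: int = 1
    dropout: float = 0.5
    learning_rate: float = 1e-3

    def __post_init__(self):
        # Reject anything other than the three supported architectures.
        if self.model_type not in SUPPORTED_MODELS:
            raise ValueError(f"unknown model_type: {self.model_type!r}")


# One configuration per architecture, sharing the same defaults.
base = ModelConfig(model_type="rnn")
configs = {name: replace(base, model_type=name) for name in SUPPORTED_MODELS}
```

Keeping the shared settings identical and varying only the architecture makes the comparison between the models easier to interpret.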
To train, evaluate, and predict sentiment using the models, run the main.py script:
python main.py
This script will:
- Load configurations for RNN, LSTM, and GRU models.
- Initialize the models.
- Visualize word clouds for positive and negative reviews.
- Count and print the number of parameters in each model.
- Train and evaluate each model.
- Test each model on the test dataset.
- Predict the sentiment of example reviews using each model.
The visualize method in the script generates word clouds for positive and negative reviews in the training data:
# Visualize word clouds
imdb_rnn.visualize()
This helps in understanding the most common words in positive and negative reviews.
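A word cloud is drawn from word frequencies, so the underlying data can be sketched as a simple count per class. The helper below is an assumption for illustration, not the repository's `visualize` method, and the stopword list and sample reviews are made up.

```python
from collections import Counter

# Small illustrative stopword list (an assumption, not the project's).
STOPWORDS = {"the", "a", "an", "and", "of", "is", "it", "this", "was", "to"}


def word_frequencies(reviews, stopwords=STOPWORDS, top_n=None):
    """Count non-stopword tokens across reviews, most common first."""
    counts = Counter(
        word
        for review in reviews
        for word in review.lower().split()
        if word not in stopwords
    )
    return dict(counts.most_common(top_n))


# Hypothetical reviews, for illustration only.
positive = ["a great film", "great acting and a great story"]
negative = ["boring plot", "the plot was boring and slow"]

pos_freqs = word_frequencies(positive)
neg_freqs = word_frequencies(negative)
```

A dictionary of this shape can be passed to `WordCloud.generate_from_frequencies` from the `wordcloud` package to render the image.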
The script includes sentiment prediction for example reviews:
# Example reviews for sentiment prediction
review1 = "It's a good movie..."
review2 = "Wow that was a painful 90 minutes..."
# Predict sentiment for the example reviews using all models
imdb_rnn.predict_sentiment(review1)
imdb_lstm.predict_sentiment(review1)
imdb_gru.predict_sentiment(review1)
imdb_rnn.predict_sentiment(review2)
imdb_lstm.predict_sentiment(review2)
imdb_gru.predict_sentiment(review2)
This will output the predicted sentiment class and probability for the provided custom reviews.
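To show how a class and a probability can be derived from a model's raw output, here is a minimal sketch. It assumes the model emits two logits and that index 0 means negative and index 1 means positive; neither the label order nor the function names are confirmed by the repository.

```python
import math

LABELS = ("negative", "positive")  # assumed index order


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def to_prediction(logits, labels=LABELS):
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical logits for one review.
label, prob = to_prediction([-1.2, 2.3])
print(f"{label} ({prob:.3f})")
```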
Model | Training Accuracy | Validation Accuracy | Test Accuracy |
---|---|---|---|
RNN | 0.893 | 0.832 | 0.825 |
LSTM | 0.896 | 0.883 | 0.878 |
GRU | 0.913 | 0.893 | 0.882 |
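One way to read the accuracy table is to compare the gap between training and test accuracy for each model, a rough indicator of overfitting. The sketch below uses only the figures from the table; the helper name is an assumption.

```python
# Accuracies copied from the table above.
results = {
    "RNN": {"train": 0.893, "val": 0.832, "test": 0.825},
    "LSTM": {"train": 0.896, "val": 0.883, "test": 0.878},
    "GRU": {"train": 0.913, "val": 0.893, "test": 0.882},
}


def generalization_gap(scores):
    """Training accuracy minus test accuracy, rounded to three decimals."""
    return round(scores["train"] - scores["test"], 3)


gaps = {model: generalization_gap(scores) for model, scores in results.items()}
best_model = max(results, key=lambda m: results[m]["test"])

for model, gap in gaps.items():
    print(f"{model}: train-test gap = {gap:.3f}")
print("Highest test accuracy:", best_model)
```

On these numbers the plain RNN has the widest gap (0.068), the LSTM the narrowest (0.018), and the GRU reaches the highest test accuracy.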
Model | Number of Parameters |
---|---|
RNN | 7,729,202 |
LSTM | 12,580,802 |
GRU | 14,359,502 |
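As background for the parameter counts, the size of a single recurrent layer can be worked out by hand. The sketch follows the layout PyTorch uses for `nn.RNN`, `nn.LSTM`, and `nn.GRU`: each gate has an input-to-hidden weight, a hidden-to-hidden weight, and two bias vectors. The sizes 100 and 256 are illustrative assumptions, not the repository's settings, and the totals in the table also include embedding and output layers and depend on each model's configuration.

```python
# Gate-like blocks per cell: a plain RNN has 1, an LSTM 4, a GRU 3.
GATES = {"rnn": 1, "lstm": 4, "gru": 3}


def recurrent_layer_params(kind, input_size, hidden_size):
    """Parameter count of one unidirectional recurrent layer (PyTorch layout)."""
    per_gate = (
        hidden_size * input_size      # input-to-hidden weights
        + hidden_size * hidden_size   # hidden-to-hidden weights
        + 2 * hidden_size             # two bias vectors
    )
    return GATES[kind] * per_gate


# Illustrative sizes only.
for kind in GATES:
    print(kind, recurrent_layer_params(kind, input_size=100, hidden_size=256))
```

For identical sizes, an LSTM layer has four times and a GRU layer three times the parameters of a plain RNN layer.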
Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.
This project is licensed under the MIT License.