CS175_Project

This directory contains the following Python Jupyter Notebooks:

Project_Main.ipynb - This notebook contains the entirety of our project. In it we explore the dataset, provide a sample bag-of-words encoding, obtain manually trained and pre-trained word embeddings for the Word2Vec and Stanford GloVe models, and feed these four embeddings through a simple convolutional neural network (CNN). We compare different values of the CNN hyperparameters, including kernel size, dropout rate, and number of filters, and plot the corresponding model accuracies. We also provide sample predictions on positive and negative movie reviews using the model. The following dependencies must be installed to run this code:

tensorflow
tensorflow-datasets
numpy
scipy
scikit-learn (imported as sklearn)
nltk
gensim
matplotlib
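
The hyperparameter comparison in Project_Main.ipynb amounts to a grid search over kernel size, dropout rate, and number of filters. The following is a minimal sketch of enumerating such a grid with the standard library only. The candidate values and the `evaluate` callback are hypothetical placeholders, not the settings or training code used in the notebook.

```python
from itertools import product

# Hypothetical candidate values; the notebook's actual settings may differ.
GRID = {
    "kernel_size": [3, 5, 7],
    "dropout_rate": [0.2, 0.5],
    "num_filters": [32, 64],
}


def iter_configs(grid):
    """Yield one dict per combination of hyperparameter values."""
    names = list(grid)
    for values in product(*(grid[name] for name in names)):
        yield dict(zip(names, values))


def best_config(grid, evaluate):
    """Return the (config, score) pair with the highest score.

    `evaluate` is a caller-supplied function that trains a model with the
    given config and returns a score such as validation accuracy.
    """
    scored = ((cfg, evaluate(cfg)) for cfg in iter_configs(grid))
    return max(scored, key=lambda pair: pair[1])
```

In practice `evaluate` would build and train the CNN for each configuration, and the collected scores could then be plotted with matplotlib.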

BagofWords.ipynb - This notebook contains a bag-of-words implementation for our dataset, fed through a multilayer perceptron, to compare against our word-embedding sentiment classifier. It compares binary, tf-idf, frequency, and count vectors for the bag-of-words model. Sample model predictions on movie reviews are also provided. The code is taken from Jason Brownlee's tutorial: https://machinelearningmastery.com/deep-learning-bag-of-words-model-sentiment-analysis/. The following dependencies must be installed to run this code:

pandas
matplotlib
nltk
keras
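
To illustrate how the four bag-of-words vector types differ, here is a minimal pure-Python sketch. It is not the notebook's code: the function names are hypothetical, and the tf-idf formula is the textbook `tf * log(N / df)`, whereas library implementations typically use smoothed variants that give different numbers.

```python
import math
from collections import Counter


def build_vocab(docs):
    """Map each distinct token to a column index, in sorted order."""
    return {tok: i for i, tok in enumerate(sorted({t for d in docs for t in d}))}


def vectorize(docs, vocab, mode="binary"):
    """Encode tokenized documents as bag-of-words vectors.

    mode is one of "binary", "count", "freq", or "tfidf".
    """
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for d in docs for tok in set(d))
    rows = []
    for doc in docs:
        counts = Counter(t for t in doc if t in vocab)
        total = sum(counts.values())
        row = [0.0] * len(vocab)
        for tok, c in counts.items():
            if mode == "binary":
                value = 1.0  # token is present
            elif mode == "count":
                value = float(c)  # raw occurrences
            elif mode == "freq":
                value = c / total  # share of the document's tokens
            elif mode == "tfidf":
                # Textbook tf * log(N / df); an assumption, not Keras's formula.
                value = c * math.log(n_docs / df[tok])
            else:
                raise ValueError(f"unknown mode: {mode}")
            row[vocab[tok]] = value
        rows.append(row)
    return rows
```

For example, with `docs = [["good", "good", "film"], ["bad", "film"]]`, the word "film" appears in every document and so gets a tf-idf weight of zero, while it still counts as 1 in the binary and count encodings.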

BagofWords_lite.ipynb - This notebook contains a bag-of-words implementation with binary vectors. It can be run on the review_polarity dataset. The following dependencies must be installed to run this code:

pandas
matplotlib
nltk
keras

Project_Main_lite.ipynb - This notebook contains an implementation of manually trained word embeddings fed through a simple convolutional neural network. It can be run on the review_polarity dataset. The following dependencies must be installed to run this code:

tensorflow
tensorflow-datasets

The review_polarity dataset used for this project is also provided. It contains 1000 positive movie reviews and 1000 negative movie reviews.
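
As a rough sketch of reading the dataset into memory, the helper below collects review texts and labels from disk. It assumes the reviews are stored as .txt files in `pos` and `neg` subfolders of the directory passed in; that layout and the function name `load_reviews` are assumptions and may not match how the notebooks load the data.

```python
from pathlib import Path


def load_reviews(root):
    """Return (texts, labels), where label 1 = positive and 0 = negative.

    Assumes `root` contains `pos/` and `neg/` folders of .txt files.
    """
    texts, labels = [], []
    for folder, label in (("neg", 0), ("pos", 1)):
        # Sort file names so the order is reproducible across runs.
        for path in sorted(Path(root, folder).glob("*.txt")):
            texts.append(path.read_text(encoding="utf-8", errors="replace"))
            labels.append(label)
    return texts, labels
```

If the layout matches, `len(texts)` should be 2000 for the full dataset, split evenly between the two labels.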

To run this project, run the Project_Main_lite.ipynb and BagofWords_lite.ipynb notebooks.
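
Before opening the lite notebooks, it can help to confirm that their dependencies are importable. The following is a small sketch using only the standard library; the `LITE_MODULES` list is assembled from the dependency lists in this README, and the helper name `missing_modules` is hypothetical.

```python
from importlib.util import find_spec

# Import names for the lite notebooks' dependencies. Pip names can differ:
# tensorflow-datasets is imported as tensorflow_datasets.
LITE_MODULES = [
    "tensorflow",
    "tensorflow_datasets",
    "pandas",
    "matplotlib",
    "nltk",
    "keras",
]


def missing_modules(modules):
    """Return the names in `modules` that cannot be found for import."""
    return [name for name in modules if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(LITE_MODULES)
    print("Missing:", ", ".join(missing) if missing else "none")
```

Any names reported as missing can then be installed with pip before running the notebooks.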
