ML model that classifies movie reviews as positive or negative.

NicBonetto/binary-classification

binary-classification

ML model that classifies movie reviews as positive or negative.

About the Project

This repository is meant as an introduction to machine learning. It uses the IMDB dataset from Keras, which contains reviews from the Internet Movie Database, to train a model that classifies the sentiment of a movie review as positive or negative. Much of the code can be found in the book Deep Learning with Python by François Chollet. There were two main goals for this project. The first was to set up a workspace where I could actually train and run a model (which was surprisingly difficult without a machine running Linux). The second was to get familiar with a basic machine learning workflow by solving a simple problem and experimenting with how changing parameters affects the outcome of training.

About the Model

This model is built from 3 dense layers: two layers with 16 hidden units each that apply the relu operation, followed by a single layer with one hidden unit that applies the sigmoid operation. To remove ambiguity for anyone unfamiliar with these terms, each one is defined below.

A dense layer is a fully connected layer: every unit in the layer receives every output of the previous layer. With 3 dense layers stacked this way, we have a network in which each layer is fully connected to the next. The number of hidden units is the dimension of the layer's output, and therefore one of the dimensions of the layer's weight matrix. A higher number gives the network more freedom in learning, since it can create more complex representations of the input data, at the cost of more computation and a greater risk of overfitting the training data.

relu is short for rectified linear unit, and is the operation applied to the data passed through the layer: output = relu(dot(W, input) + b). The layer takes the dot product of the weight tensor (W) and the input and adds the bias tensor (b); relu then returns either 0 or the result of dot(W, input) + b, whichever is bigger. Lastly, the sigmoid operation squashes any value into the range between 0 and 1. This works well for binary classification because the output can be read as the probability that the input belongs to one of the two classes (in our case, that a review is positive rather than negative).
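To make the layer arithmetic concrete, here is a minimal sketch of a forward pass through a 16-16-1 network in plain Python. It is not the repository's Keras code: the helper names (`dense`, `random_layer`, `predict`), the 8-feature input size, and the random starting weights are all assumptions chosen for illustration.

```python
import math
import random


def relu(x):
    # Returns 0 or x, whichever is bigger.
    return max(0.0, x)


def sigmoid(x):
    # Squashes any value into the range between 0 and 1.
    return 1.0 / (1.0 + math.exp(-x))


def dense(inputs, weights, biases, activation):
    """One dense layer: activation(dot(W, input) + b), computed per unit."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(activation(z))
    return outputs


def random_layer(n_inputs, n_units, rng):
    # Hypothetical initialisation: small random weights, zero biases.
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_units)]
    biases = [0.0] * n_units
    return weights, biases


rng = random.Random(0)
n_features = 8  # assumed input size, kept tiny for readability
w1, b1 = random_layer(n_features, 16, rng)  # first relu layer, 16 units
w2, b2 = random_layer(16, 16, rng)          # second relu layer, 16 units
w3, b3 = random_layer(16, 1, rng)           # sigmoid output layer, 1 unit


def predict(review_vector):
    h1 = dense(review_vector, w1, b1, relu)
    h2 = dense(h1, w2, b2, relu)
    return dense(h2, w3, b3, sigmoid)[0]


example = [1.0, 0.0, 1.0, 0.0, 0.0, 1.0, 0.0, 1.0]
probability = predict(example)
print(f"predicted probability: {probability:.3f}")
```

Because the weights here are random and untrained, the printed probability carries no meaning yet; the point is only that each layer's output has as many values as the layer has hidden units, and that the final value always lands between 0 and 1.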

After the input data has run its course through the network, the binary crossentropy loss function measures how far the predictions are from the true labels, and the rmsprop optimizer uses that loss to adjust the weights of the network.
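As a rough illustration of those two pieces, the sketch below computes binary crossentropy for a few predictions and applies one rmsprop-style update to a single weight. It is plain Python rather than the Keras implementation; the function names, the sample labels and predictions, and the hyperparameter values (`lr`, `rho`, `eps`) are assumptions for the example.

```python
import math


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    """Average binary crossentropy over a batch of labels and predictions."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip so log() never sees 0
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(y_true)


def rmsprop_step(weight, grad, avg_sq_grad, lr=0.001, rho=0.9, eps=1e-7):
    """One rmsprop update for a single weight.

    Keeps a running average of squared gradients and divides the step by
    its square root, so weights with large gradients take smaller steps.
    """
    avg_sq_grad = rho * avg_sq_grad + (1.0 - rho) * grad * grad
    weight = weight - lr * grad / (math.sqrt(avg_sq_grad) + eps)
    return weight, avg_sq_grad


labels = [1, 0, 1]  # 1 = positive review, 0 = negative review
confident = binary_crossentropy(labels, [0.9, 0.1, 0.8])
unsure = binary_crossentropy(labels, [0.5, 0.5, 0.5])
wrong = binary_crossentropy(labels, [0.1, 0.9, 0.2])
print(confident, unsure, wrong)

# A positive gradient means the loss grows with the weight, so the
# update moves the weight down.
w, v = rmsprop_step(weight=0.5, grad=2.0, avg_sq_grad=0.0)
print(w, v)
```

The loss is lowest when predictions are confident and correct, higher when the model is unsure, and highest when it is confidently wrong, which is the signal the optimizer uses when adjusting the weights.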
