jonathangiber/GuessRating: A program that guesses a product's rating from its description using machine learning

How accurately can I guess the rating of a product (from 1 to 5 stars) based on the content (words) of its review? This program checks that for you on a 108 MB dataset of video game reviews (a JSON file from Amazon), which can be downloaded from here: https://drive.google.com/file/d/0BzAGHKa-swBzOG9HYmRJU1hIckU/view?usp=sharing

Just put both files (the script, Final.py, and the downloaded dataset) in the same folder and run the script.

About the script: I first extracted the reviews from the JSON file and stored them in a CSV file. After that, I iterated over each row of the CSV file, which contains a review in the first column and its rating in the second column. I then cleaned each review using the NLTK stopwords corpus to prepare it for model training. Next, I fitted a TfidfVectorizer on the reviews and used it to transform both the training and the test data. Finally, I trained a Logistic Regression model on the training data, tested it on the test data, and printed the prediction accuracy.
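
The steps above can be sketched roughly as follows. This is a minimal illustration, not the code in Final.py: the JSON field names `reviewText` and `overall`, the tiny inline sample reviews, the helper function names, and the 75/25 split are all assumptions made for the example. It also swaps the NLTK stopwords corpus for scikit-learn's built-in English stop-word list so that it runs without an extra download.

```python
import csv
import io
import json

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical sample data, one JSON object per line. The field names
# "reviewText" and "overall" are assumptions, not taken from Final.py.
SAMPLE_JSON_LINES = """\
{"reviewText": "Great game, fun story and beautiful graphics", "overall": 5.0}
{"reviewText": "Fun multiplayer, great controls, love it", "overall": 5.0}
{"reviewText": "Beautiful world and great soundtrack", "overall": 5.0}
{"reviewText": "Love the story, fun from start to finish", "overall": 5.0}
{"reviewText": "Terrible controls and boring levels", "overall": 1.0}
{"reviewText": "Boring story, broken multiplayer, waste of money", "overall": 1.0}
{"reviewText": "Broken on arrival, terrible support", "overall": 1.0}
{"reviewText": "Waste of time, boring and buggy", "overall": 1.0}
"""


def json_lines_to_rows(lines):
    """Yield (review, rating) pairs from an iterable of JSON strings."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        yield record["reviewText"], int(record["overall"])


def write_csv(rows, handle):
    """Write one row per review: review text first, rating second."""
    writer = csv.writer(handle)
    for review, rating in rows:
        writer.writerow([review, rating])


def read_csv(handle):
    """Read the CSV back into parallel lists of reviews and ratings."""
    reviews, ratings = [], []
    for review, rating in csv.reader(handle):
        reviews.append(review)
        ratings.append(int(rating))
    return reviews, ratings


def train_and_score(reviews, ratings, seed=0):
    """Fit TF-IDF + logistic regression and return accuracy on held-out data."""
    x_train, x_test, y_train, y_test = train_test_split(
        reviews, ratings, test_size=0.25, random_state=seed
    )
    # Assumption: scikit-learn's stop-word list stands in for NLTK's corpus.
    vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
    x_train_vec = vectorizer.fit_transform(x_train)  # learn vocabulary on train only
    x_test_vec = vectorizer.transform(x_test)        # reuse the same vocabulary
    model = LogisticRegression(max_iter=1000)
    model.fit(x_train_vec, y_train)
    return accuracy_score(y_test, model.predict(x_test_vec))


# An in-memory buffer stands in for the CSV file on disk.
buffer = io.StringIO()
write_csv(json_lines_to_rows(SAMPLE_JSON_LINES.splitlines()), buffer)
buffer.seek(0)
reviews, ratings = read_csv(buffer)
accuracy = train_and_score(reviews, ratings)
print(f"accuracy: {accuracy:.2f}")
```

Fitting the vectorizer on the training split only, then reusing it on the test split, keeps test vocabulary from leaking into training. With eight toy reviews the printed accuracy is not meaningful; the real dataset would be read from the downloaded JSON file instead of `SAMPLE_JSON_LINES`.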

