Analyzes the open-source Yelp dataset. The project aims to build a popularity ranking system for reviewers and businesses, and to answer business questions involving correlation and prediction.

dylanzxc/Yelp-analysis-project

cmpt732-yelp-analysis

CMPT732 Yelp Analysis Project

Installation

  • Make

How to run

  • Run the scripts on our gateway so they can access the Spark/Cassandra cluster.
  • Step 1: Create the Cassandra tables: make create_schema -f Makefile.production
  • Step 2: Copy the data from AWS S3 to Hadoop HDFS: make prepare_data -f Makefile.production
  • Step 3: Load the Yelp data from Hadoop HDFS into the Cassandra tables: make load_data -f Makefile.production
  • Step 4: Run all analyses: make run_analyze -f Makefile.production
  • Step 5: Store the results in the Postgres DB: make store_data -f Makefile.production
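
The five steps above can also be chained in one shell function that stops at the first failing target. This is only a minimal sketch: the function name run_pipeline and the DRY_RUN variable are hypothetical helpers and not part of the repository. The target names and Makefile.production are taken from the steps above.

```shell
#!/usr/bin/env bash
# Sketch: run the documented make targets in order, stopping at the first failure.
# run_pipeline and DRY_RUN are illustrative names, not part of the project.

MAKEFILE="Makefile.production"
TARGETS="create_schema prepare_data load_data run_analyze store_data"

run_pipeline() {
  for target in $TARGETS; do
    # Print each command so the log shows which step is running.
    echo "make $target -f $MAKEFILE"
    # With DRY_RUN=1 the commands are only printed, never executed.
    if [ "${DRY_RUN:-0}" != "1" ]; then
      make "$target" -f "$MAKEFILE" || return 1
    fi
  done
}
```

After sourcing the snippet on the gateway, DRY_RUN=1 run_pipeline prints the five commands without executing them, and run_pipeline executes them in order.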

Web Frontends:
