CIS450/550 - Final Project: Polls, Pandemics, and Possibly More!

Members:

Benjamin Kosko, bkosko@seas.upenn.edu, bkosko
Victor Lin, lin1129@seas.upenn.edu, victorHLin
Chun-Fu Yeh, cfyeh@seas.upenn.edu, YehCF
Sarah Payne, paynesa@seas.upenn.edu, paynesa

Dependencies

Server:

chart.js 3.6.1
cors 2.8.5
express 4.17.1
mysql 2.18.1
node-fetch 3.0.0
nodemon 2.0.12
supertest 6.1.6
jest 27.1.0

Client:

ant-design/charts 1.2.14
fortawesome/fontawesome-svg-core 1.2.36
fortawesome/free-solid-svg-icons 5.15.4
fortawesome/react-fontawesome 0.1.16
testing-library/jest-dom 5.14.1
testing-library/react 11.2.7
testing-library/user-event 12.8.3
antd 4.16.13
antd-button-color 1.0.4
bootstrap 5.1.3
canvasjs 1.8.3
chart.js 3.6.1
colormap 2.3.2
d3 7.1.1
d3-format 3.0.1
datamaps 0.5.9
font-awesome 4.7.0
query-string 7.0.1
react17.0.2
react-bootstrap 2.0.3
react-chartjs-2 4.0.0
react-d3-library 1.1.8
react-dom 17.0.2
react-loading 2.0.3
react-promise-tracker 2.1.0
react-promise-tracker 2.1.0
react-router-dom 5.3.0
react-scripts 4.0.3
react-usa-map 1.5.0
react-vis 1.11.7
reactstrap 9.0.1
shards-react 1.0.3
web-vitals 1.1.2

Data Wrangling

R: tidyverse and dpylr

Running the Code

Running the App

Open two terminal windows. In one, type the following commands:

cd server
npm install
npm start

In the other type:

cd client
npm install
npm start

In a few moments, the server should be running and a browser window should pop up. If no window pops up, open your browser and go to http://localhost:3000/.

Running the Data Wrangling

Elections Data Wrangling Place preprocess_voting.R and 1976-2020-senate.csv in the same directory (both are in the preprocessing/voting_preprocessing directory by default). Then either execute the R script on the command line or open preprocess_voting.R in RStudio, set the session's working directory to the source file location, and execute the entire script.

Stock Data Wrangling Run stock_preprocess.ipynb to preprocess the original table downloaded from (https://www.kaggle.com/shannanl/sp500-dataset?select=sp500+agg.csv). Then, the preprocessed stock table can be retrieved.

COVID/Vaccine Data Wrangling In DataGrip, we replaced all slashes with hyphens so all the dates (in both files) followed the MM-DD-YYYY format. Vaccination data also had several negative values that needed to be corrected; we did this by sorting by case numbers, and then taking the absolute value of the clearly wrong 6 negative values.

Yelp data Wrangling Get yelp_academic_dataset_business.json, yelp_academic_dataset_user.json and yelp_academic_dataset_review.json from https://www.yelp.com/dataset.
To wrangle yelp_academic_dataset_review.json and yelp_academic_dataset_user.json, you need to execute chunk.sh first. It will chunk the original file to smaller size files to speed up wrangling time.
Usage of chunk.sh:

./chunck.sh {your_file_name}

Follow the program instruction to input the number of rows you want to store in a file. After chunking data, put all chunk files to the directory, and modify the path variable with directory path in yelp_review.py and yelp_user.py. Then execute yelp_review.py to create the csv file for Review table and execute yelp_user.py to create the csv file for User table.
To wrangle yelp_academic_dataset_business.json, you just need to modify file path in both yelp_business.py and yelp_categories.py. yelp_business.py will create the csv file for Business table. yelp_categories.py will create two csv files, one for Categories table and the other for Business_Categories table.

Name		Name	Last commit message	Last commit date
Latest commit History 86 Commits
client		client
preprocessing		preprocessing
server		server
.gitignore		.gitignore
README.MD		README.MD

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CIS450/550 - Final Project: Polls, Pandemics, and Possibly More!

Members:

Dependencies

Server:

Client:

Data Wrangling

Running the Code

Running the App

Running the Data Wrangling

About

Releases

Packages

Contributors 4

Languages

YehCF/CIS550DB

Folders and files

Latest commit

History

Repository files navigation

CIS450/550 - Final Project: Polls, Pandemics, and Possibly More!

Members:

Dependencies

Server:

Client:

Data Wrangling

Running the Code

Running the App

Running the Data Wrangling

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 4

Languages

Packages