Reddit DataViz Battle August 2018

Code for the Reddit dataisbeautiful DataViz Battle of August 2018.

Analysis and visual exploration of TSA Claims Data.

Installation

This repository uses pipenv. If you need to install it, follow the pipenv documentation.

Create a Python 3.6 environment and install all the dependencies:

git clone git@github.com:jackdbd/reddit-dataviz-battle-2018-08.git
cd reddit-dataviz-battle-2018-08
pipenv --python python3.6
pipenv install

Data

The entire TSA dataset is spread across multiple Excel files and PDF files. Download all files from here and put them in the data directory.
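
Before starting the database build, it can help to confirm that the downloads landed in the data directory. The following is a minimal sketch, assuming the files sit directly inside data and that you run it from the repository root. The helper name count_data_files is hypothetical and is not part of this repository.

```python
from collections import Counter
from pathlib import Path

# Extensions that make_db.py reads, according to this README.
EXPECTED_SUFFIXES = {".xls", ".xlsx", ".pdf"}


def count_data_files(data_dir):
    """Count the files in data_dir by extension (hypothetical helper)."""
    counts = Counter()
    for path in Path(data_dir).iterdir():
        suffix = path.suffix.lower()
        if path.is_file() and suffix in EXPECTED_SUFFIXES:
            counts[suffix] += 1
    return counts


if __name__ == "__main__":
    # Assumption: the script is run from the repository root.
    data_dir = Path("data")
    if data_dir.is_dir():
        for suffix, n in sorted(count_data_files(data_dir).items()):
            print(f"{suffix}: {n} file(s)")
    else:
        print("data directory not found")
```

If one of the three extensions is missing from the output, some of the downloads are probably missing or were saved elsewhere.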

The script make_db.py gathers data from all the files (.xls, .xlsx, .pdf) and creates a SQLite database. You can run it with sane defaults with:

cd src
pipenv run python make_db.py  # it takes ~20 minutes

If you want to specify different parameters to read the PDF/Excel files, run:

pipenv run python make_db.py --help

For instance, it can be useful to run the script in debug mode to see what's going on with the PDF files.

The following command drops the database and reads only 2 pages in each PDF file, skipping all Excel files:

pipenv run python make_db.py -d --no_excel

Usage

When your database TSA.db is ready, you can launch a Jupyter notebook and start exploring the data:

cd notebooks
pipenv run jupyter notebook
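
A first notebook cell could list the tables in the database along with their row counts. The following is a minimal sketch using only the standard library sqlite3 module. It assumes TSA.db sits one level above the notebooks directory, which this README does not confirm, and it reads the table names from the database itself because they are not documented here. The helper name table_row_counts is hypothetical.

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path):
    """Return {table_name: row_count} for every table in the database."""
    conn = sqlite3.connect(str(db_path))
    try:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        # Table names come from sqlite_master, so quoting them is enough here.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()


if __name__ == "__main__":
    db_path = Path("..") / "TSA.db"  # assumed location; adjust to your setup
    if db_path.exists():
        for name, count in table_row_counts(db_path).items():
            print(f"{name}: {count} rows")
```

The existence check matters because sqlite3.connect creates an empty database file when the path does not exist, which would hide a wrong path.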
