Q: How do annual sales compare county to county, and does that relate to ml purchased per dollar? In other words, do counties with higher sales and larger populations buy more expensive liquor?
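To make the "ml purchased per dollar" metric concrete, here is a minimal sketch of how it could be computed per county. The sample rows, the function name `ml_per_dollar_by_county`, and the tuple layout are illustrative assumptions; they are not taken from the notebook or the source data.

```python
from collections import defaultdict

# Hypothetical sample rows: (county, sale in dollars, volume sold in ml).
# These values are made up for illustration and are not from the dataset.
SAMPLE_SALES = [
    ("Polk", 120.0, 9000),
    ("Polk", 80.0, 3000),
    ("Adair", 50.0, 5000),
]


def ml_per_dollar_by_county(rows):
    """Return {county: (total_dollars, ml_per_dollar)} for the given rows."""
    dollars = defaultdict(float)
    volume_ml = defaultdict(float)
    for county, sale_dollars, sale_ml in rows:
        dollars[county] += sale_dollars
        volume_ml[county] += sale_ml
    # Skip counties with no recorded spend to avoid dividing by zero.
    return {
        county: (dollars[county], volume_ml[county] / dollars[county])
        for county in dollars
        if dollars[county] > 0
    }


if __name__ == "__main__":
    results = ml_per_dollar_by_county(SAMPLE_SALES)
    for county, (total, ratio) in sorted(results.items()):
        print(f"{county}: ${total:.2f} total, {ratio:.1f} ml per dollar")
```

A lower ml-per-dollar figure means the county pays more for each ml, so under this metric "more expensive liquor" shows up as a smaller ratio.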
To view this report:
- Install Anaconda - https://www.anaconda.com/download/
- Clone this repo
- In the project's root folder run
setup.py
- Download the liquor sales source data either from my Dropbox or, if you have a Kaggle account, here
- Unzip the source data into the project's
root/input/iowa-liquor-sales/
- Run
seed_data.py
This step can take a while: it creates the database and its tables, then parses the source data and inserts it into several tables. 2.2 million rows are parsed and inserted. A mid-range, 5-year-old desktop takes ~150 seconds.
- The conda environment is ready to create locally using
conda env create -f environment.yml
and can then be activated. It's named py_data_env.
- After database seeding finishes and the console prints
database seeding took X seconds
launch Anaconda, either Navigator or Prompt.
A. If using Prompt, navigate to the project's root directory and run
jupyter notebook Iowa_liquor_data_vis.ipynb
B. If using Navigator, launch Jupyter Notebook from the main menu, manually navigate to the project's root, and click on
Iowa_liquor_data_vis.ipynb
to open the notebook.
- In the Jupyter notebook's menu bar, select
Cell
and Run All
- The notebook contains step-by-step markdown labels for clarity, and comments throughout.
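For readers curious about what the seeding step above does, the following is a rough sketch of the create, parse, insert, and time pattern it describes. It is not the contents of seed_data.py: the use of SQLite, the `sales` table, the two-column CSV, and the `seed` function are all assumptions made for illustration.

```python
import csv
import io
import sqlite3
import time

# Hypothetical two-column extract; the real source file has many more columns.
SAMPLE_CSV = "county,sale_dollars\nPolk,120.50\nAdair,50.00\n"


def seed(conn, csv_file):
    """Create a table and bulk-insert the parsed CSV rows; return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (county TEXT, sale_dollars REAL)"
    )
    reader = csv.DictReader(csv_file)
    rows = [(r["county"], float(r["sale_dollars"])) for r in reader]
    # One executemany call in a single transaction avoids per-row commits.
    with conn:
        conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    return len(rows)


if __name__ == "__main__":
    start = time.perf_counter()
    conn = sqlite3.connect(":memory:")  # assumption: SQLite, kept in memory
    inserted = seed(conn, io.StringIO(SAMPLE_CSV))
    elapsed = time.perf_counter() - start
    print(f"inserted {inserted} rows")
    print(f"database seeding took {elapsed:.2f} seconds")
```

Batching the inserts into one transaction is the main design choice here; with millions of rows, committing after every row tends to dominate the run time.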
Sneak peek at the graphed data: