NJDOT Traffic Crash Data

Analysis of NJDOT traffic crash data.

Plots:

Methods:

Plots

I've only done a very quick first pass at cleaning and plotting the data here, so take these with a grain of salt.

"Injury" and "property damage" crashes have decreased markedly since the onset of COVID (≈March 2020), but fatal crashes have stayed roughly flat:

Crashes per Month (Statewide)

Injuries per Month (Statewide)

Property Damage Crashes per Month (Statewide)

Deaths per Month (Statewide)

Crashes per {County, Month}

Injuries per {County, Month}

Property Damage Crashes per {County, Month}

Deaths per {County, Month}

Crashes per Year (Statewide)

Injuries per Year (Statewide)

Property Damage Crashes per Year (Statewide)

Deaths per Year (Statewide)

Crashes per {County, Year}

Injuries per {County, Year}

Property Damage Crashes per {County, Year}

Deaths per {County, Year}

Crash-Type Percentages

Injuries, Property Damage, Deaths (as Percentage of All Crashes)

Deaths (as Percentage of All Crashes)

Methods

rawdata.py is a CLI for downloading and caching .zips, extracting them to .txts, and cleaning and converting those to .pqt (Parquet).

./rawdata.py --help
# Usage: rawdata.py [OPTIONS] COMMAND [ARGS]...
# 
# Options:
#   --help  Show this message and exit.
# 
# Commands:
#   check-nj-agg      For one or more years, verify the `NewJersey` file is a
#                     concatenation of the county-specific files
#   parse-fields-pdf  Parse fields+lengths from one of the `*CrashTable.pdf`s,
#                     using Tabula
#   pqt               Convert 1 or more unzipped {year, county} `.txt` files to
#                     `.pqt`s, with some dtypes and cleanup
#   txt               Convert 1 or more {year, county} .zip files (convert each
#                     .zip to a single .txt)
#   zip               Download 1 or more {year, county} .zip file(s)
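
To illustrate the idea behind `check-nj-agg`, here is a minimal sketch of checking that a statewide file contains exactly the records of the county files. It is not the actual implementation in rawdata.py: the function name `is_concatenation` and the toy records are hypothetical, and the comparison here ignores record order, which the real command may or may not do.

```python
from collections import Counter
from typing import Iterable, List


def is_concatenation(statewide: List[str], county_files: Iterable[List[str]]) -> bool:
    """Return True if `statewide` holds exactly the county records, in any order.

    Counter compares multisets, so duplicated records must appear the same
    number of times on both sides.
    """
    county_records: Counter = Counter()
    for records in county_files:
        county_records.update(records)
    return Counter(statewide) == county_records


# Toy records standing in for lines of the real `.txt` files (hypothetical format).
atlantic = ["2020,Atlantic,crash-1", "2020,Atlantic,crash-2"]
bergen = ["2020,Bergen,crash-1"]
statewide = atlantic + bergen

complete = is_concatenation(statewide, [atlantic, bergen])
missing_one = is_concatenation(statewide[:-1], [atlantic, bergen])
print(complete, missing_one)
```

With the toy data, the first check passes and the second fails because one Bergen record is missing from the statewide list.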

Example: Download + Clean Data

./rawdata.py zip -r NewJersey  # Download statewide-aggregated `.zip`s for years [2001, 2020] x {Accidents, Drivers, Occupants, Pedestrians, Vehicles}
./rawdata.py txt -r NewJersey  # Extract each `.zip` (to a single `.txt`)
./rawdata.py pqt -r NewJersey  # Clean (parse dates, assign some dtypes) + convert to Parquet
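
The `pqt` step is described as parsing dates and assigning some dtypes. A rough sketch of that kind of cleanup on a single record follows, using only the standard library. Everything specific here is an assumption for illustration: the field names (`county`, `crash_date`, `total_killed`, `total_injured`), the `MM/DD/YYYY` date format, and the space-padded values are hypothetical, and rawdata.py may handle these differently.

```python
from datetime import date, datetime
from typing import Dict, Optional, Union

Value = Union[str, int, date, None]


def parse_date(text: str) -> Optional[date]:
    """Parse an MM/DD/YYYY string (assumed format); blank values become None."""
    text = text.strip()
    if not text:
        return None
    return datetime.strptime(text, "%m/%d/%Y").date()


def parse_int(text: str) -> Optional[int]:
    """Convert a padded numeric string to int; blank values become None."""
    text = text.strip()
    return int(text) if text else None


def clean_record(raw: Dict[str, str]) -> Dict[str, Value]:
    """Strip padding and assign a type to each (hypothetical) field."""
    return {
        "county": raw["county"].strip().title(),
        "crash_date": parse_date(raw["crash_date"]),
        "total_killed": parse_int(raw["total_killed"]),
        "total_injured": parse_int(raw["total_injured"]),
    }


raw = {
    "county": "ATLANTIC    ",
    "crash_date": "03/15/2020",
    "total_killed": " 0",
    "total_injured": "",
}
cleaned = clean_record(raw)
print(cleaned)
```

Mapping blanks to `None` rather than `0` keeps "not recorded" distinct from "zero", which matters when summing injuries or deaths later.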

Notebooks

SQLite DBs

njdot compute pqt -f
njdot compute db -f

cmym.ipynb: generate cmymc.db containing several {county, muni, year, month} aggregation tables.
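
As a sketch of what a {county, muni, year, month} aggregation table could look like, the following builds one in an in-memory SQLite database. The table names (`crashes`, `cmym`), column names, and sample rows are all made up for illustration; the notebook's actual schema is not shown here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical per-crash table; the real data has many more columns.
conn.execute(
    "CREATE TABLE crashes ("
    "county TEXT, muni TEXT, year INTEGER, month INTEGER, "
    "injured INTEGER, killed INTEGER)"
)
rows = [
    ("Hudson", "Jersey City", 2020, 3, 1, 0),
    ("Hudson", "Jersey City", 2020, 3, 2, 1),
    ("Hudson", "Hoboken", 2020, 3, 0, 0),
    ("Essex", "Newark", 2020, 4, 1, 0),
]
conn.executemany("INSERT INTO crashes VALUES (?, ?, ?, ?, ?, ?)", rows)

# One row per {county, muni, year, month}, with crash, injury, and death totals.
conn.execute(
    """
    CREATE TABLE cmym AS
    SELECT county, muni, year, month,
           COUNT(*)     AS crashes,
           SUM(injured) AS injured,
           SUM(killed)  AS killed
    FROM crashes
    GROUP BY county, muni, year, month
    """
)

result = conn.execute(
    "SELECT county, muni, year, month, crashes, injured, killed "
    "FROM cmym ORDER BY county, muni, year, month"
).fetchall()
for row in result:
    print(row)
```

Coarser tables (per county and year, say) can be derived from this one by grouping on fewer columns and summing the totals again.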

Caveats / TODOs

The fatal crash stats here also seem to differ from NJSP's data (see the root of this repository) by ≈10%.


Attributions:

TODO: add to www pages