This repository has been archived by the owner on Oct 25, 2023. It is now read-only.

A git scraper recording the CDC's Covid Data Tracker numbers on number of vaccinations per state.

simonw/cdc-vaccination-history

cdc-vaccination-history

Project retired as of 25th October 2023

A git scraper recording the CDC's Covid Data Tracker numbers on number of vaccinations per state.

Archives the JSON from https://covid.cdc.gov/covid-data-tracker/COVIDData/getAjaxData?id=vaccination_data every time it changes, checking three times an hour.
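As a rough illustration of the "archive it every time it changes" idea, here is a minimal Python sketch. It is not the scraper used by this repository (see the scrape.yml workflow for that); the helper names `fetch`, `normalize` and `save_if_changed` are hypothetical, and pretty-printing with sorted keys is an assumption made here so that diffs between versions stay readable.

```python
import json
import urllib.request
from pathlib import Path

URL = (
    "https://covid.cdc.gov/covid-data-tracker/COVIDData/"
    "getAjaxData?id=vaccination_data"
)


def fetch(url: str = URL) -> bytes:
    """Download the raw JSON bytes from the endpoint."""
    with urllib.request.urlopen(url, timeout=30) as response:
        return response.read()


def normalize(raw: bytes) -> str:
    """Pretty-print with sorted keys so line-based diffs are stable."""
    return json.dumps(json.loads(raw), indent=2, sort_keys=True) + "\n"


def save_if_changed(text: str, path: Path) -> bool:
    """Write text to path only if it differs from what is already there.

    Returns True when the file was written, False when it was unchanged.
    """
    if path.exists() and path.read_text(encoding="utf-8") == text:
        return False
    path.write_text(text, encoding="utf-8")
    return True
```

Calling `save_if_changed(normalize(fetch()), Path("vaccination_data.json"))` on a schedule and committing whenever it returns `True` would give a history with one commit per change. In practice git itself can do the comparison, since a commit with no changes has nothing to record.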

Watch "Git scraping, the five minute lightning talk" to see me live-code the creation of this repository.

This data as CSV

If you want to grab the entire dataset I'm now publishing it as two CSV files here:

This data in Datasette

The build_database.py script loops through the full commit history and uses it to build a SQLite database with a row for every daily report, mainly as a demonstration of how Python code can be used to extract data from a git scraped repository.
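The general technique can be sketched as follows. This is not the contents of build_database.py, just a hedged illustration: the function names, the `daily_reports` schema with a single JSON `report` column, and the assumption that the archived file holds its rows under a top-level `vaccination_data` key are all choices made for this example.

```python
import json
import sqlite3
import subprocess
from typing import Iterable, Iterator, Tuple

# (commit hash, commit ISO timestamp, file content at that commit)
Version = Tuple[str, str, str]


def iterate_file_versions(repo_path: str, filepath: str) -> Iterator[Version]:
    """Yield every committed version of filepath, oldest first."""
    log = subprocess.run(
        ["git", "-C", repo_path, "log", "--reverse",
         "--format=%H %cI", "--", filepath],
        check=True, capture_output=True, text=True,
    ).stdout
    for line in log.splitlines():
        commit_hash, committed_at = line.split(" ", 1)
        shown = subprocess.run(
            ["git", "-C", repo_path, "show", f"{commit_hash}:{filepath}"],
            capture_output=True, text=True,
        )
        if shown.returncode != 0:
            continue  # e.g. the file was deleted in this commit
        yield commit_hash, committed_at, shown.stdout


def build_database(
    db: sqlite3.Connection,
    versions: Iterable[Version],
    rows_key: str = "vaccination_data",  # assumed top-level key
) -> int:
    """Insert each distinct report row once; return how many were added."""
    db.execute(
        "CREATE TABLE IF NOT EXISTS daily_reports ("
        "report TEXT UNIQUE, commit_hash TEXT, committed_at TEXT)"
    )
    before = db.total_changes
    for commit_hash, committed_at, content in versions:
        for row in json.loads(content)[rows_key]:
            # Rows unchanged since an earlier commit are skipped.
            db.execute(
                "INSERT OR IGNORE INTO daily_reports VALUES (?, ?, ?)",
                (json.dumps(row, sort_keys=True), commit_hash, committed_at),
            )
    db.commit()
    return db.total_changes - before
```

Something like `build_database(sqlite3.connect("cdc.db"), iterate_file_versions(".", "vaccination_data.json"))` would then walk the whole history, where the file name is again a placeholder. Storing each row as JSON text keeps the sketch independent of the exact columns the CDC publishes.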

That database is then deployed using Datasette - you can browse the data at https://cdc-vaccination-history.datasette.io/cdc/daily_reports

You can filter down to individual states like so:
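A small sketch of building such a filtered URL in Python, relying on Datasette's `column=value` query-string filters. The column name `Location` and the helper `state_url` are assumptions for illustration; check the table page for the actual column names.

```python
from urllib.parse import urlencode

BASE_URL = "https://cdc-vaccination-history.datasette.io/cdc/daily_reports"


def state_url(state: str, column: str = "Location", fmt: str = "") -> str:
    """Build a table URL filtered to rows where column equals state.

    column defaults to "Location", which is an assumed column name.
    fmt may be "json" or "csv" to request that export format instead of HTML.
    """
    suffix = f".{fmt}" if fmt else ""
    return f"{BASE_URL}{suffix}?{urlencode({column: state})}"


print(state_url("CA"))
print(state_url("CA", fmt="json"))
```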

Take a look at the scrape.yml GitHub Actions workflow to see how the scraper runs, and how the data is then built into a database and published to Vercel using datasette publish.

Should you trust these numbers?

I honestly don't know. These numbers are not coming from a documented API - I found the endpoint using the Firefox developer tools network pane. I don't know how the CDC are sourcing them. I don't know if they themselves consider them to be accurate.

All I know is that these are the numbers they are displaying on their own site - so you should treat this repository as tracking "numbers that were displayed on the CDC's website" as opposed to assuming it represents the full truth on the ground.
