Truth Data
The COVID-19 Forecast Hub collates daily deaths and confirmed cases from the COVID-19 GitHub repository of the Johns Hopkins University (JHU) Center for Systems Science and Engineering (CSSE) group and uses them as the gold standard reference data for deaths in the US.
We also collate NYTimes and USAFacts data for comparison with the JHU data.
We aggregate and format both Cumulative Death and Incident Death truth data from the JHU CSSE group. Although these csvs are not explicitly used in the visualization code, they match the "Actual" line in the visualization. This Python script creates these truth data csvs.
Weekly cumulative counts are the reported values as of the Saturday of each week. For example, the weekly cumulative count for the week ending Saturday, August 1, 2020 is equal to the reported daily cumulative count for Saturday, August 1, 2020.
Weekly incident counts are calculated as the difference between consecutive weekly cumulative counts. For example, the weekly incident count for the week ending Saturday, August 1, 2020 is the difference between the weekly cumulative count for Saturday, August 1, 2020 and the weekly cumulative count for Saturday, July 25, 2020.
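The two weekly definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the hub's actual script: the function name `weekly_counts`, the dictionary input format, and the numbers are all assumptions made for the example. It also assumes that every Saturday is present in the input; if a Saturday were missing, the difference would span more than one week.

```python
from datetime import date, timedelta


def weekly_counts(daily_cumulative):
    """Derive weekly counts from daily cumulative counts.

    daily_cumulative: dict mapping datetime.date -> reported cumulative count
    (a hypothetical input format chosen for this sketch).

    Returns a list of (saturday, weekly_cumulative, weekly_incident) tuples.
    The incident value is None for the first Saturday, because there is no
    earlier weekly cumulative count to subtract.
    """
    # date.weekday() returns 5 for Saturday (Monday is 0).
    saturdays = sorted(d for d in daily_cumulative if d.weekday() == 5)
    rows = []
    previous = None
    for saturday in saturdays:
        # Weekly cumulative count = reported value as of that Saturday.
        cumulative = daily_cumulative[saturday]
        # Weekly incident count = difference of consecutive weekly cumulatives.
        incident = None if previous is None else cumulative - previous
        rows.append((saturday, cumulative, incident))
        previous = cumulative
    return rows


# Made-up daily cumulative counts for Sunday, July 19 through Saturday, August 1, 2020.
start = date(2020, 7, 19)
daily = {start + timedelta(days=i): 100 + 10 * i for i in range(14)}

for row in weekly_counts(daily):
    print(row)
```

With these made-up numbers, the weekly incident count for the week ending August 1, 2020 is the cumulative value on August 1 minus the cumulative value on July 25.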
The cumulative and incident counts at the state level are calculated by summing reported cumulative and incident counts in the JHU data file across all locations with the same value for the Province_State field. This includes some "county-level" records for which we do not request forecasts. These are records with a five-digit FIPS code beginning with 80 or 90, corresponding to "Out of State" or "Unassigned" locations. For this reason, the counts at the state level may in general be larger than the sum of the counts for the counties within a given state.
The counts at the national level are calculated as the sum of counts for all locations in the JHU data file. This includes counts for the Diamond Princess cruise ship, and so the counts for the state level again do not sum to the counts for the national level.
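The state-level and national-level aggregation rules can be illustrated with a small sketch. It is an assumption-laden example rather than the hub's code: the record layout (dicts with `FIPS`, `Province_State`, and a hypothetical `count` key), the helper names, and all values are invented for illustration, and the FIPS value for the cruise ship row is left empty because the real value is not given here.

```python
from collections import defaultdict


def is_pseudo_county(fips):
    """True for five-digit FIPS codes beginning with 80 or 90
    ("Out of State" or "Unassigned" records)."""
    return len(fips) == 5 and fips[:2] in ("80", "90")


def aggregate(records):
    """Sum counts per Province_State value and across all records.

    records: iterable of dicts with 'FIPS', 'Province_State', and 'count'
    keys (a hypothetical layout used only in this sketch).
    """
    state_totals = defaultdict(int)
    national_total = 0
    for rec in records:
        # State level: every record sharing a Province_State value is summed,
        # including the 80xxx and 90xxx pseudo-county records.
        state_totals[rec["Province_State"]] += rec["count"]
        # National level: every location in the file is summed.
        national_total += rec["count"]
    return dict(state_totals), national_total


# Made-up records, for illustration only.
records = [
    {"FIPS": "25001", "Province_State": "Massachusetts", "count": 10},
    {"FIPS": "25003", "Province_State": "Massachusetts", "count": 5},
    {"FIPS": "90025", "Province_State": "Massachusetts", "count": 2},
    {"FIPS": "80025", "Province_State": "Massachusetts", "count": 1},
    {"FIPS": "", "Province_State": "Diamond Princess", "count": 3},
]

state_totals, national_total = aggregate(records)

# Sum over the counties for which forecasts are requested.
county_sum = sum(
    rec["count"]
    for rec in records
    if rec["Province_State"] == "Massachusetts"
    and not is_pseudo_county(rec["FIPS"])
)

print(county_sum, state_totals["Massachusetts"], national_total)
```

In this toy data the state total exceeds the sum over its forecastable counties because of the two pseudo-county records, and the national total exceeds the state total because of the cruise ship row.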
The Actual line in the visualization is based on the JHU CSSE group truth data. The visualization uses this Cumulative Death JSON and this Incident Death JSON. This Python script creates these JSONs.
The actual data the visualization uses (Forecasts + Truth Data) is in this folder. These JSONs are created from the truth data by the commands in 0-init-vis.sh when the visualization is built. The file called "season-latest" is the default view, which also shows Cumulative Deaths. For each State key in the JSON, there is an Actual object that contains the truth data shown in the visualization. More on the JSON structure here.
The Zoltar truth data is created with this python script and can be found here.
JHU truth data are updated every 6 hours through GitHub Actions CI, while NYTimes and USAFacts data are updated manually and sporadically.