This repository has been archived by the owner on Dec 27, 2022. It is now read-only.

A simple tool that automatically collects and parses data on the COVID-19 outbreak in The Netherlands

tomdewildt/covid-19-data-collector

COVID-19 Data Collector

DEPRECATED

This tool is deprecated in favor of the government provided dashboard and data.

This tool automatically collects and parses the data on the COVID-19 outbreak in The Netherlands from the RIVM and NICE websites.

How To Run

Prerequisites:

  • virtualenv version 20.0.3 or later
  • python version 3.8.5 or later
  • pylint version 2.4.4 or later
  • black version 19.10b0 or later

Development

  1. Run make init to initialize the environment.
  2. Run make run/[task] to execute a single task.

Available tasks

  • get_national_dataset retrieves national outbreak data.
  • get_municipality_dataset retrieves outbreak data per municipality.
  • get_intensive_care_dataset retrieves intensive care data.
  • clean_national_dataset cleans the national datasets.
  • clean_municipality_dataset cleans the municipality datasets.
  • clean_intensive_care_dataset cleans the intensive care datasets.
  • merge_national_dataset merges the national datasets.
  • merge_municipality_dataset merges the municipality datasets.
  • merge_intensive_care_dataset merges the intensive care datasets.
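
The task names above follow a regular pattern, a stage followed by a dataset name. The sketch below builds the matching make targets for one dataset. It assumes the stages are meant to run in the order get, clean, merge, which the list suggests but does not state. The helper make_targets and the two constants are hypothetical names for illustration and are not part of the tool.

```python
from typing import List

# Stage and dataset names are taken from the task list above.
# ASSUMPTION: the stages run in this order (get, then clean, then merge).
STAGES = ("get", "clean", "merge")
DATASETS = ("national", "municipality", "intensive_care")


def make_targets(dataset: str) -> List[str]:
    """Return the make targets for one dataset, in the assumed stage order."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    return [f"run/{stage}_{dataset}_dataset" for stage in STAGES]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no side effects.
    for dataset in DATASETS:
        for target in make_targets(dataset):
            print(f"make {target}")
```

Each printed line has the form make run/[task], so it can be pasted into a shell after make init has been run.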

Test

  1. Run make init to initialize the environment.
  2. Run make test to execute the tests.

Datasets

This repository contains three datasets that are updated every day. The data is collected from the RIVM and NICE websites.

The data folder contains four subfolders:

  • raw contains the raw datasets.
  • interim contains the cleaned datasets.
  • processed contains the merged datasets.
  • external contains external datasets.

| Dataset | Source | Fields |
|---|---|---|
| rivm-covid-19-national.csv | RIVM | Confirmed Cases (PositiefGetest), Hospitalized (Opgenomen), Deceased (Overleden), Date (Datum) |
| rivm-covid-19-municipality.csv | RIVM | Municipality Code (Gemeentecode), Confirmed Cases (PositiefGetest), Municipality (Gemeente), Province Code (Provinciecode), Province (Provincie), Date (Datum) |
| nice-covid-19-intensive-care.csv | NICE | Date (Datum), Hospitalized Cumulative (OpgenomenCumulatief), Intensive Care (Intensive Care), Survived Cumulative (OverleeftCumulatief), Deceased Cumulative (OverledenCumulatief), Hospitalized (Opgenomen), Newly Hospitalized Suspicious (NieuwOpgenomenVerdacht), Newly Hospitalized Proven (NieuwOpgenomenBewezen) |
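
As a rough illustration of how the national dataset could be consumed, the sketch below reads rows with the Dutch column names listed above using only the Python standard library. Several things here are assumptions: that the file is comma-separated with a header row, that the column order is as shown, that dates are written as YYYY-MM-DD, and that the count columns hold whole numbers. The sample rows are made-up placeholders, not real figures, and read_national is a hypothetical helper rather than part of the tool.

```python
import csv
import io
from typing import Dict, List, Union

# Placeholder rows for illustration only; these are NOT real RIVM figures.
# ASSUMPTION: comma-separated file with a header row and ISO-style dates.
SAMPLE = """Datum,PositiefGetest,Opgenomen,Overleden
2020-03-01,10,2,0
2020-03-02,18,5,1
"""

# Count columns from the field list above that are converted to integers.
INT_COLUMNS = ("PositiefGetest", "Opgenomen", "Overleden")


def read_national(handle) -> List[Dict[str, Union[str, int]]]:
    """Parse national rows, keeping Datum as text and counts as integers."""
    rows = []
    for record in csv.DictReader(handle):
        row: Dict[str, Union[str, int]] = {"Datum": record["Datum"]}
        for column in INT_COLUMNS:
            row[column] = int(record[column])
        rows.append(row)
    return rows


rows = read_national(io.StringIO(SAMPLE))
print(rows[-1])
```

To read a real file, pass an open file handle instead of the in-memory sample, for example open(path, newline=""), where path points at rivm-covid-19-national.csv in whichever data subfolder holds it.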

References

RIVM COVID-19

NICE COVID-19

Beautiful Soup

Pandas

Numpy

Pytest
