Analysis of preventive measures (COVID-19)

Disclaimer: For informative purposes only. I am not a professional nor do I claim to be one. This is a school project that I extended upon in my spare time. I hold no responsibility for data accuracy or data gathering. Full credit goes to good people that made these datasets, links bellow.

That said, dataset created is free for you to use.

Dataset can be found in data folder

Project goal

Goal of this project was to create a dataset containing facts about countries and preventive measures taken by those countries during COVID-19 pandemic, so far. For these purposes, I combined data from multiple sources to create a simple data warehouse that stores data on new measures implemented for each day in the period from 01-01-2020 to 02-06-2020, for every country. With this I hope to provide a single dataset for analysing the effectiveness of preventive measures worldwide.

Data sources

Data for this project was combined from 4 sources:

ACAPS COVID-19: Government Measures Dataset by The Humanitarian Data Exchange - puts together all the measures implemented by governments worldwide in response to the Coronavirus pandemic.
COVID-19 Coronavirsu data by EU Open Data Portal - data on the geographic distribution of COVID-19 cases worldwide.
Oxford COVID-19 Government Response Tracker (OxCGRT) by Hale et. al. - systematically collects information on several different common policy responses that governments have taken to respond to the pandemic on 17 indicators such as school closures and travel restrictions.
The World by Income and Region by The World Bank - The World Bank classifies economies for analytical purposes into four income groups: low, lower-middle, upper-middle, and high income.

ETL

Image below shows ETL process. The transforming part was done with Python Pandas, code can be found in ETL.ipynb document in this repo. Main part of ETL was to clean the data and to split it into meaningful tables.

Data model

Data is modeled in the form of star schema consisting of 4 dimension tables and 1 fact table. Dimension tables contain: country, date and measures data. There is also one junk dimension (additional_dim) that contains additional data for each day. (Can be used in the future to get more insights)

Data descriptions

Country dimension (country_dim)

Column name	Description
country_name	Name of the country.
country_code	3 letter code unique to every country.
geo_id	Similar to country code.
continent	Continent that country resides on. Similar (or same as) to region.
region	Region of the world in which the country resides.
population_data_2018	Population number (data from 2018.).
income_group_2020	Income group of the country, as categorised by The World Bank (for the year 2020.).

Measure dimension (measures_dim)

Column name	Description
measure	Name of the measure. Categorical value, there is 34 of them.
measure_category	Each measure is categorised in one of six categories.

Fact table (measures_by_day_fact)

Column name	Description
additional_id	Reference to additional_dim table.
date_id	Reference to date_dim table.
measure_id	Reference to measure_dim table.
country_id	Reference to country_dim table.
cases	New cases for each day for each county.
deaths	Number of death for each day for each country.
stringency_index	Index developed by OxCGRT, shows how strict government response measures are. For each day for each country.
government_response_index	Index developed by OxCGRT, shows government responsivnes for each day for each country.
containement_health_index	Another index by OxCGRT, more info on link
economic_support_index	Another index by OxCGRT, more info on link

Name		Name	Last commit message	Last commit date
Latest commit History 10 Commits
data		data
diagrams		diagrams
images		images
.gitignore		.gitignore
ETL.ipynb		ETL.ipynb
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Analysis of preventive measures (COVID-19)

Project goal

Data sources

ETL

Data model

Data descriptions

Country dimension (country_dim)

Measure dimension (measures_dim)

Fact table (measures_by_day_fact)

About

Languages

marzekan/covid-19-measures-analysis

Folders and files

Latest commit

History

Repository files navigation

Analysis of preventive measures (COVID-19)

Project goal

Data sources

ETL

Data model

Data descriptions

Country dimension (country_dim)

Measure dimension (measures_dim)

Fact table (measures_by_day_fact)

About

Resources

Stars

Watchers

Forks

Languages