Data from published updates by the Spanish Ministry of Health, detailing the number of diagnosed cases, ICU admissions and deaths by Autonomous Region (Comunidad Autónoma).
Up-to-date as of: 27/03/2020 at 19h
UPDATE: I've been told that the Ministry of Health has recently started publishing the data feed via the Instituto de Salud Carlos III. You can find the data here: https://covid19.isciii.es/resources/serie_historica_acumulados.csv
- The data has been extracted from the "daily" updates published by the Spanish Ministry of Health on this webpage
- The data is published in an unhelpful PDF format, so I used the open-source PDF-extraction tool Tabula to extract the tabular data into CSV-formatted files.
- Warning I: The format of the published data has varied several times, so in some cases I had to manually update the extracted CSVs to achieve some level of consistency in their format. I have not altered the values in any way (unless they were altered by accident during the extraction process)
- Warning II: The Health Ministry did not publish data for March 7th and 8th. Also, for March 14th and 15th they did not publish the ICU admission counts.
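Because some days have no published update, it can help to list the gaps explicitly before working with the series. The following is a minimal sketch, not part of this repository: the function name `missing_dates` and the example dates are assumptions made for illustration.

```python
from datetime import date, timedelta


def missing_dates(published):
    """Return the calendar days between the first and last entry with no update."""
    days = sorted(set(published))
    if not days:
        return []
    present = set(days)
    span = (days[-1] - days[0]).days
    candidates = (days[0] + timedelta(days=i) for i in range(span + 1))
    return [day for day in candidates if day not in present]


# Hypothetical input: updates exist for 5-10 March, except the 7th and 8th.
published = [date(2020, 3, d) for d in (5, 6, 9, 10)]
gaps = missing_dates(published)
print(gaps)
```

Any date returned here should be treated as "no data", not as zero new cases.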
- extracted_data/: Raw data as extracted by Tabula
- consolidated/: Consolidated diagnosed.csv, icu.csv and deaths.csv data from the data in extracted_data/.
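To illustrate the relationship between the two directories, here is a rough sketch of how per-day extracts could be pivoted into one table per metric. It is not the code used in this repository. The column names (`region`, `diagnosed`, `icu`, `deaths`), the one-file-per-date layout and all numbers are assumptions chosen for the example; the real extracted files may look different.

```python
import csv
import io

METRICS = ("diagnosed", "icu", "deaths")


def consolidate(daily_csvs):
    """Pivot {date: csv_text} into {metric: {region: {date: value}}}.

    Empty cells (for example, days without ICU counts) are stored as None.
    """
    tables = {metric: {} for metric in METRICS}
    for day, text in sorted(daily_csvs.items()):
        for row in csv.DictReader(io.StringIO(text)):
            for metric in METRICS:
                raw = (row.get(metric) or "").strip()
                value = int(raw) if raw else None
                tables[metric].setdefault(row["region"], {})[day] = value
    return tables


# Made-up regions and numbers, for illustration only.
daily = {
    "2020-03-13": "region,diagnosed,icu,deaths\nRegion A,100,10,5\nRegion B,20,2,1\n",
    "2020-03-14": "region,diagnosed,icu,deaths\nRegion A,150,,8\nRegion B,30,,2\n",
}
tables = consolidate(daily)
print(tables["icu"]["Region A"])
```

Keeping missing values as `None` instead of `0` preserves the difference between "not published" and "no cases".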
- Visit the Health Ministry webpage and click on the link "Actualización nºXX: enfermedad por SARS-CoV-2 (COVID-19)"
- Download the PDF file to your local host
- Install Tabula and follow the instructions to extract the table containing the daily update
- Extract the data to a CSV, double check the format doesn't change and
- Run notebooks/000_consolidate_data.ipynb to update the consolidated files
- Inspect the data with notebooks/001_plots.ipynb to make sure it makes sense
- Commit and push the resulting consolidated files
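As a complement to the visual inspection step, a quick automated check can flag suspicious values before committing. The sketch below assumes the counts are cumulative, so each region's series should never decrease; that assumption, the function name `find_decreases` and the sample values are all illustrative and not taken from the repository.

```python
def find_decreases(series):
    """Return (previous_date, date, previous_value, value) for every drop.

    `series` maps ISO dates to cumulative counts. None marks a missing value
    and is skipped rather than treated as zero.
    """
    drops = []
    previous = None
    for day in sorted(series):
        value = series[day]
        if value is None:
            continue
        if previous is not None and value < previous[1]:
            drops.append((previous[0], day, previous[1], value))
        previous = (day, value)
    return drops


# Made-up values: the last entry is lower than the one before the gap.
series = {"2020-03-10": 35, "2020-03-11": 47, "2020-03-12": None, "2020-03-13": 41}
drops = find_decreases(series)
print(drops)
```

A reported drop is not necessarily an extraction error, since the source may have corrected an earlier figure, but it is worth comparing against the original PDF.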