This repo is used to store and serve, via GitHub Pages, data collected daily from https://dadescovid.cat (institutional data published by the Generalitat de Catalunya) and Seguiment Covid19 BCN (institutional data collected and published by the Ajuntament de Barcelona). The reasons for collecting this data are:
- The data will be used by apps which don't need more than daily updates, like Covid Data Refactored, an open source serverless Progressive Web Application
- Original data requests might be blocked by CORS or other technologies
- Original servers and data might not be efficient enough
- When applicable, normalize data from various servers
The data collected from https://dadescovid.cat is minimally adapted before publishing:
Maps:
- Transform JS statements into JSON objects
- Remove some non-visible `id`s to cut user and server resource consumption (see the sketch below)
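The `id` reduction is, in spirit, something like this (a minimal sketch; the file names and the `svg_<number>` id pattern are assumptions, not the script's actual rules):

```js
// Minimal sketch: strip auto-generated, non-visible ids from a scraped SVG.
// Assumption: ids matching svg_<number> are never referenced by CSS, <use> or JS.
const fs = require('fs');

const svg = fs.readFileSync('map.svg', 'utf8');
const reduced = svg.replace(/\s+id="svg_\d+"/g, '');
fs.writeFileSync('map.reduced.svg', reduced);
```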
Charts:
- Transform JS statements into JSON objects
- Transform HTML tag attributes and content into JSON-structured data (see the sketch below)
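The HTML-to-JSON transform is, in spirit, something like this (a minimal sketch with invented markup; the real selectors and output shape may differ):

```js
// Minimal sketch: turn <option value="..">label</option> pairs into structured JSON.
// The markup below is invented for illustration.
const html = '<option value="01">Barcelona</option><option value="02">Girona</option>';

const regions = [...html.matchAll(/<option value="([^"]+)">([^<]+)<\/option>/g)]
  .map(([, id, name]) => ({ id, name }));

console.log(JSON.stringify(regions)); // [{"id":"01","name":"Barcelona"}, ...]
```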
The collected data from Seguiment Covid19 BCN is deeply reshaped, throwing away the unneeded/repetitive data.
This repo might collect other data in the future, from the same server, its backend server or from third-party servers (EU statistics servers? Data collection from other regions?).
This process is executed from a GitHub Workflow (cron-scheduled some minutes after the official data publication at 10am CEST). Once the data is obtained, it is deployed to this repo's GitHub Pages in the `gh-pages` branch.
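Such a workflow could look roughly like this (a minimal sketch; the schedule, script name and deploy action are assumptions, not the repo's actual workflow):

```yaml
# Minimal sketch of the kind of workflow described above; names and times are assumptions.
name: collect-data
on:
  schedule:
    - cron: '15 8 * * *'   # 8:15 UTC ≈ 10:15 CEST, shortly after publication
jobs:
  collect:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: ./collect.sh                     # hypothetical collection entry point
      - uses: peaceiris/actions-gh-pages@v3   # one common way to deploy to gh-pages
        with:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          publish_dir: ./out                  # assumed output directory
```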
The data is collected by an ugly Bash script. This script collects the interesting parts (the maps' SVG source, JS code with data in it) and saves them into files. The SVG files are saved as-is. The JS files are executed with Node to output the collected data as JSON.
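Conceptually, that Node step looks something like this (a minimal sketch; the file name and the `mapData` variable are assumptions about the shape of the scraped code):

```js
// Minimal sketch: run a scraped JS statement in a sandbox and print its data as JSON.
// 'map.js' and the 'mapData' variable name are assumptions for illustration.
const fs = require('fs');
const vm = require('vm');

const sandbox = {};
vm.createContext(sandbox);
vm.runInContext(fs.readFileSync('map.js', 'utf8'), sandbox);

console.log(JSON.stringify(sandbox.mapData));
```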
The data is collected by a NodeJS package. This script scrapes data from HTML tags and JS code. It generates individual JSON files for each region/population selector, and a global JSON index file with the regions' recursive structure and all the download links. Deep use of `async`/`await`.
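In spirit, the per-region output works like this (a minimal sketch assuming Node 18+ for global `fetch`; the URL scheme, region list and output shape are invented for illustration):

```js
// Minimal sketch of the per-region output described above; the endpoint,
// the region list and the output shape are assumptions for illustration.
const fs = require('fs/promises');

async function scrapeRegion(region) {
  const response = await fetch(`https://dadescovid.cat/?id=${region}`); // assumed URL scheme
  const html = await response.text();
  // ...parse the relevant tags and JS statements here (omitted)...
  return { region, bytes: html.length };
}

async function main(regions) {
  const index = { regions, links: [] };
  for (const region of regions) {
    const data = await scrapeRegion(region);
    await fs.writeFile(`${region}.json`, JSON.stringify(data));
    index.links.push(`${region}.json`);
  }
  await fs.writeFile('index.json', JSON.stringify(index, null, 2));
}

main(['catalunya', 'barcelona']).catch(console.error);
```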
The data is collected by a nice NodeJS package. This script uses a self-made version of SockJS to scrape data from an RStudio/Shiny server. It generates individual JSON files for each datasource or datasource section, and a global JSON index file with the data and all the download links. Some use of Streams and Iterator Generators, and deep use of `async`/`await`. Very fun stuff!
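The Streams + generator combination looks roughly like this (a minimal sketch; the socket API and message shape are assumptions, not the real Shiny protocol):

```js
// Minimal sketch of the Streams + iterator-generator pattern mentioned above;
// the socket API and the message shape are assumptions, not the real protocol.
async function* messages(socket) {
  // socket is assumed to be an async-iterable stream of raw frames
  for await (const frame of socket) {
    yield JSON.parse(frame.toString());
  }
}

async function collectSections(socket) {
  const sections = {};
  for await (const message of messages(socket)) {
    if (message.values) Object.assign(sections, message.values); // hypothetical shape
  }
  return sections;
}

module.exports = { collectSections };
```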
The application, scripts and documentation in this project are released under the GNU General Public License v3.0.
The license of the data scraped from https://dadescovid.cat and saved into the `Charts` and `Maps` directories is the same as the original: the Open Data Commons Attribution License, as stated in the backend API page owned by the Generalitat de Catalunya.
The license of the data scraped from https://dades.ajuntament.barcelona.cat/seguiment-covid19-bcn is the same as the original: Creative Commons CC-BY. The owner is the Ajuntament de Barcelona.