Commit
Add weekly hash check to NUTS file and XLSX to YAML utility function (#…
…40)
* Add weekly hash check
* Add utility to convert NUTS to YAML format
* Fix data loading
* Add short documentation on utility function
1 parent 99bf1dc, commit 342913f
Showing 4 changed files with 71 additions and 2 deletions.
@@ -0,0 +1,32 @@
name: Compare File Hash Weekly

on:
  schedule:
    - cron: '0 0 * * 0' # Runs weekly on Sunday at midnight UTC

jobs:
  compare-hash:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Repository
        uses: actions/checkout@v3

      - name: Calculate Local File Hash
        id: local-hash
        run: echo "LOCAL_HASH=$(sha256sum pysquirrel/data/NUTS2021-NUTS2024.xlsx | awk '{print $1}')" >> $GITHUB_ENV

      - name: Download File
        run: curl -o most-recent-version.xlsx "https://ec.europa.eu/eurostat/documents/345175/629341/NUTS2021-NUTS2024.xlsx"

      - name: Calculate Downloaded File Hash
        id: downloaded-hash
        run: echo "DOWNLOADED_HASH=$(sha256sum most-recent-version.xlsx | awk '{print $1}')" >> $GITHUB_ENV

      - name: Compare Hashes
        run: |
          if [ "$LOCAL_HASH" != "$DOWNLOADED_HASH" ]; then
            echo "Hashes do not match!"
            exit 1
          else
            echo "Hashes match!"
          fi
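The same comparison can be reproduced locally before pushing an updated spreadsheet. The following is a minimal sketch in Python, not part of this commit: the helper names `sha256_of_file` and `files_match` are hypothetical, and the sketch assumes both files are already on disk (the download step is left out).

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def files_match(local_path, downloaded_path):
    """Return True if both files have the same SHA-256 digest."""
    return sha256_of_file(local_path) == sha256_of_file(downloaded_path)
```

Calling `files_match("pysquirrel/data/NUTS2021-NUTS2024.xlsx", "most-recent-version.xlsx")` would mirror the workflow's final step; a `False` result corresponds to the "Hashes do not match!" branch.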
@@ -0,0 +1,15 @@
# Updating the NUTS source file

EUROSTAT occasionally updates the current NUTS classification spreadsheet. These updates may be minor and leave region names and codes unchanged, but because they do occur, the package should always use the most up-to-date version of the data.

To this end, a weekly GitHub Actions workflow compares the hash of pysquirrel's copy of the file with the hash of the version hosted on the EUROSTAT website. The workflow fails if the hashes differ.

If that happens, download the newest version of the spreadsheet and, using a local installation of pysquirrel, run:

```python
from pysquirrel.core import nuts_to_yaml

nuts_to_yaml("path/to/latest_nuts.xlsx", "path/to/output")
```

The function parses the XLSX file and writes the two corresponding YAML files (one for NUTS regions and one for Statistical Regions). Because YAML is plain text, changes to the data are easy to track in GitHub commits.
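To illustrate why plain-text YAML is easier to review than a binary XLSX file, here is a small sketch using Python's standard `difflib`. Everything in it is an assumption made for illustration: the two YAML snippets, the `XX01`/`XX02` codes and names, and the `yaml_diff` helper are hypothetical and do not reflect pysquirrel's actual output schema.

```python
import difflib

# Hypothetical YAML snippets: placeholder codes and names,
# not real NUTS entries and not pysquirrel's actual schema.
OLD_YAML = """\
- code: XX01
  name: Example Region
- code: XX02
  name: Another Region
"""

NEW_YAML = """\
- code: XX01
  name: Example Region
- code: XX02
  name: Another Region (renamed)
"""


def yaml_diff(old_text, new_text):
    """Return a unified diff between two YAML documents as a list of lines."""
    return list(
        difflib.unified_diff(
            old_text.splitlines(),
            new_text.splitlines(),
            fromfile="old.yaml",
            tofile="new.yaml",
            lineterm="",
        )
    )


# Keep only the added/removed lines, skipping the "---"/"+++" file headers.
changed = [
    line
    for line in yaml_diff(OLD_YAML, NEW_YAML)
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
print("\n".join(changed))
```

A renamed region shows up as one removed and one added line, which is the kind of change a reviewer can read directly in a commit; the same edit inside the XLSX would only appear as "Binary file not shown."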
Binary file not shown.