Add sample data structure, local validation scripts, and documentation
Also - update errors/clarity in datapackage.json template
Showing 24 changed files with 1,393 additions and 646 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,68 @@ | ||
#!/usr/bin/env bash | ||
|
||
# Script: validate_frictionless_package.sh | ||
# Description: Bash script to validate a Frictionless Data Package using the Frictionless CLI. | ||
# Usage: validate_frictionless_package.sh [-v tides_version | -l local_schema_location] [-d dataset_location] | ||
# -v tides_version: Optional. Specify the version of the TIDES specification or 'local' to | ||
# use a local schema. Default is to use the schema specified in the datapackage. | ||
# -l local_schema_location: Optional. Specify the location of the local schema directory. | ||
# Default is '../spec'. Is only used if tides_version = local. | ||
# -d dataset_location: Optional. Specify the location of the TIDES datapackage.json. | ||
# Default is the current directory. | ||
|
||
# Set default values | ||
tides_version="" | ||
local_schema_location="../spec" | ||
dataset_location="." | ||
|
||
# Parse command-line arguments | ||
while getopts ":v:l:d:" opt; do | ||
case $opt in | ||
v) | ||
tides_version=$OPTARG | ||
;; | ||
l) | ||
local_schema_location=$OPTARG | ||
;; | ||
d) | ||
dataset_location=$OPTARG | ||
;; | ||
\?) | ||
echo "Invalid option: -$OPTARG" >&2 | ||
exit 1 | ||
;; | ||
esac | ||
done | ||
|
||
# Create a temporary data package if using a different schema reference or a local schema | ||
tmp_datapackage="" | ||
if [ "$tides_version" != "" ]; then | ||
tmp_datapackage=$(mktemp) | ||
cp "$dataset_location/datapackage.json" "$tmp_datapackage" | ||
fi | ||
|
||
# Set the schema URL based on the option chosen | ||
schema_url="" | ||
if [ "$tides_version" == "local" ]; then | ||
schema_path_prefix="$local_schema_location" | ||
else | ||
schema_path_prefix="https://raw.githubusercontent.com/TIDES-transit/TIDES/$tides_version/spec" | ||
fi | ||
|
||
# Update the 'schema' property in the temporary copy of the datapackage.json file, if applicable | ||
if [ "$tmp_datapackage" != "" ]; then | ||
# Use '#' as the sed delimiter so slashes in the prefix need no escaping; \1 keeps the schema file name | ||
sed -E -i.bak "s#\"schema\": \"([^/\"]+\.schema\.json)\"#\"schema\": \"$schema_path_prefix/\1\"#g" "$tmp_datapackage" && rm -f "$tmp_datapackage.bak" | ||
dataset_location="$tmp_datapackage" | ||
fi | ||
|
||
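# Validate the data package JSON against the TIDES schema | ||
# (validate-data-package-json.sh accepts -r ref, -l local_schema_location, and -f datapackage_file) | ||
datapackage_file="$dataset_location" | ||
if [ -d "$dataset_location" ]; then | ||
datapackage_file="$dataset_location/datapackage.json" | ||
fi | ||
validator_args=(-f "$datapackage_file") | ||
if [ "$tides_version" == "local" ]; then | ||
validator_args+=(-l "$local_schema_location") | ||
elif [ "$tides_version" != "" ]; then | ||
validator_args+=(-r "$tides_version") | ||
fi | ||
./validate-data-package-json.sh "${validator_args[@]}" | ||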
|
||
# Validate the Frictionless Data Package using the Frictionless CLI | ||
frictionless validate "$dataset_location" | ||
|
||
# Remove the temporary data package file, if applicable | ||
if [ "$tmp_datapackage" != "" ]; then | ||
rm "$tmp_datapackage" | ||
fi |
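The schema-rewriting step in the script above can be sketched in isolation as follows. This is a minimal sketch rather than the commit's code: the function name `rewrite_schema_refs` is hypothetical, it assumes every schema reference has the form `"schema": "<name>.schema.json"`, and it assumes the prefix contains no `#` or `&` characters. It writes to a scratch file instead of using `sed -i`, since the in-place flag behaves differently between GNU and BSD sed.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the commit): prefix every
# "schema": "<name>.schema.json" reference in a datapackage file.
#   $1 = schema path prefix (URL or directory, no trailing slash)
#   $2 = path to the datapackage.json copy to rewrite
rewrite_schema_refs() {
  local prefix=$1 file=$2 tmp
  tmp=$(mktemp) || return 1
  # '#' is the sed delimiter, so slashes in the prefix need no escaping.
  # The capture group \1 carries the original schema file name over.
  if sed -E "s#\"schema\": \"([^/\"]+\.schema\.json)\"#\"schema\": \"${prefix}/\1\"#g" "$file" >"$tmp"; then
    mv "$tmp" "$file"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

Called as `rewrite_schema_refs "../spec" "$tmp_datapackage"`, it leaves references that already contain a slash untouched, so running it twice does not stack prefixes.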
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,64 @@ | ||
#!/usr/bin/env bash | ||
|
||
# Script to validate a local JSON file against a schema specified in a GitHub repository. | ||
# Usage: validate-data-package-json.sh [-r ref | -l local_schema_location] [-f datapackage_file] | ||
# -r ref: Optional. Specify the ref name of the GitHub repository. Default is 'main'. | ||
# -l local_schema_location: Optional. Specify the location of the local schema directory. | ||
# -f datapackage_file: Optional. Specify the location of the datapackage.json file. Default is 'datapackage.json' in the execution directory. | ||
|
||
# Check if jsonschema-cli is installed | ||
command -v jsonschema-cli >/dev/null 2>&1 || { | ||
echo >&2 "jsonschema-cli is required but not found. You can install it using 'pip install jsonschema-cli'. Aborting." | ||
exit 1 | ||
} | ||
|
||
# Set default values | ||
ref="main" | ||
local_schema_location="" | ||
datapackage_file="datapackage.json" | ||
|
||
# Parse command-line arguments | ||
while getopts ":r:l:f:" opt; do | ||
case $opt in | ||
r) | ||
ref=$OPTARG | ||
;; | ||
l) | ||
local_schema_location=$OPTARG | ||
;; | ||
f) | ||
datapackage_file=$OPTARG | ||
;; | ||
\?) | ||
echo "Invalid option: -$OPTARG" >&2 | ||
exit 1 | ||
;; | ||
esac | ||
done | ||
|
||
echo "Validating data package file $datapackage_file" | ||
|
||
# Set the temporary directory path | ||
temp_dir=$(mktemp -d) | ||
|
||
# Set the schema file path based on the option chosen | ||
schema_file="" | ||
if [ "$local_schema_location" != "" ]; then | ||
schema_file="$local_schema_location/tides-data-package.json" | ||
else | ||
# Download the schema file to the temporary directory | ||
schema_url="https://raw.githubusercontent.com/TIDES-transit/TIDES/$ref/spec/tides-data-package.json" | ||
schema_file="$temp_dir/data-package.json" | ||
|
||
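# schema_url already ends in the schema file name; --fail makes curl return non-zero on HTTP errors such as 404 | ||
if ! curl -s --fail --head "$schema_url" >/dev/null; then | ||
echo "Schema file not found on GitHub for the specified ref: $ref" >&2 | ||
rm -rf "$temp_dir" | ||
exit 1 | ||
fi | ||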
curl -o "$schema_file" "$schema_url" | ||
fi | ||
|
||
# Validate datapackage against the downloaded schema | ||
jsonschema-cli validate "$schema_file" "$datapackage_file" | ||
|
||
# Clean up the temporary directory | ||
rm -rf "$temp_dir" |
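The script above removes its temporary directory only when it runs to the end. One common way to make the cleanup unconditional is an `EXIT` trap, sketched below. This is an illustration under assumptions, not the commit's approach: `with_temp_dir` is a hypothetical helper name, and it runs the given command in a subshell so the trap does not leak into the caller.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of the commit): run a command with a scratch
# directory appended as its last argument, and always remove that directory.
with_temp_dir() {
  (
    temp_dir=$(mktemp -d) || exit 1
    # The EXIT trap fires when this subshell ends, on success or on failure.
    trap 'rm -rf "$temp_dir"' EXIT
    "$@" "$temp_dir"
  )
}
```

The helper returns the exit status of the wrapped command, so a failed download or validation still propagates to the caller while the directory is removed either way.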
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,59 @@ | ||
# Data Package | ||
|
||
TIDES data must include a `datapackage.json` in the format specified by the [`tides-data-package` json schema](https://raw.githubusercontent.com/TIDES-transit/TIDES/main/spec/tides-data-package.json), which is an extension of the [frictionless data package](https://specs.frictionlessdata.io/data-package/) schema. | ||
|
||
You may create your own `datapackage.json` based on the documentation or start with the provided [template](#template), but don't forget to [validate](#validation) it to make sure it is in the correct format! | ||
|
||
## Data Package Format | ||
|
||
{{ frictionless_data_package('spec/tides-data-package.json') }} | ||
|
||
## Tabular Data Resource | ||
|
||
Required and recommended fields for each `tabular-data-resource` are as follows: | ||
|
||
{{ frictionless_data_package('spec/tides-data-package.json',sub_schema="resources") }} | ||
|
||
## Template | ||
|
||
The canonical `datapackage.json` template is available at [`/samples/template/TIDES/datapackage.json`](https://raw.githubusercontent.com/TIDES-transit/TIDES/main/samples/template/TIDES/datapackage.json). | ||
|
||
!!! warning | ||
This version of the `tides-data-package` template depends on the version of the documentation you are viewing and only represents the canonical `tides-data-package` template if you are viewing the `main` documentation version. | ||
|
||
{{ include_file('samples/template/TIDES/datapackage.json',code_type='json') }} | ||
|
||
## Validation | ||
|
||
There are lots of options for validating your `datapackage.json` file including: | ||
|
||
- [Command Line Interface (CLI) Script](#cli) | ||
- [Various online websites](#point-and-drool) | ||
|
||
### CLI | ||
|
||
You can validate your data package file with the script provided in [`/bin/validate-data-package-json`](https://raw.githubusercontent.com/TIDES-transit/TIDES/main/bin/validate-data-package-json). | ||
|
||
??? tip "installation requirements" | ||
|
||
Make sure you have jsonschema-cli installed. You can install it specifically or with all of the other suggested tools using one of the commands below: | ||
|
||
```sh | ||
pip install jsonschema-cli | ||
pip install -r requirements.txt | ||
``` | ||
|
||
```sh title="usage" | ||
validate-data-package-json -f my-datapackage.json | ||
``` | ||
|
||
{{ include_file('bin/validate-data-package-json',code_type='sh') }} | ||
|
||
### Point-and-Drool | ||
|
||
Because a `tides-data-package` is just a JSON Schema, you can use any of the many JSON Schema validators available on the web. Use the [canonical `tides-data-package`](https://raw.githubusercontent.com/TIDES-transit/TIDES/main/spec/tides-data-package.json) or copy and paste the version from below. | ||
|
||
!!! warning | ||
This version of `tides-data-package` is dependent on the version of the documentation you are viewing and only represents the canonical `tides-data-package` if you are viewing the `main` documentation version. | ||
|
||
{{ include_file('spec/tides-data-package.json',code_type='json') }} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,9 @@ | ||
# Sample Data | ||
|
||
Sample data can be found in the `/samples` directory, with one directory for each data sample. | ||
|
||
{{ include_file('samples/README.md')}} | ||
|
||
## Data List | ||
|
||
{{ list_samples('samples') }} |