v6.0.1
Fixed
- All required calls in the API are now paginated by Boto3. This fixes issues where, with more than 50 crawlers in the account, the API failed to retrieve all datasets because the results of the backend call continued onto a next page.
- Fixes an issue where the delete data file endpoint deleted the raw data file from S3; it now deletes the processed file instead.
- Fixes an issue where uploaded files were temporarily stored under just the name they were uploaded with, which caused errors if two identically named files were uploaded within a small window.
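The pagination fix in the first item can be pictured with the sketch below. It assumes the backend call in question is AWS Glue's GetCrawlers, and `list_all_crawler_names` is a hypothetical helper name, not a function taken from the rAPId codebase. The client is passed in so the helper can be exercised without AWS credentials.

```python
from typing import Any, List


def list_all_crawler_names(glue_client: Any) -> List[str]:
    """Collect crawler names from every page of the Glue GetCrawlers call.

    `glue_client` is expected to behave like boto3.client("glue").
    """
    # Boto3 paginators follow NextToken internally until no pages remain,
    # so results beyond the first page are no longer dropped.
    paginator = glue_client.get_paginator("get_crawlers")
    names: List[str] = []
    for page in paginator.paginate():
        names.extend(crawler["Name"] for crawler in page.get("Crawlers", []))
    return names
```

With real credentials this would be called as `list_all_crawler_names(boto3.client("glue"))`.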
Added
- New optional environment variable CATALOG_DISABLED that can be passed to disable the internal data catalog if required.
- New endpoint that allows for protected domains to be deleted. Can be called using the method DELETE /api/protected_domains/{domain}.
- New endpoint that allows for the entire deletion of a dataset from within rAPId. This new method removes all raw and uploaded data files, any schemas, tables and crawlers. It can be thought of as wiping an entire dataset from rAPId. The method can be called using DELETE /api/datasets/{domain}/{dataset}.
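As a rough illustration of calling the two new DELETE endpoints, the sketch below builds the requests with Python's standard library. The base URL, the helper names and the `headers` argument are assumptions made for the example. Authentication is not described in these notes, so any required headers are left to the caller.

```python
import urllib.request
from typing import Dict, Optional
from urllib.parse import quote


def _delete_request(base_url: str, path: str,
                    headers: Optional[Dict[str, str]] = None) -> urllib.request.Request:
    # Build (but do not send) a DELETE request for the given API path.
    return urllib.request.Request(
        base_url.rstrip("/") + path, headers=headers or {}, method="DELETE"
    )


def delete_protected_domain_request(base_url: str, domain: str,
                                    headers: Optional[Dict[str, str]] = None):
    # DELETE /api/protected_domains/{domain}
    return _delete_request(
        base_url, f"/api/protected_domains/{quote(domain, safe='')}", headers
    )


def delete_dataset_request(base_url: str, domain: str, dataset: str,
                           headers: Optional[Dict[str, str]] = None):
    # DELETE /api/datasets/{domain}/{dataset}
    path = f"/api/datasets/{quote(domain, safe='')}/{quote(dataset, safe='')}"
    return _delete_request(base_url, path, headers)
```

A request built this way can be sent with `urllib.request.urlopen(request)`. Since the dataset endpoint removes data files, schemas, tables and crawlers, it is worth double-checking the domain and dataset before sending.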
Changed
- Breaking Change - Domains are now case insensitive. This fixes an issue where, if you created a protected domain with an uppercase name and then another with the same name in lowercase, the permissions did not match up because the two were interpreted as different endpoints. All domains now have to be lower case. To migrate them, you will need to run: migrations/scripts/v6_domain_case_insensitive.py.
- When downloading data, the extra Pandas DataFrame index column is no longer included.
- FastAPI has been upgraded to 0.92.0.
Migration
To migrate from v5 to v6, you will need to run the migration script: migrations/scripts/v6_domain_case_insensitive.py.
This can be done by first installing the Python requirements from requirements.txt and then running python migrations/scripts/v6_domain_case_insensitive.py
You will also need to provide values for the following environment variables, either by defining them in a .env file in the repo root or exporting them to the environment where the script is run.
AWS_REGION
DATA_BUCKET
RESOURCE_NAME_PREFIX
AWS_ACCOUNT_ID
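Before running the migration script, it can help to confirm that the variables listed above are present. The following is a minimal sketch, not part of the migration script itself. `missing_env_vars` is a hypothetical helper, and it only inspects the mapping it is given, so values defined solely in a .env file are not seen unless they have been loaded into the environment first.

```python
import os
from typing import List, Mapping

# Variable names taken from the list above.
REQUIRED_VARS = ("AWS_REGION", "DATA_BUCKET", "RESOURCE_NAME_PREFIX", "AWS_ACCOUNT_ID")


def missing_env_vars(env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the required variable names that are unset or empty in `env`."""
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling `missing_env_vars()` with no arguments checks the current process environment; an empty list means all four variables are set.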
v5.0.1 - 2023-02-02
See v5.0.1 changes
Fixed
- Fix data always being written to version 1 location
Security
- Upgrade GitPython to 3.1.30
v4.2.0 - 2022-12-14
See v4.2.0 changes
v4.2.0 improves the behaviour of the query dataset endpoint to allow the querying of large datasets (>100000 rows).
Changed
The query dataset endpoint can now be used to query large datasets (>100000 rows), provided the query includes a limit clause ensuring that fewer than 100000 rows of data will be returned.
v4.1.1 - 2022-11-23
v4.1.1 fixes a few issues with the schema creation UI
Fixes
- Create client error
- Organisation typo
- Keep sensitivity on second create schema UI page
- Fix owner name validation for create schema
v4.0.0 - 2022-09-21
See v4.0.0 changes
v4.0.0 introduces schema versioning for datasets and allows both uploads and downloads against a specific version. It also allows large file uploads and downloads.
Added
- Schema versioning
- Schema update endpoint
- Uploading and downloading a specific version, with the latest version used by default
- Large file upload/download
- Track upload/download job status
- List job status endpoint
- UI
- Task status flow
Changed
- Schema versioning
- New schema upload defaulting to version 1
- Upload/Download data defaults to latest version
Security
- Security headers
- Tracing requests by subject ID
v2.0.0 - 2022-08-19
See v2.0.0 changes
v2.0.0 provides a complete overhaul of how we handle authorisation, as well as an extended UI.
Fixed
- Consistent exception handling
Added
- Endpoints:
- Create subject
- List subjects
- Delete subject
- Modify subject permissions
- Get subject permissions
- Get all permissions
- UI
- User management flows
- Data management flows
- UI user journey test
Changed
- Complete overhaul of authorisation process
v1.2.0 - 2022-05-31
See v1.2.0 changes
Added
- Documentation improvements
- OpenAPI spec includes endpoint behaviour documentation
- Added example scripts for programmatic interaction
v1.0.0 - 2022-05-10
See v1.0.0 changes
Added
- Documentation and usage guides can be found here
- First full application release
- Features:
- Generate a schema from a dataset
- Upload a schema
- Upload a dataset
- List available datasets
- Get metadata information for a dataset
- Query a previously uploaded dataset
- Add a new client app