Merge pull request #33 from no10ds/feature/changelog

Feature/changelog
no10ds · Sep 6, 2023 · 5f5b155
2 parents 17869ad + 833a8f8

Showing 5 changed files with 117 additions and 0 deletions.
diff --git a/Makefile b/Makefile
@@ -187,3 +187,9 @@ release:
 	@python get_latest_release_changelog.py
 	@gh release create ${version} -F latest_release_changelog.md
 	@rm -rf latest_release_changelog.md
+
+
+# Migration --------------------
+##
+migrate-v7:			## Run the migration
+	@cd api/; ./batect migrate-v7 -- "--layer ${layer} --all-layers ${all-layers}"
diff --git a/api/batect.yml b/api/batect.yml
@@ -121,3 +121,9 @@ tasks:
     run:
       container: service-image
       command: 'python get_latest_release_changelog.py'
+
+  migrate-v7:
+    description: Run the rAPId migration script
+    run:
+      container: service-image
+      command: 'python migrations/scripts/v7_layer_migration.py'
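
Note: the Makefile target above forwards `--layer` and `--all-layers` to `migrations/scripts/v7_layer_migration.py`, which is not included in this diff. As a minimal sketch of how a script could read those two flags, assuming `--all-layers` is a comma-separated list (as the usage in `docs/migration.md` suggests), something like the following would work. The function name `parse_migration_args` is hypothetical and this is not the script's actual code.

```python
import argparse


def parse_migration_args(argv):
    """Hypothetical parser for the two flags forwarded by `make migrate-v7`."""
    parser = argparse.ArgumentParser(
        description="v7 layer migration (illustrative sketch only)"
    )
    parser.add_argument(
        "--layer", required=True,
        help="Layer that the existing datasets are moved to",
    )
    parser.add_argument(
        "--all-layers", required=True,
        help="Comma-separated list of every layer in the instance",
    )
    args = parser.parse_args(argv)
    # Assumption: layers are comma-separated, e.g. raw,curated,presentation
    all_layers = [name.strip() for name in args.all_layers.split(",") if name.strip()]
    return args.layer, all_layers
```

Calling `parse_migration_args(["--layer", "raw", "--all-layers", "raw,curated,presentation"])` would return the target layer and the list of three layer names.
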
diff --git a/docs/changelog.md b/docs/changelog.md
@@ -0,0 +1,26 @@
+# Changelog
+
+## v7.0.0 - _2023-09-06_
+
+See [v7.0.0] changes.
+
+### Major Changes
+
+- Layers have been introduced to rAPId. These are now the highest level of grouping for your data. They allow you to separate your data into areas that relate to the layers in your data architecture, e.g. `raw`, `curated`, `presentation`. You will need to specify your layers when you create or migrate a rAPId instance.
+- All the code is now in this monorepo. The previous [Infrastructure](https://github.com/no10ds/rapid-infrastructure), [UI](https://github.com/no10ds/rapid-ui) and [API](https://github.com/no10ds/rapid-api) repos are now deprecated. This will ease the use and development of rAPId.
+- Schemas are now stored in DynamoDB, rather than S3. This offers speed and usability improvements, as well as making rAPId easier to extend.
+- Code efficiency improvements. There were several areas in rAPId where we were executing costly operations that caused performance to degrade at scale. We've fixed these inefficiencies, reducing the complexity in these areas from O(n²) to O(n).
+- Glue Crawlers have been removed, and Athena tables are now created directly by the API instead. Data is now available to query immediately after it is uploaded, rather than after the previous wait (approximately 3 minutes) while crawlers ran. This also offers scalability benefits because, without crawlers, we are not dependent on the number of free IPs within the subnet.
+- Improved UI testing with Playwright.
+
+### Breaking Changes
+
+- All dataset endpoints are now prefixed with `layer`, typically changing from `domain/dataset` to `layer/domain/dataset`.
+- All SDK functions that interact with datasets now require a `layer` argument.
+
+### Migration
+
+- See the [migration doc](migration.md) for details on how to migrate to v7 from v6.
+
+[Unreleased changes]: https://github.com/no10ds/rapid/compare/v7.0.0...HEAD
+[v7.0.0]: https://github.com/no10ds/rapid/v7.0.0
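
Note: the endpoint change listed under Breaking Changes can be pictured with a small helper that prefixes a v6 `domain/dataset` path with its layer. This is an illustrative sketch only; `to_v7_dataset_path` is a hypothetical name and not part of the rAPId API or SDK, and the example domain and dataset names are made up.

```python
def to_v7_dataset_path(v6_path: str, layer: str) -> str:
    """Prefix a v6 `domain/dataset` path with a layer (hypothetical helper)."""
    # Expect exactly two segments; anything else raises ValueError on unpacking.
    domain, dataset = v6_path.strip("/").split("/")
    return f"{layer}/{domain}/{dataset}"


# Example with made-up names: a dataset migrated into the `raw` layer.
print(to_v7_dataset_path("transport/journeys", "raw"))
```
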
diff --git a/docs/migration.md b/docs/migration.md
@@ -0,0 +1,78 @@
+# Migration
+
+## Migrating to v7 from v6
+
+All of the datasets need to be moved to a layer as part of the v7 migration.
+
+The migration script carries this out, along with other operations.
+
+To execute it, you'll need to decide:
+
+1. Which layer the existing datasets should be moved to.
+2. What the full complement of layers in your rAPId instance should be.
+
+### Prerequisites
+
+#### Infrastructure changes
+
+The v7.0.0 infrastructure changes need to be applied to your rAPId instance.
+
+Update the version of the rAPId Terraform module that you are using, then apply the Terraform changes.
+
+#### Local requirements
+
+You will need the ability to run `Batect`, the requirements for which are listed [here](https://batect.dev/docs/getting-started/requirements/).
+
+### Steps:
+
+#### Clone the repo
+
+To do this, run:
+
+`git clone -b v7.0.0 git@github.com:no10ds/rapid.git`
+
+#### Set your environment variables
+
+Within the rAPId repo, set the following variables in the `.env` file to match those of your rAPId instance and AWS account:
+
+```
+# rAPId instance variables
+AWS_REGION=
+DATA_BUCKET=
+RESOURCE_PREFIX=
+
+# AWS environment variables
+AWS_ACCESS_KEY_ID=
+AWS_SECRET_ACCESS_KEY=
+AWS_SESSION_TOKEN=
+```
+
+#### Run the migration script
+
+You can now run the script and specify your layer configuration. Two example configurations follow.
+
+##### Example 1:
+
+You do not wish to use the layer functionality:
+
+- The existing datasets can be moved to a `default` layer.
+- The full complement of layers can just consist of one, called `default`.
+
+To do this, you would run:
+
+```
+make migrate-v7 layer=default all-layers=default
+```
+
+##### Example 2:
+
+You wish to use the layer functionality and your rAPId instance already holds mostly raw data:
+
+- The existing datasets can be moved to a `raw` layer.
+- The full complement of layers in your rAPId instance can mirror your architecture and be: `raw`, `curated` and `presentation`.
+
+To do this, you would run:
+
+```
+make migrate-v7 layer=raw all-layers=raw,curated,presentation
+```
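
Note: the steps in `docs/migration.md` can be sketched as a small pre-flight check that runs before `make migrate-v7`. This is a hedged illustration, not part of the repository. The helper names are hypothetical. Treating all six variables as required, and requiring the target layer to appear in the full list of layers, are assumptions rather than documented rules.

```python
# Variables listed in the "Set your environment variables" step.
REQUIRED_ENV_VARS = [
    "AWS_REGION",
    "DATA_BUCKET",
    "RESOURCE_PREFIX",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
]


def missing_env_vars(env):
    """Return the names from REQUIRED_ENV_VARS that are unset or empty in `env`."""
    return [name for name in REQUIRED_ENV_VARS if not env.get(name)]


def build_migrate_command(layer, all_layers):
    """Build the `make migrate-v7` invocation as an argument list."""
    # Assumption: the layer receiving existing datasets must be one of the layers.
    if layer not in all_layers:
        raise ValueError(f"layer {layer!r} is not in all-layers {all_layers!r}")
    return [
        "make",
        "migrate-v7",
        f"layer={layer}",
        f"all-layers={','.join(all_layers)}",
    ]


# Example 2 from the migration doc.
print(build_migrate_command("raw", ["raw", "curated", "presentation"]))
```

The mapping passed to `missing_env_vars` could be `os.environ` or the parsed contents of the `.env` file; the returned argument list could be handed to `subprocess.run` from the repository root.
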
diff --git a/mkdocs.yml b/mkdocs.yml
@@ -48,6 +48,7 @@ nav:
           - Patterns:
               - sdk/api/patterns/data.md
   - Releases: changelog.md
+  - Migration: migration.md
   - Contributing: contributing.md
 
 plugins: