History of applied migrations #179
Why do they want this? Also see: #65. I'm hesitant to add this functionality to migrate unless there's a good reason, since the required work is non-trivial: e.g. we'd need to make a schema change to the schema version table for every DB driver, and what dates should be used to backfill existing migrations? It may be easier to add lifecycle hooks to migrate and have the consumer track the state for each migration...
It's not that a customer requires it explicitly; it's more a matter of internal compliance with customers' requirements, where a history would be useful. It can be accomplished with SQL functions, e.g. a trigger that copies every add/update to a new table with a timestamp column. The important parts are the version number and the date. I wanted to bring up the question to see whether other users would find it useful, since it hasn't been asked before. #65 is similar but not what I'm looking for. The implementation details can be decided later if this issue is accepted and put on the roadmap. I understand that it might not be trivial to implement and that some decisions must be made regarding existing migrations, etc.
I'll leave this issue open to gather feedback
IDK if every DB supported by migrate supports triggers. In the meanwhile, so you're not blocked, it sounds like you could start tracking history yourself by adding a trigger.
Yeah, in the meantime we can track history with a trigger. I plan on creating the table and trigger with a migration. I can share the final result when it's done, should anyone else find it useful.
As an additional data point, we're running a service where database migrations are applied on startup with migrate, and we found that a number of migrations had been skipped. This, of course, comes down to the fact that certain pull requests introducing migrations were merged out of order, and we should've been more diligent about at least merging these in order, if not re-numbering them for the right order. However, the fact still remains that the migration system did not error out on these missing migrations, as it does when a version stored in the database does not have a corresponding migration file (this sometimes happens when testing changes locally, jumping between branches). It would be wonderful if we could apply these missing migrations in their versioned order, but throwing an error is imperative for catching these issues early, and tracking the history is, I believe, mandatory for implementing this. I can try and work up some reasonable implementation for at least the subset of database engines I can test for, if you'd be willing to entertain the idea.
@deuill I'm glad you caught the issue before going to production! Have you seen my comment here? #65 (comment) I'm interested in solving this, but I'm not sure what the best solution to the problem (detecting changed/missing migrations) is. We should discuss solutions before implementing anything. Goals for any solution (ordered by importance):
Nice to have for solutions:
Current solutions:
I very much agree with @deuill. I don't see much benefit in maintaining/detecting modifications to migration files; once merged with the code or released, they should not be touched at all. History is essential, though, to catch possible production issues, especially with dependencies. Apparently this was already raised two years ago in mattes/migrate#237 with no resolution agreed on.
100% agree as well. Having a history of migrations and being able to apply missing migrations (out of order) are a must. Maybe a feature for v5?
A migration (for PostgreSQL) to track the history of applied migrations. This adds a new row to the history table for each applied migration.
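For reference, here is a minimal sketch of what such a trigger-based history migration could look like in PostgreSQL. The table, function, and trigger names below are assumptions for illustration, not the exact implementation referenced above:

-- Hypothetical names; adjust to your own conventions.
CREATE TABLE IF NOT EXISTS schema_migrations_history (
    version    bigint      NOT NULL,
    applied_at timestamptz NOT NULL DEFAULT now()
);

-- Record a row in the history table whenever migrate writes a version.
CREATE OR REPLACE FUNCTION track_applied_migration() RETURNS trigger AS $$
BEGIN
    INSERT INTO schema_migrations_history (version, applied_at)
    VALUES (NEW.version, now());
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Fire only once a migration has finished (dirty = false).
-- Note: this also records downward moves, since they rewrite the version row as well.
CREATE TRIGGER schema_migrations_history_trigger
AFTER INSERT OR UPDATE ON schema_migrations
FOR EACH ROW WHEN (NOT NEW.dirty)
EXECUTE PROCEDURE track_applied_migration();

Applying something like this as an early migration means every later version change gets a timestamped row, which covers the "number and date" requirement discussed above.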
@jabbors that's a good implementation of history for a use case like the one mentioned above, where the contract with the client requires knowing the timestamp. It does not prevent the issue of the migration tool skipping source files, nor does it update the history on migrate down, but it is a great database-side implementation of tracking. We were thinking about something like that just for the sake of checking whether any error occurs. I am most afraid of migration files being skipped from the source, so we will be adding a policy that, upon code merge, the migration file's timestamp version must be the current time to avoid issues. Otherwise, a newer timestamp merged/released earlier would cause any migration file with an older timestamp to never be applied.
Probably one of the best migration tools out there is FlywayDB. Why not get a bit of inspiration from them? https://flywaydb.org/documentation/migrations#overview Something I really like about FlywayDB is that they let you manage your migrations the "way you want". For example, down migrations are optional, and migrations can be run "out of order" (also optional). I would love to have similar features and flexibility in golang-migrate. That would make the tool less opinionated about how migrations should or should not be run, and it would most likely adapt better to different types of workflows/environments. Just my 2 cents :)
This would be a great addition to migrate.
We have the same issue: multiple migrations are merged, not necessarily in the same order they were created, and the expectation is that non-conflicting migrations will still be applied. So there should be a history of applied migrations, and missing ones should be run. Ensuring merges are non-conflicting is, obviously, the developers' task: they must make sure they merge what should be merged, and when.
If this isn't going to be implemented with the suggested approach, I'd recommend changing the table name from
Will this proposed design (#470) meet most of your needs? E.g. everything except auditability.
We have several production environments, and it's common that some of them fall behind, both in terms of deploying application upgrades and in applying new migrations. For this reason, auditability is preferred in case we need to do some troubleshooting and track when a migration was applied. But we have also solved that in Postgres by applying the above trigger function as an initial migration. As for the proposed design, it will be a great addition and will help us with some issues relating to CI.
@jabbors we are planning to implement a similar approach. Are there any references you can share for the implementation you have made? I am still new to golang-migrate, and it seems most of the migration logic is part of the package.
@amanangira this is the implementation we are using: #179 (comment). It meets our requirements and has been working so far. Below is a sample from the table in a live system. Note: we are using sequential version numbers in our migration files.
A good practice that we have adopted is to create the trigger as the very first migration for new databases.
I probably missed this in the thread. This looks interesting. I will spend more time evaluating the approach against our use case. Thank you @jabbors
As a workaround, my team has opted to use a custom required GitHub Action which runs on new PRs. This also requires enforcing the strict status check branch protection rule so that the head branch is up to date with the base branch (ensuring the script compares against the latest migrations).

# .github/workflows/migration-validation.yml
name: migration-validation
on:
  pull_request:
    types: [opened, synchronize, edited, reopened]
    branches:
      - main
jobs:
  migration-validation:
    name: migration-validation
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v2
      - name: update checkout
        run: git fetch --prune
      - name: migration-validation
        run: scripts/migration-validation.sh ${{ github.event.pull_request.base.ref }}
        shell: bash
# scripts/migration-validation.sh
BASE_BRANCH=$1
OLDEST_NEW_MIGRATION_FILE=$(git diff --name-only origin/$BASE_BRANCH --diff-filter=d | grep -m1 db/migrations/)
if [[ -z $OLDEST_NEW_MIGRATION_FILE ]]; then
  echo "no new migrations"
  exit 0
fi
NEWEST_EXISTING_MIGRATION_FILE=$(git ls-tree -r origin/$BASE_BRANCH --name-only | grep db/migrations/ | tail -1)
if [[ -z $NEWEST_EXISTING_MIGRATION_FILE ]]; then
  echo "no existing migrations"
  exit 0
fi
echo "oldest new migration $OLDEST_NEW_MIGRATION_FILE"
echo "newest existing migration $NEWEST_EXISTING_MIGRATION_FILE"
EXISTING_TIMESTAMP="$(basename $NEWEST_EXISTING_MIGRATION_FILE | cut -d '_' -f 1)"
NEW_TIMESTAMP="$(basename $OLDEST_NEW_MIGRATION_FILE | cut -d '_' -f 1)"
if [[ $EXISTING_TIMESTAMP -ge $NEW_TIMESTAMP ]]; then
  echo "existing migration timestamp is greater than or equal to incoming migration timestamp. please update your migrations timestamp."
  exit 1
fi
echo "new migration(s) are safe to merge"
exit 0
Running this as a GitHub Action provides a good feedback loop and seems like a decent hack until a solution is settled on and implemented.
Just an idea: how about checking the filenames before loading them for processing?

package db

import (
	"embed"
	"log"
	"strconv"

	migrate "github.com/golang-migrate/migrate/v4"
	_ "github.com/golang-migrate/migrate/v4/database/postgres"
	"github.com/golang-migrate/migrate/v4/source/iofs"
)

//go:embed migrations
var fs embed.FS

func Migrate() {
	d, err := iofs.New(fs, "migrations")
	// ...
}

// init checks that migration files are in the proper format and that the index has no gaps or reused numbers.
func init() {
	dir, err := fs.ReadDir("migrations")
	if err != nil {
		log.Fatal("Unable to open migrations embedded directory")
	}
	// Each migration should have an .up and a .down file, so the count must be even.
	if len(dir)%2 != 0 {
		log.Fatal("Migration files must be even")
	}
	// Count how many files share each numeric prefix.
	checks := make([]int, len(dir)/2)
	for _, de := range dir {
		// File names are expected to start with a zero-padded five-digit version, e.g. 00001_init.up.sql.
		ix, err := strconv.Atoi(de.Name()[:5])
		if err != nil {
			log.Fatalf("Migration %s does not start with an integer?", de.Name())
		}
		if ix-1 > len(checks)-1 {
			log.Fatalf("Is there a gap in migration numbers? Number %d is way too high", ix)
		}
		checks[ix-1]++
	}
	// Every version index must appear exactly twice (up and down).
	for i, x := range checks {
		if x != 2 {
			log.Fatalf("There are not exactly two migration files with index %05d, found: %d", i+1, x)
		}
	}
}
Thank you for sharing this solution; this is what we chose to do to address this gap in the tool. It's elegant and backward compatible with the way the tool currently works. I would encourage adding a similar solution to the tool to address all of the concerns in this issue. As others have noted here, this seems like a big gap compared to other migration tools I have used.
Adding a Table, Function, and Trigger to track the history of our schema migrations. Currently the `schema_migrations` table only stores the latest migration, so it is difficult to track which migrations were applied and when, and this is an issue when troubleshooting failed or missed migrations. The solution was found in an [issue](golang-migrate/migrate#179 (comment)) for our migration tool. Resolves #330. Example output of the table on a clean database where all migrations ran: ![image](https://user-images.githubusercontent.com/10135546/192883450-a90ec30b-0db3-45b5-9ac9-42f2259e8dfb.png)
Problem
A use case that has come up with a customer requires us to keep a history of when migrations have been applied.
Solution
Most migration systems I've seen keep a history of all applied migrations in the `schema_migrations` table. Instead of just containing the last applied version and dirty state, I would propose that each applied migration end up as a new row in the table, including the version number, name, and date applied. The table structure could be something like the sketch below, which would yield one row per migration as they are applied.
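As a rough illustration of that idea (column names and types here are an assumption, not a settled design), the table could be defined along these lines:

-- Sketch only: one row per applied migration instead of a single mutable row.
CREATE TABLE schema_migrations (
    version    bigint      NOT NULL PRIMARY KEY,
    name       text        NOT NULL,
    applied_at timestamptz NOT NULL DEFAULT now()
);

Each applied migration would then append a row with its version, name, and timestamp, so the full application history stays queryable afterwards.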