Skip to content
This repository has been archived by the owner on Jun 28, 2018. It is now read-only.

Store all migrations that have been applied; run all that are missing #237

Open
abscondment opened this issue May 19, 2017 · 15 comments
Open

Comments

@abscondment
Copy link

Right now you're using a "high-water mark" approach to check whether a migration ought to be applied. If it's less or equal to than the version in the database, it is ignored.

Would you consider a change that stores a row per migration applied?

This would handle certain out-of-order situations that are problematic under the current scheme (example: you create a migration in a feature branch; a day later, you create and deploy migration as part of a hotfix; when the feature branch is merged, the "new" migration has a lower timestamp and is skipped).

FWIW, this is the path that Rails took for ActiveRecord migrations. It started with the single-row approach and moved over to "store everything I've run" in 2008.

@mattes
Copy link
Owner

mattes commented May 20, 2017

Interesting. I wasn't aware Rails does that and couldn't find that behavior in https://github.com/rails/rails/blob/master/activerecord/lib/active_record/migration.rb Can you help me out or send some other docs?

@abscondment
Copy link
Author

abscondment commented May 20, 2017

The behavior is actually there -- for example:

  • get_all_versions queries the schema_migrations table for all of the versions that have actually been inserted
  • needs_migration? finds the migration files on your path, turns them into versions, and then removes the versions present in schema_migrations to check whether there are any pending
  • record_version_state_after_migrating actually removes/creates the rows for a completed up/down migration

I'm not sure I'd recommend structuring it the way Rails does (they rely on a lot of magic that we can't)... but is this pattern something you'd be OK with in general?

@abscondment
Copy link
Author

Here's another example from Elixir's ecto library.

@mattes
Copy link
Owner

mattes commented May 21, 2017

You are right. Rails actually does that and I'm super surprised. How does it handle conflicting migrations then?

@abscondment
Copy link
Author

The same way code conflicts are handled: the developer doing the merge has to correct the problem.

In my experience, the "skipped compatible-but-out-of-sequence migrations" scenario happens fairly often, and the "conflicting migrations" scenario requires merge work from the developer either way.

@mattes
Copy link
Owner

mattes commented May 22, 2017

If it's independent files/migrations merge conflicts seem unlikely, I guess.

@mattes
Copy link
Owner

mattes commented May 22, 2017

Behavior A (current):

Current version is 2.

1_migration.up.sql
2_migration.up.sql
3_migration.up.sql
4_migration.up.sql

migrate up will apply 3_migration.up.sql and 4_migration.up.sql

Behavior B (suggested):

Current version is 2 but 1_migration.up.sql wasn't applied yet.

1_migration.up.sql
2_migration.up.sql
3_migration.up.sql
4_migration.up.sql

migrate up will apply 1_migration.up.sql, 3_migration.up.sql, 4_migration.up.sql.

@mattes
Copy link
Owner

mattes commented May 22, 2017

@johnweldon
Copy link
Contributor

I would consider the proposed behaviour to be a new feature. As a new feature I'd expect it to be explicitly toggled either by a flag or config setting, or maybe a different command, apply for example. (migrate apply 4 to apply all needed migrations to get to 4, using up and down migrations as needed. E.g. migrate apply 3 to apply the up for 1, 2, and 3, and then the down for 4)

@ngauthier
Copy link
Contributor

Hello! I don't use migrate any more, I wrote a bash script to perform migrations. My approach there is similar to rails, where the migration is given a timestamp as its identifier prefix (along with the name). This is very similar to rails (which I have done for a long time).

It's very nice because during parallel branch development, you don't have conflicts, and as mentioned before in this thread, it's up to the devs to handle conflicts at merge-time. The CI system should catch migrations that conflict.

When I run the migrations, there are done alphabetically (because they are timestamps). If there happened to be an (incredible!) conflict of the timestamp, alphabetical sort uses the migration names as a tiebreaker.

Similar to what was mentioned here, I use a "migrations" table that has a row for each migration performed which uses the the full name (timestamp and migration name) as primary key. So the "up" operation is "run all migrations not in the table in alphabetical order".

I'm happy to share a snapshot of the current version of my migrate script as inspiration:

https://gist.github.com/ngauthier/2fbfcec2697b835a44f272b04c95ec3f

Note that it does have a hardcoded default database tied to the project it's in (line 7) that you would want to remove if you feel like using it.

@bt
Copy link

bt commented May 30, 2017

I also think that storing the the name of the migration in the database is a better approach than matching version numbers.

In my current use-case, I have migrations that will set up the database. Additionally, I have another set of migrations called seeds that I use either for setting up a staging application or for integration testing.

With the current way that migrate works, because my seeds are not necessarily dated after my default migrations, it won't let me run the migrations on the seed scripts. On top of that, in fact, in my current situation, (after an hour of debugging) I noticed that I can't even get migrate to work because in the readUp function, it will first try to find the last migrated version and therefore will fail because the said migration wouldn't exist in the seeds folder.

In the mean time, I'll be using an approach similar to @ngauthier. If this is something that @mattes is willing to accept as a feature request, I'll be happy to devote some of my time refactoring.

@lgonzalez-silen
Copy link

fwiw, this is the main reason we are using the v1 gemnasium fork. Multiple developers creating migrations in their branches and merging at different times was a pita.

(Migration identifiers as a sequence were also a big issue necessitating constant renames at the time we switched, but good to see that was solved by using timestamps)

Would love to be able to use v3 and only load the database we are using, but we are not able to go back to enforcing timestamps for new merges to be the new highwater mark.

@gravis
Copy link
Contributor

gravis commented Jul 13, 2017

Be careful with this fork, as we don't maintain it anymore.
It has been deprecated in favor of https://github.com/db-journey/
We move the repo to an org because we don't want to include every single available driver.

@nexdrew
Copy link

nexdrew commented Aug 11, 2017

I'm +1 for this change as well. This is how a lot of other db migration systems work e.g. Flyway.

"Well, why not use Flyway?" you ask - because I don't currently use a JVM and I really like the single binary distribution of migrate. (I also need support for CockroachDB.)

@josephbuchma
Copy link

+1 for this feature.
btw, https://github.com/db-journey/ looks very promising

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

9 participants