[RFC 0155]: NixOS Migrations #155
This refers to the Sovereign Tech Fund grants that support sustainable open source development. @Fresheyeball proposed applying with this project. I also think that would be very valuable for the ecosystem, but due to its far-reaching nature it seems sensible to first make sure the design is agreeable before planning implementation.
I have discussed solutions to this problem at length, notably in the ensure-style options issue. While I appreciate the effort, I feel this proposal is not ambitious enough and will cause more churn in the long run, as it won't be able to address the core issue: restoring NixOS's ability to roll back, even from data, in some scenarios. I don't know what the timeline for the Sovereign Tech Fund grant is, but I can spend some time explaining my full vision and the steps to take. There is also some academic research on the subject, which I am doing, if people are interested.
It would be a great start if we used this RFC as a place to compile everything we know and then go from there.
May I recommend using Discourse and a Nixpkgs tracking issue instead? Overly long discussions in GitHub comments are a pain to use and refer to. RFC discussions are particularly annoying because they incentivize long threads (due to deep topics) while disincentivizing splitting into multiple threads that would keep things manageable. A tracking issue allows for branching off and summarizing back. The same goes for discussions on Discourse.
I'd like to highlight #138, which proposes using repositories for RFC development and discussion instead, which would improve this kind of thing.
Going from the current state of manual migrations and a global `stateVersion` straight to automatic migrations sounds a bit dangerous. I'd rather have a plan to just move to a per-service `stateVersion`-like approach first, which would fix the problem of a global `stateVersion`, be fairly straightforward, and safe.
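
To make the per-service `stateVersion` idea concrete, here is a minimal sketch of what such an option could look like. The module name `services.my-service`, its options, the version numbers, and the two data directories are all hypothetical; only `system.stateVersion`, `lib.mkOption`, and `lib.versionAtLeast` are existing Nixpkgs names.

```nix
# Hypothetical module: `services.my-service` is an illustrative name,
# not an existing NixOS option.
{ config, lib, ... }:

let
  cfg = config.services.my-service;
in
{
  options.services.my-service = {
    enable = lib.mkEnableOption "my-service";

    # Per-service analogue of the global `system.stateVersion`.
    stateVersion = lib.mkOption {
      type = lib.types.str;
      # Assumption: fall back to the global value so existing configs keep working.
      default = config.system.stateVersion;
      defaultText = lib.literalExpression "config.system.stateVersion";
      example = "23.05";
      description = "Release whose on-disk state layout this service's data follows.";
    };
  };

  config = lib.mkIf cfg.enable {
    # Stateful defaults depend only on the per-service version, so bumping
    # it for one service does not affect any other service.
    systemd.services.my-service.environment.MY_SERVICE_DATA_DIR =
      if lib.versionAtLeast cfg.stateVersion "23.11"
      then "/var/lib/my-service"
      else "/var/db/my-service";
  };
}
```

With something like this, a user could move one service to a new state layout (after migrating its data by hand) without touching the global `system.stateVersion`.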
This RFC is now open for shepherd nominations!
I think it might be a good idea to add an additional field of
# Down Script

A place in the Nix Module system for describing the imperative steps required to migrate from the current version to the previous version. This will run as a systemd "oneshot" service, to take advantage of the standard architecture.

```Nix
{
  down.script = "my-service.up" ''
    ${pkgs.my-service-cli} --run-migration /etc/service-state
  '';
}
```
What about things that don't have a way to downgrade? Not everyone provides scripts for migrating to an old version.
We could link a backup of the current state to the corresponding generation when rebuilding and get it back that way. Yes, this would lose new data, but it would be good enough to test whether the migration worked in $userEnv and roll back if it didn't.
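
As a rough illustration of the backup-per-generation idea, the sketch below snapshots a service's state directory before the service starts and names the snapshot after the system store path that was active at the time. Everything specific here is an assumption: the unit names, `/var/lib/my-service`, and the choice to key backups by the target of `/run/current-system`. It uses a plain `cp -a`, which is only appropriate for small state directories.

```nix
# Hypothetical unit names and paths; a sketch, not a proposed implementation.
{ pkgs, ... }:

{
  systemd.services.my-service-state-backup = {
    description = "Snapshot my-service state for the generation being activated";
    # Run whenever my-service is started, and finish before it does.
    wantedBy = [ "my-service.service" ];
    before = [ "my-service.service" ];
    path = [ pkgs.coreutils ];
    serviceConfig = {
      Type = "oneshot";
      # Creates /var/lib/my-service-backups.
      StateDirectory = "my-service-backups";
    };
    script = ''
      # Key the backup by the store path of the active system closure.
      generation="$(basename "$(readlink -f /run/current-system)")"
      target="/var/lib/my-service-backups/$generation"

      # Take at most one snapshot per generation, and only if state exists.
      if [ ! -e "$target" ] && [ -d /var/lib/my-service ]; then
        cp -a /var/lib/my-service "$target"
      fi
    '';
  };
}
```

Rolling back would then mean stopping the service and copying the matching snapshot back, losing any data written since, as noted above.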
> not everyone provides scripts for migrating to an old version

Even worse, not everyone provides instructions for it. Honestly, I'd be more surprised to learn that somebody does. There are upgrade guides for software, but almost never downgrade guides, which means that it's left as an exercise for the reader, and that ends up multiplying the workload tenfold.
This RFC has not acquired enough shepherds. This typically shows a lack of interest from the community. In order to progress, a full shepherd team is required. Consider trying to raise interest by posting in Discourse, talking in Matrix or reaching out to people that you know. If not enough shepherds can be found in the next month we will close this RFC until we can find enough interested participants. The PR can be reopened at any time if more shepherd nominations are made.
Nice work on the RFC!
That said, I'm not sure if actually implementing it is feasible. The whole area surrounding migrations is usually quite hand-wavy and unspecified, and I'm not sure Nixpkgs maintainers can provide a good UX if the developers of the software didn't (as is evident from them not having written a migration script).

> Nix can even be an obstacle to doing so, as often these steps are dissonant with the methodologies required for declarative and immutable structures

I think this is closer to the actual problem that can be solved. Perhaps we can better document how to upgrade services given the NixOS constraints.

> I'd rather have a plan to just move to a per-service stateVersion-like approach first, which would fix the problem of a global stateVersion, be fairly straightforward, and safe.

Yes please!

On a less related note: this RFC explores the concept of NixOS managing state too. I like having that direction explored, and I wonder if there are people who want NixOS to do more state lifecycle management (things like Impermanence, maybe even segregation of state into different buckets, suggesting backups etc.). I certainly would like to be on the receiving end of it (and would not like to be on the implementing end of it lol).
```Nix
{
  up.warn = ''
I don't think a warning is sufficient here. It would be nice if there were an "immediately stop and force the user to read the warning and confirm that they accept the risk" mechanism in Nixpkgs, but there is none. So I think a better way would be to throw an error outright and provide a way to bypass it with something like `forceMigration = true;` in the config.
This is kinda like the thing with unfree software, in a way. We REALLY don't want to do anything potentially problematic, so it's better to be left opt-in.
}
```

`$BACKUP` is a path to a temporary filesystem location which will be deleted upon completion of the migration. This provides a temporary location to back up state if needed. `up.backup` is a hook that will run before `up.script`. If `up.script` encounters an error, `up.restore` is run to ensure that the failed migration does not result in contamination of the system. These options are available in both `up` and `down` definitions. This roughly allows for transaction-like logic for the migration.
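
The backup, script, and restore sequence described above can be approximated today with a single systemd oneshot unit. The sketch below is only that, an approximation under stated assumptions: the unit name, the state directory, the use of `/var/tmp` for `$BACKUP`, and the `bin/my-service-cli` path inside the RFC's placeholder package `pkgs.my-service-cli` are all hypothetical, and the three phases are inlined in one script rather than exposed as separate `up.*` options.

```nix
# Sketch only: `pkgs.my-service-cli` is the RFC's placeholder package and
# does not exist in Nixpkgs; paths and unit names are assumptions.
{ pkgs, ... }:

let
  stateDir = "/var/lib/my-service";
in
{
  systemd.services.my-service-migrate-up = {
    description = "Migrate my-service state to the new version";
    # The service may only start once the migration has succeeded.
    before = [ "my-service.service" ];
    requiredBy = [ "my-service.service" ];
    serviceConfig.Type = "oneshot";
    script = ''
      # Phase 1 (up.backup): copy the state to a temporary on-disk location.
      BACKUP="$(mktemp -d /var/tmp/my-service-migration.XXXXXX)"
      cp -a ${stateDir} "$BACKUP/state"

      # Phase 2 (up.script); on failure run phase 3 (up.restore) and fail.
      if ! ${pkgs.my-service-cli}/bin/my-service-cli --run-migration ${stateDir}; then
        rm -rf ${stateDir}
        cp -a "$BACKUP/state" ${stateDir}
        rm -rf "$BACKUP"
        exit 1
      fi

      # Success: the temporary backup is deleted, as the RFC text describes.
      rm -rf "$BACKUP"
    '';
  };
}
```

Because the unit is `requiredBy` the service, a failed migration leaves the restored old state in place and prevents the new version from starting on top of it.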
> `$BACKUP` is a path to a temporary filesystem location

I really hope you don't mean it's on `tmpfs`. I think a different wording should be used here.
Regardless, I don't think it's a good idea to clear backups as part of a script, even if the script ran successfully. There might be an error in the logic for all you know. Lastly, we REALLY need this to be as transactional as possible, though thankfully there's a lot of know-how in the Nix space around that.
# Testing

Extend the NixOS VM Test framework to ergonomically test migrations in an automated fashion. Migrations should be accompanied by VM tests demonstrating that migrations succeed from a clean service state.
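
For a sense of what such a test could look like with the existing framework, here is a minimal sketch built on `pkgs.testers.runNixOSTest` and a specialisation that stands in for the upgraded generation. The two units and the `schema` file are hypothetical stand-ins for a real service and its migration, and the sketch assumes `switch-to-configuration` is available in the test VM.

```nix
# Sketch of a migration test; the units below are toy stand-ins.
{ pkgs ? import <nixpkgs> { } }:

pkgs.testers.runNixOSTest {
  name = "my-service-migration";

  nodes.machine = { ... }: {
    # Old generation: initialise state at schema version 1.
    systemd.services.my-service-init = {
      wantedBy = [ "multi-user.target" ];
      serviceConfig = {
        Type = "oneshot";
        RemainAfterExit = true;
        StateDirectory = "my-service";
      };
      script = ''
        [ -e /var/lib/my-service/schema ] || echo 1 > /var/lib/my-service/schema
      '';
    };

    # New generation: adds a hypothetical "up" migration as a oneshot unit.
    specialisation.upgraded.configuration = {
      systemd.services.my-service-migrate-up = {
        wantedBy = [ "multi-user.target" ];
        after = [ "my-service-init.service" ];
        serviceConfig.Type = "oneshot";
        script = ''
          echo 2 > /var/lib/my-service/schema
        '';
      };
    };
  };

  testScript = ''
    machine.wait_for_unit("my-service-init.service")
    machine.succeed("grep -qx 1 /var/lib/my-service/schema")

    # Switch to the "upgraded" generation and check that the state was migrated.
    machine.succeed("/run/current-system/specialisation/upgraded/bin/switch-to-configuration test")
    machine.succeed("grep -qx 2 /var/lib/my-service/schema")
  '';
}
```

This only exercises a migration from clean state, which is exactly the case the RFC text asks for.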
While that's eons better than nothing, the problem with migrations isn't really when your service state is clean; it's when it's not. And I'm afraid we can't really test for that.
@KFearsoff are you interested in being a shepherd for this RFC? Also a reminder that this week is NixCon, a great time for everyone attending to talk to potential shepherds in person.
While the current NixOS Configuration system works incredibly well for immutable services, not all services are immutable, and we currently lack facilities for handling the imperative steps needed to upgrade or downgrade complex stateful services. Nix can even be an obstacle to doing so, as often these steps are dissonant with the methodologies required for declarative and immutable structures.

Let's take the example of GitLab. GitLab is notoriously hard to upgrade, and the current NixOS module system produces major obstacles to upgrades of this nature, as GitLab expects the user to run many imperative steps to modify stateful parts of the application such as the database and configuration files. The need to migrate stateful portions of an application to new versions is nothing new; database migrations are standard practice and can provide structure and inform the concerns of module migrations.
As a GitLab package/module maintainer, I can say that GitLab is a bad example. The things said in this paragraph are not generally true for GitLab.
PostgreSQL major upgrades would be a good example.
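
To illustrate why PostgreSQL major upgrades fit the problem statement, here is a sketch of the imperative steps involved, loosely modelled on the upgrade recipe in the NixOS manual. The target version `postgresql_16`, the script name, and the new data directory layout are assumptions for illustration; the script also assumes `sudo` is available. It must be run by hand, and the configuration then switched to the new package and data directory.

```nix
# Sketch: a manually run helper for a PostgreSQL major-version upgrade.
# The target version and script name are illustrative choices.
{ config, lib, pkgs, ... }:

let
  cfg = config.services.postgresql;
  newPostgres = pkgs.postgresql_16;
in
{
  environment.systemPackages = [
    (pkgs.writeShellScriptBin "upgrade-pg-cluster" ''
      set -eux
      # The old cluster must be stopped while pg_upgrade runs.
      systemctl stop postgresql

      export NEWDATA="/var/lib/postgresql/${newPostgres.psqlSchema}"
      export NEWBIN="${newPostgres}/bin"
      export OLDDATA="${cfg.dataDir}"
      export OLDBIN="${cfg.package}/bin"

      install -d -m 0700 -o postgres -g postgres "$NEWDATA"
      cd "$NEWDATA"

      # Initialise an empty cluster for the new major version ...
      sudo -u postgres "$NEWBIN/initdb" -D "$NEWDATA" ${lib.escapeShellArgs cfg.initdbArgs}

      # ... then migrate the old cluster's data into it.
      sudo -u postgres "$NEWBIN/pg_upgrade" \
        --old-datadir "$OLDDATA" --new-datadir "$NEWDATA" \
        --old-bindir "$OLDBIN" --new-bindir "$NEWBIN" \
        "$@"
    '')
  ];
}
```

None of these steps can currently be expressed declaratively in the module system, which is the gap the RFC is trying to address.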
## Up Script

A place in the Nix Module system for describing the imperative steps required to migrate from the previous version to the new version. This will be run as a systemd "oneshot" service, to take advantage of the standard architecture.
I would prefer if migrations were not triggered fully automatically. For nixos-unstable users, for example, it could be important to make a backup and supervise the process.
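
One way to keep migrations opt-in, in the spirit of the `forceMigration = true;` bypass suggested elsewhere in this thread, is an evaluation-time assertion. The sketch below is hypothetical throughout: `services.my-service`, its `stateVersion` and `forceMigration` options, and the version `24.05` are invented for illustration; `assertions`, `lib.mkOption`, and `lib.versionOlder` are existing NixOS and Nixpkgs names.

```nix
# Hypothetical module: refuse to build a configuration that would migrate
# state unless the user has explicitly acknowledged it.
{ config, lib, ... }:

let
  cfg = config.services.my-service;
  # Assumption: the state layout the packaged version expects.
  currentStateVersion = "24.05";
  migrationPending = lib.versionOlder cfg.stateVersion currentStateVersion;
in
{
  options.services.my-service = {
    enable = lib.mkEnableOption "my-service";

    stateVersion = lib.mkOption {
      type = lib.types.str;
      description = "Release whose state layout the existing data follows.";
    };

    forceMigration = lib.mkOption {
      type = lib.types.bool;
      default = false;
      description = "Acknowledge that rebuilding will migrate my-service state.";
    };
  };

  config = lib.mkIf cfg.enable {
    assertions = [
      {
        # Evaluation fails with this message instead of printing a warning.
        assertion = !migrationPending || cfg.forceMigration;
        message = ''
          my-service state at version ${cfg.stateVersion} must be migrated to
          ${currentStateVersion}. Back up your data, then set
          services.my-service.forceMigration = true; to proceed.
        '';
      }
    ];
  };
}
```

A failed assertion stops `nixos-rebuild` before anything is activated, which gives the user a chance to make a backup and supervise the process.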
@fricklerhandwerk is this on hold due to grant status? This RFC does not yet have enough shepherds.
As the author, @Fresheyeball should determine whether he has time to drive the process and how to proceed.
This RFC is being closed due to lack of interest. If enough shepherds are found this issue can be reopened. If you don't have permission to reopen, please open an issue for the NixOS RFC Steering Committee linking to this PR.
This pull request has been mentioned on NixOS Discourse. There might be relevant details there:
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/what-about-state-management/37082/2 |
Making this an RFC as requested by @fricklerhandwerk and after doing a review with @roberth. This project is intended for Sovereign Tech Fund grants.