Add concept of "staged" deployment #1503

Closed
wants to merge 1 commit into from

cgwalters
Member

Add API to write a deployment state to /run/ostree/staged-deployment,
along with a systemd service which runs at shutdown time.

Just compile tested, still WIP.

For: coreos/rpm-ostree#40

Member

@jlebon jlebon left a comment

Only did a high-level overview so far, but looks nice! 👍

* ostree_sysroot_stage_tree() API.
*/
gboolean
ostree_sysroot_deploy_tree (OstreeSysroot *self,
Member

Maybe let's start by splitting out this initialize + finalize separation in a prep PR?

@rh-atomic-bot

☔ The latest upstream changes (presumably 01717d7) made this pull request unmergeable. Please resolve the merge conflicts.

@rh-atomic-bot

💥 Invalid .papr.yml: file could not be parsed as valid YAML.

@cgwalters
Member Author

@rh-atomic-bot

☔ The latest upstream changes (presumably 22cd178) made this pull request unmergeable. Please resolve the merge conflicts.

@cgwalters
Member Author

TODO:

  • Clean up the story around kargs
  • Determine whether staged deployment is part of _get_deployments(). Tradeoffs both ways. If not (as currently) then all tools will need to learn to query for it.
  • tests

@rh-atomic-bot

☔ The latest upstream changes (presumably 1f3f657) made this pull request unmergeable. Please resolve the merge conflicts.

@rh-atomic-bot

☔ The latest upstream changes (presumably 7ec3d06) made this pull request unmergeable. Please resolve the merge conflicts.

@cgwalters
Member Author

Sigh, I needed to move it into the main /usr/bin/ostree, as that's labeled install_exec_t, which is what we need to do deployments.

Contributor

@peterbaouoft peterbaouoft left a comment

Looks really nice, learned a lot from just reading it :D. Some minor comments and questions :p.

@@ -2388,6 +2398,7 @@ ostree_sysroot_deploy_tree (OstreeSysroot *self,
return FALSE;

_ostree_deployment_set_bootcsum (new_deployment, kernel_layout->bootcsum);
_ostree_deployment_set_bootconfig_from_kargs (new_deployment, override_kernel_argv);
Contributor

Hmm, this also seems to be done on lines #2404 to #2417 (the block right below)?

Member Author

Yep, that was all duplicate code. Fixed, thanks!

cancellable, error))
return FALSE;
}

/* Don't fsync here, as we assume that's all done in
* ostree_sysroot_write_deployments().
Contributor

Minor: do we still want to keep this comment for non-staged case? /me assuming we are still supporting "non-staged" deployments (the old 3 way merge way)?

g_variant_builder_add (builder, "{sv}", "kargs",
g_variant_new_strv ((const char *const*)override_kernel_argv, -1));

char *dnbuf = strdupa (_OSTREE_SYSROOT_RUNSTATE_STAGED);
Contributor

Question: I am not too familiar with "/run" directories. Are there any reasons / benefits to storing data there? :-) /me assuming we are writing deployment info into that location.

Member Author

The basic idea of /run is it's guaranteed to go away if the system reboots (it's a tmpfs, stored only in memory). Which I think helps simplify the logic around failure cases. I feel like we only want to try to process staging data once rather than having it be persistent.
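
To make the "process it only once" idea concrete, here is a minimal sketch in plain C. It is not the code from this PR: the helper name `consume_staged_marker` and the choice to remove the file as the "claim" step are assumptions for illustration only. On a real system the path would be `/run/ostree/staged-deployment`, and since /run is a tmpfs the file would also vanish on reboot without any cleanup code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper, not part of libostree.
 * Returns true if a staged-state file existed at marker_path and was
 * removed, meaning the caller should now run finalization exactly once.
 * Returns false if nothing was staged (or the removal failed). */
static bool
consume_staged_marker (const char *marker_path)
{
  assert (marker_path != NULL);

  /* unlink() both detects the file and removes it in a single call,
   * so a second invocation sees ENOENT and does nothing. */
  if (unlink (marker_path) == 0)
    return true;

  /* ENOENT just means nothing was staged; report anything else. */
  if (errno != ENOENT)
    perror ("unlink");
  return false;
}
```

A real implementation would need to read the serialized deployment state before removing the file; the sketch only shows the at-most-once behavior.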

GCancellable *cancellable,
GError **error)
{
/* The service has a ConditionPathExists=/run/ostree/staged-deployment */
Contributor

Minor: seems like /run/ostree/staged-deployment is not there in the service file. Only ConditionPathExists=/run/ostree-booted seems to be there?

Member Author

That's not minor, that's a bug! The service would fail if there were no staged deployments. Fixed.


/* TODO: Proxy across flags too? */
OstreeSysrootSimpleWriteDeploymentFlags flags = 0;
if (!ostree_sysroot_simple_write_deployment (self, ostree_deployment_get_osname (self->staged_deployment),
Contributor

Question/minor: I noticed that in rpm-ostree we have rpmostree_syscore_write_deployment, which adds some flags for livefs and rollback. Do we need to account for that here to make it easier to integrate this into rpm-ostree in the future? (if needed :p)

Member Author

I think you're right that integrating this nicely will require pulling down some of that logic.

if (self->staged_deployment)
{
char *deployment_path = ostree_sysroot_get_deployment_dirpath (self, self->staged_deployment);
g_hash_table_replace (active_deployment_dirs, deployment_path, deployment_path);
Contributor

Question: wanted to confirm, we don't include bootcsum here because the "staged deployment" is not ready for boot yet? (i.e. the merge and boot entries weren't ready)

Member Author

Nope, that was an oversight. One not revealed because we weren't installing a staged deployment with a new boot checksum.

I have kind of been going back and forth on whether /boot should be modified during staging or finalization. Currently it's during staging.

@cgwalters cgwalters removed the WIP label Apr 6, 2018
@cgwalters cgwalters changed the title WIP: Add concept of "staged" deployment Add concept of "staged" deployment Apr 6, 2018
@cgwalters cgwalters force-pushed the deployment-pending branch 2 times, most recently from a81672c to f56d41d Compare April 7, 2018 15:52
# https://lists.freedesktop.org/archives/systemd-devel/2018-March/040557.html
[Unit]
Description=OSTree Deploy Staged
ConditionPathExists=/run/ostree-booted
Member

I see we do tolerate no staged deployments in _ostree_sysroot_deploy_staged, though we might as well short-circuit this from here and add a ConditionPathExists=/run/ostree/staged-deployment here too, no? Making it explicit also makes it easier to discover how these pieces fit together.

/after reading more code

Oh right, this won't work with the multi-user.target approach since it's evaluated on startup. Though it should work if we use shutdown.target, no? (Related to next comment).

Member Author

Makefile-boot.am Outdated
@@ -39,7 +39,7 @@ endif

if BUILDOPT_SYSTEMD
systemdsystemunit_DATA = src/boot/ostree-prepare-root.service \
-src/boot/ostree-remount.service
+src/boot/ostree-remount.service src/boot/ostree-deploy-staged.service
Member

This is slightly bikeshed, though it feels like it would be more appropriate to name the service something like ostree-finalize-staged.service. (And the private function _ostree_sysroot_finalize_staged).

My reasoning is that a staged deployment is in some ways already deployed, right? And I see the rest of the new APIs/CLIs reflect that as well (e.g. ostree_sysroot_get_staged_deployment, and ostree admin deploy --stage).

[Service]
Type=oneshot
RemainAfterExit=yes
ExecStop=/usr/bin/ostree admin deploy-staged
Member

Hmm, this seems odd to me. I'm guessing the idea here is that the Before=multi-user.target will ensure that most services will have been shut down, assuming services are stopped in reverse order. Is this the canonical way to run on shutdown? Why not WantedBy=shutdown.target?

@@ -808,6 +809,7 @@ static gboolean
write_origin_file_internal (OstreeSysroot *sysroot,
OstreeSePolicy *sepolicy,
OstreeDeployment *deployment,
gboolean if_not_exists,
Member

I'm confused why we need this. Why not just unconditionally overwrite it in finalize? Then even in the stage path, we're sure of using the right label for it post-merge.

static GVariant *
serialize_deployment_to_variant (OstreeDeployment *deployment)
{
g_autoptr(GVariantBuilder) builder = g_variant_builder_new ((GVariantType*)"a{sv}");
Member

Should we stick with g_auto where we can for good form?

g_autoptr(GVariantBuilder) builder = g_variant_builder_new ((GVariantType*)"a{sv}");
g_autofree char *name =
g_strdup_printf ("%s.%d", ostree_deployment_get_csum (deployment),
ostree_deployment_get_deployserial (deployment));
Member

How about keeping those separate to make deconstruction lower down easier?

Member Author

Eh, we already need to parse them from the filesystem and had a function for it.

g_variant_new_strv ((const char *const*)override_kernel_argv, -1));

char *dnbuf = strdupa (_OSTREE_SYSROOT_RUNSTATE_STAGED);
const char *parent = dirname (dnbuf);
Member

Minor: any reason not to collapse these two lines? (i.e. dirname (strdupa (...)))

g_autoptr(OstreeDeployment) deployment = NULL;
if (!sysroot_initialize_deployment (self, osname, revision, origin, override_kernel_argv,
&deployment, cancellable, error))
return FALSE;
Member

Hmm, I wonder here if we should do something if there's a deployment already staged. E.g. in rpm-ostree, if we have a staged auto-updates policy, we could just continuously be staging newer updates as we see them right? I guess we can put the cleanup there, though it does feel like someone calling ostree_sysroot_stage_tree expects it to implicitly delete a previously staged deployment, rather than having them pile up.
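
The "implicitly delete a previously staged deployment" behavior suggested here can be sketched with a toy model in plain C. Everything below (`ToySysroot`, `toy_stage_tree`, the `n_discarded` counter) is hypothetical and only illustrates the replace-on-stage semantics; it is not how libostree represents or cleans up deployments.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a sysroot that tracks at most one staged deployment. */
typedef struct {
  char *staged_csum;      /* NULL when nothing is staged */
  unsigned n_discarded;   /* how many earlier staged entries were dropped */
} ToySysroot;

/* Stage a new checksum, replacing (and counting) any previous one so
 * that repeated staging never lets entries pile up. */
static void
toy_stage_tree (ToySysroot *self, const char *csum)
{
  assert (self != NULL && csum != NULL);

  /* Copy first, so passing self->staged_csum itself remains safe. */
  char *copy = malloc (strlen (csum) + 1);
  assert (copy != NULL);
  strcpy (copy, csum);

  if (self->staged_csum != NULL)
    {
      free (self->staged_csum);
      self->n_discarded++;
    }
  self->staged_csum = copy;
}
```

With this shape, a caller that keeps staging newer updates always ends up with exactly one staged entry, which is the expectation described in the comment above.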

if (self->ostree_booted && self->root_is_sysroot
&& !self->booted_deployment)
const gboolean root_is_ostree_booted =
self->ostree_booted && self->root_is_sysroot;
Member

At this point, we might as well replace instances of those with a new field.

if (opt_retain)
flags |= OSTREE_SYSROOT_SIMPLE_WRITE_DEPLOYMENT_FLAGS_RETAIN;
else
if (opt_stage)
{
Member

Let's just error out here if any of the --retain opts are given?
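
As a rough illustration of erroring out early, the check could look like the following plain-C sketch. The `DeployOpts` struct and `deploy_opts_conflict` function are hypothetical names, and only the two options visible in the diff above (`opt_stage`, `opt_retain`) are modeled; the real command would report the problem through its GError path rather than a string return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of the two command-line options under discussion. */
typedef struct {
  bool stage;    /* --stage */
  bool retain;   /* --retain */
} DeployOpts;

/* Returns NULL when the combination is acceptable, otherwise a static
 * message describing the conflict. */
static const char *
deploy_opts_conflict (const DeployOpts *opts)
{
  assert (opts != NULL);

  if (opts->stage && opts->retain)
    return "--stage cannot be combined with --retain";
  return NULL;
}
```

Rejecting the combination up front keeps the staged path from silently ignoring a flag it does not honor.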

@cgwalters
Member Author

Pushed a fixup ⬆️ which addresses a lot of the comments (but not all). In the meantime, I split out the prep patches into #1535.

@rh-atomic-bot

☔ The latest upstream changes (presumably b9fc3ea) made this pull request unmergeable. Please resolve the merge conflicts.

Add API to write a deployment state to `/run/ostree/staged-deployment`,
along with a systemd service which runs at shutdown time.

This is a big change to the ostree model for hosts,
but it closes a longstanding set of bugs; many, many people have
hit the "losing changes in /etc" problem.  It also avoids
the other problem of racing with programs that modify `/etc`
such as LVM backups:
https://bugzilla.redhat.com/show_bug.cgi?id=1365297

We need this in particular to go to a full-on model for
automatically updated host systems where (like a dual-partition model)
everything is fully prepared and the reboot can be taken
asynchronously.

Closes: ostreedev#545
@jlebon
Member

jlebon commented Apr 12, 2018

Let's do this!
@rh-atomic-bot r+ 7adfe24

@rh-atomic-bot

⌛ Testing commit 7adfe24 with merge eb506c7...

@rh-atomic-bot

☀️ Test successful - status-atomicjenkins
Approved by: jlebon
Pushing eb506c7 to master...

cgwalters added a commit to cgwalters/ostree that referenced this pull request Apr 13, 2018
Followup to: ostreedev#1503
After starting some more work on this in rpm-ostree, it is
actually simpler if the staged deployment just shows up in the list.

It's effectively opt-in today; down the line we may make it the default,
but I worry about breaking things that e.g. assume they can mutate
the deployment before rebooting and have `/etc` already merged.

There aren't that many things in libostree that iterate over the deployment
list.  The biggest change here is around the
`ostree_sysroot_write_deployments_with_options` API.  I initially
tried hard to support a use case like "push a rollback" while retaining
the staged deployment, but everything gets very messy because that
function truly is operating on the bootloader list.

For now what I settled on is to just discard the staged deployment;
down the line we can enhance things.

Where we then have some new gymnastics is around implementing
the finalization; we need to go to some effort to pull the staged
deployment out of the list and mark it as unstaged, and then pass
it down to `write_deployments()`.
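
The finalization step described in the last paragraph can be pictured with a toy model in plain C. The names `ToyDeployment` and `toy_take_staged` are hypothetical, and the sketch only locates the staged entry, clears its flag, and returns it so a caller could hand it to a writer; it does not reproduce libostree's actual data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy deployment record; "staged" marks the entry awaiting finalization. */
typedef struct {
  const char *csum;
  bool staged;
} ToyDeployment;

/* Find the staged entry in the list, mark it as unstaged, and return a
 * pointer to it. Returns NULL when no entry is staged. */
static ToyDeployment *
toy_take_staged (ToyDeployment *list, size_t n)
{
  assert (list != NULL || n == 0);

  for (size_t i = 0; i < n; i++)
    {
      if (list[i].staged)
        {
          list[i].staged = false;
          return &list[i];
        }
    }
  return NULL;
}
```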
rh-atomic-bot pushed a commit that referenced this pull request Apr 16, 2018
Followup to: #1503
After starting some more work on this in rpm-ostree, it is
actually simpler if the staged deployment just shows up in the list.

It's effectively opt-in today; down the line we may make it the default,
but I worry about breaking things that e.g. assume they can mutate
the deployment before rebooting and have `/etc` already merged.

There aren't that many things in libostree that iterate over the deployment
list.  The biggest change here is around the
`ostree_sysroot_write_deployments_with_options` API.  I initially
tried hard to support a use case like "push a rollback" while retaining
the staged deployment, but everything gets very messy because that
function truly is operating on the bootloader list.

For now what I settled on is to just discard the staged deployment;
down the line we can enhance things.

Where we then have some new gymnastics is around implementing
the finalization; we need to go to some effort to pull the staged
deployment out of the list and mark it as unstaged, and then pass
it down to `write_deployments()`.

Closes: #1539
Approved by: jlebon
LorbusChris pushed a commit to LorbusChris/ostree-spec that referenced this pull request Oct 23, 2018
Prep for ostreedev/ostree#1503 which will
add `/usr/lib/ostree/ostree-deploy-staged`.