Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

64-gce-disk-removal.rules: delete #51

Merged

Conversation

jlebon
Copy link
Contributor

@jlebon jlebon commented May 17, 2023

This rule tries to clean up mount points after their backing block devices have been detached.

However, it does not actually work on any recent systemd-based distro. It's been almost ten years now since systemd-udevd has been set up to run in a mount namespace[1]. (This has been replaced by PrivateMounts=yes in more recent versions.)

This means that the umount done here will have no effect on the system. I've verified this is the case in a Debian 11 instance on GCP where I detached a device that was mounted. (It's worth noting that the initial addition of this rule, which AFAICT is this commit[2], predates the systemd patch by a year which means that this rule did work in the beginning.)

Nowadays, mounting and unmounting should be done via systemd mount units. systemd knows to clean up stale mount entries if their underlying devices go away. If it really must be done from a udev rule, the systemd-mount command is recommended, but unmounting with this tool will not work for our purposes here since it assumes the device is still present.

Using mount interactively is fine, but then it's reasonable to expect users to also manually umount once they're done with it.

Also, note that the logic here is not quite correct. If umount fails (e.g. because the device wasn't actually mounted), the overall exit code of the shell command is nonzero, which systemd will log as a udev error like:

(udev-worker)[534]: vda2: Process '/bin/sh -c '/bin/umount -fl
/dev/vda2 && /usr/bin/logger -p daemon.warn -s WARNING: hot-removed
/dev/vda2 that was still mounted, data may have been corrupted''
failed with exit code 32.

For all the above reasons, drop this rule.

This rule tries to clean up mount points after their backing block
devices have been detached.

However, it does not actually work on any recent systemd-based distro.
It's been almost ten years now since `systemd-udevd` has been set up to
run in a mount namespace[[1]]. (This has been replaced by
`PrivateMounts=yes` in more recent versions.)

This means that the `umount` done here will have no effect on the
system. I've verified this is the case in a Debian 11 instance on GCP
where I detached a device that was mounted. (It's worth noting that the
initial addition of this rule, which AFAICT is this commit[[2]],
predates the systemd patch by a year which means that this rule did work
in the beginning.)

Nowadays, mounting and unmounting should be done via systemd mount
units. systemd knows to clean up stale mount entries if their underlying
devices go away. If it really must be done from a udev rule, the
`systemd-mount` command is recommended, but unmounting with this tool
will not work for our purposes here since it assumes the device is still
present.

Using `mount` interactively is fine, but then it's reasonable to expect
users to also manually `umount` once they're done with it.

Also, note that the logic here is not quite correct. If `umount` fails
(e.g. because the device wasn't actually mounted), the overall exit code
of the shell command is nonzero, which systemd will log as a udev error
like:

    (udev-worker)[534]: vda2: Process '/bin/sh -c '/bin/umount -fl
    /dev/vda2 && /usr/bin/logger -p daemon.warn -s WARNING: hot-removed
    /dev/vda2 that was still mounted, data may have been corrupted''
    failed with exit code 32.

For all the above reasons, drop this rule.

[1]: systemd/systemd@c2c13f2
[2]: GoogleCloudPlatform/compute-image-packages@502f06e
@google-oss-prow
Copy link

Hi @jlebon. Thanks for your PR.

I'm waiting for a GoogleCloudPlatform member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@zmarano
Copy link
Contributor

zmarano commented May 22, 2023

/ok-to-test

@zmarano
Copy link
Contributor

zmarano commented May 22, 2023

/hold

@zmarano
Copy link
Contributor

zmarano commented May 22, 2023

This change needs some investigation as these packages have to support a lot of old distros as well. One option for you is to just not include this udev rule in your package for Fedora CoreOS. We will have to investigate if this rule is actually needed in all other places.

@dustymabe
Copy link

Thanks @zmarano. That's reasonable.

@jlebon
Copy link
Contributor Author

jlebon commented May 23, 2023

Ack, thanks @zmarano. OOC, what are the oldest supported distros?

@zmarano
Copy link
Contributor

zmarano commented May 23, 2023

Technically it is still RHEL 6 (because of ELS)... but realistically it will be RHEL 7.

@jlebon
Copy link
Contributor Author

jlebon commented May 23, 2023

Technically it is still RHEL 6 (because of ELS)... but realistically it will be RHEL 7.

Thanks. RHEL 6 doesn't ship systemd, but if I had to guess, I'd say it's likely udevd is not executed in a mount namespace (so this rule would indeed have a effect). RHEL 7 ships systemd v219, which has the mentioned commit to make systemd-udevd use MountFlags=slave.

@dustymabe
Copy link

This change needs some investigation as these packages have to support a lot of old distros as well. One option for you is to just not include this udev rule in your package for Fedora CoreOS. We will have to investigate if this rule is actually needed in all other places.

Any updates here?

@dorileo
Copy link
Member

dorileo commented Aug 8, 2023

It actually looks good to me.

@google-oss-prow
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jlebon, zmarano

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@zmarano
Copy link
Contributor

zmarano commented Aug 8, 2023

/lgtm

1 similar comment
@dorileo
Copy link
Member

dorileo commented Aug 8, 2023

/lgtm

@dorileo
Copy link
Member

dorileo commented Aug 8, 2023

/retest

@dorileo
Copy link
Member

dorileo commented Aug 8, 2023

/unhold

@google-oss-prow google-oss-prow bot merged commit dd7bc60 into GoogleCloudPlatform:master Aug 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants