Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

overlay: add GCP udev rules #160

Merged
merged 1 commit into from
Sep 11, 2019

Conversation

miabbott
Copy link
Member

@miabbott miabbott commented Sep 7, 2019

The downstream BZ#1748638[0] requests the addition of udev rules for
nodes in GCP to support the dynamic provisioning of storage. The
rules live in the GoogleCloudPlatform/compute-image-packages repo[1]
and were most recently updated in Dec 2018. This commit is a copy of
the rules from the repo.

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1748638
[1] https://github.com/GoogleCloudPlatform/compute-image-packages/tree/master/packages/google-compute-engine/src/lib/udev/rules.d

@miabbott
Copy link
Member Author

miabbott commented Sep 7, 2019

Worth noting is that the BZ references one other udev rule (99-gce.rules) that does not appear in the repo referenced in the commit. I'm not sure where that rule comes from and/or how we could include it.

@bgilbert
Copy link
Contributor

bgilbert commented Sep 8, 2019

These rules aren't conditionalized for GCP, so they'd run on every platform. We should limit them to GCP. Here's how Container Linux does it.

@lucab
Copy link
Contributor

lucab commented Sep 9, 2019

Additionally, this would hijack filepaths that belong to other packages (In Fedora, that would be google-compute-engine-tools).

Have we considered splitting the binary package out instead of duplicating them? That would also help with maintenance/updates/ownership.

@miabbott
Copy link
Member Author

miabbott commented Sep 9, 2019

These rules aren't conditionalized for GCP, so they'd run on every platform. We should limit them to GCP.

Doh! I didn't bother to check; just assumed that they would behave like the Azure rules in #159. I'll rework this.

Have we considered splitting the binary package out instead of duplicating them? That would also help with maintenance/updates/ownership.

Similar to what you've proposed for the Azure rules?

@cgwalters
Copy link
Member

Additionally, this would hijack filepaths that belong to other packages (In Fedora, that would be google-compute-engine-tools).

That doesn't seem like a real problem to me; we're not supporting installing the agent on FCOS right? That said, it would be cleaner to have them come from a google-compute-engine-agent-core or something I guess.

@lucab
Copy link
Contributor

lucab commented Sep 9, 2019

Oh, definitely not something to stall this on, but more like whether a plan was in place for longer term (similar to Azure, yes).

@miabbott
Copy link
Member Author

miabbott commented Sep 9, 2019

I'd like to keep the existing rules as intact as possible, so would it be reasonable to scope them with ENV{ID_VENDOR}=="Google"? I haven't been able to test all of them with this change, but a simple add/remove of a disk seemed to trigger the correct rules.

Related: if the rules are changed as such, how does the copyright stanza have to change, if at all?

@cgwalters
Copy link
Member

Oh, definitely not something to stall this on, but more like whether a plan was in place for longer term (similar to Azure, yes).

Yep, understood. We're basically making the rules up as we go along, so to speak.

@bgilbert
Copy link
Contributor

@miabbott I'm not super-comfortable with the way the rules assign to ENV{ID_SERIAL} and ENV{ID_SERIAL_SHORT}, but yeah, I think that'd be okay.

We could add a 2019 Red Hat copyright statement, but AIUI there's not a strong need to do so.

@cgwalters
Copy link
Member

These rules aren't conditionalized for GCP, so they'd run on every platform. We should limit them to GCP. Here's how Container Linux does it.

It'd be safe to propose a patch upstream for this too, right?

@miabbott
Copy link
Member Author

Pushed a commit with additional scoping ⬆️

Thanks for the feedback so far!

The downstream BZ#1748638[0] requests the addition of udev rules for
nodes in GCP to support the dynamic provisioning of storage.  The
rules live in the `GoogleCloudPlatform/compute-image-packages` repo[1]
and were most recently updated in Dec 2018.

This commit is mostly a copy of those rules with additional scoping to
only run in the GCP environment.

[0] https://bugzilla.redhat.com/show_bug.cgi?id=1748638
[1] https://github.com/GoogleCloudPlatform/compute-image-packages/tree/master/packages/google-compute-engine/src/lib/udev/rules.d
@miabbott
Copy link
Member Author

Rebased to see if CI will be happy 🤞

@cgwalters
Copy link
Member

One thing I wanted to quickly verify is the vendor, and yep, LGTM:

walters@toolbox ~/s/g/o/installer> oc debug node/ci-ln--v4m6v-w-b-rmbmt.c.openshift-gce-devel-ci.internal
...
sh-4.2# chroot /host
...
sh-4.4# cat /sys/block/sda/device/vendor 
Google  
sh-4.4# 

@cgwalters cgwalters merged commit 8ab4ca8 into coreos:testing-devel Sep 11, 2019
#
# When a disk is removed, unmount any remaining attached volumes.

ACTION=="remove", SUBSYSTEM=="block", KERNEL=="sd*|vd*", ENV{ID_VENDOR}=="Google", RUN+="/bin/sh -c '/bin/umount -fl /dev/$name && /usr/bin/logger -p daemon.warn -s WARNING: hot-removed /dev/$name that was still mounted, data may have been corrupted'"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this rule. It means that we're behaving differently on GCP than on other platforms when devices are yanked from the system.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, in the end I don't think it's really specific to GCE, on at least AWS too one can force detach with the API.

And AIUI this occurs when e.g. higher level code like Kubernetes Persistent Volume management cleans up resources before it unmounts it on the OS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we just drop this file?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The Bugzilla that spawned this doesn't say exactly what went wrong...but presumably it was related to the naming, and not this hack to do the unmount. So we probably could drop it...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pinged the reporter in the BZ about the removal rules. We can carry them in RHCOS if needed, but I agree we could probably remove them for FCOS.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dustymabe pushed a commit to jbtrystram/fedora-coreos-config that referenced this pull request Apr 19, 2024
This reverts a change in a section heading, which broke an incoming
link from coreos#528.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants