-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cluster: drop DisableLocalStorageCapacityIsolation #3360
cluster: drop DisableLocalStorageCapacityIsolation #3360
Conversation
it was used to workaround a kubelet crash issue with rootless providers. The Kubelet seems to work fine now with localStorageCapacityIsolation enabled in a user namespace so drop the special handling. After this change, ephemeral storage can be used in a rootless cluster. Closes: kubernetes-sigs#3359 Signed-off-by: Giuseppe Scrivano <gscrivan@redhat.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: giuseppe The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should be identifying what changed, we know it was broken in the past, in which cases is it still broken? Some combination of docker or podman version and/or kubernetes version must be problematic still.
I see you looked for failures in #3359 but it's not clear on what dimension(s). |
I am running kind as rootless on Fedora 38. I am still digging in the old issue, but it refers Fedora 35 which is EOF. I'll try on a fresh machine with the default configuration and see what happens there |
it seems to work fine on Fedora 38 with the default configuration:
trying with 3 nodes:
|
I'm wondering if it was a kubelet or podman/runc/crun change, we've had to do things like this conditionally on version before from either the "node runtime" (podman/docker) or k/k. I do think we should merge this, but I'm hesitant to make cluster bringup entirely fail again versus this feature gap. |
any decisions on this PR? Do we accept the risk :-) or play safe and close it? IMO, cluster failures are bad and we should avoid them as much as possible, on the other hand, it was also not easy to find the root reason for this failure and why the rootless cluster was behaving differently. It took a while to figure out what was happening. Something that a hard failure would have made easier to fix. |
@BenTheElder ping |
The hard failure is NOT fixable though, short of upgrading something (??) which we don't seem to know yet. I'm hesitant to ship disabling a workaround without knowing what change in the ecosystem made it no longer required (e.g. if it's a particular k8s or podman or docker version then we can version gate this workaround), at the same time I understand the desire to disable this no longer necessary workaround FWIW: You should not expect parity in rootless yet. Rootless k8s has always had gaps AFAIK, kind or otherwise. |
since there is no interest in this change, I am closing the PR |
FTR: I think we should ship dropping DisableLocalStorageCapacityIsolation, but preferably we'd understand when it's safe to drop. I don't know if anyone ever dug into what the root issue was here. That said, if we can be confident that this will only be broken on really outdated hosts and will benefit the rest, we've just gone with requiring more up to date hosts before (like cgroupns=private #3311) |
The commit 968c842 that added I won't have time to dig further into this. If anyone is interested in taking this over (or just accepting the risk), I can reopen it. |
The However, in the end, #2524 turned out to be a duplicate (#2524 (comment)) of a different known issue. I guess the main question is, was ab38ef8 actually needed as part of #2559, or did it just look like it was needed? |
Hi @giuseppe: For me it is NOT working:
|
@damzog have you applied this patch? Does it work without it? |
@giuseppe no I have installed current version of kind and ran into the issue of not having ephemeral storage. Then I started searching for a reason and found this discussion. |
thanks for confirming it. I still believe it would be better to have the current patch as it is solving a real issue, and if any problems come up, it can be analyzed and maybe fix it in a different way |
You can set a kubeadmConfigPatch with something like this: kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
kubeadmConfigPatches:
- |
kind: KubeletConfiguration
localStorageCapacityIsolation: true And see how this change would work for you |
@BenTheElder Thanks, that actually worked. And sorry for being a newbie in this :) Maybe this issue would be worth to be added to the standard docs? |
Not at all. And FWIW the config patches are a pretty advanced feature not meant to be required for everyday usage versus tinkering with Kubernetes itself.
I think we need to revisit merging this change and inverting it to requiring the unknown broken environments to use a patch with I'd like to consider it for the release after the current pending one (to minimize disruption, we're overdue to cut one anytime now and already have a solid change set) |
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
@giuseppe: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
[rebased #3360] cluster: drop DisableLocalStorageCapacityIsolation
it was used to workaround a kubelet crash issue with rootless providers.
The Kubelet seems to work fine now with localStorageCapacityIsolation enabled in a user namespace so drop the special handling. After this change, ephemeral storage can be used in a rootless cluster.
Closes: #3359