
Use socket with unix:/// in CAPD. #6155

Closed
Tracked by #5968
sbueringer opened this issue Feb 16, 2022 · 20 comments · Fixed by #6169 or #6543
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt.

@sbueringer
Member

sbueringer commented Feb 16, 2022

We are currently using /var/run/containerd/containerd.sock in our CAPD configurations.

We should actually be using unix:///var/run/containerd/containerd.sock. The goal of this issue is to go through test/ and replace all occurrences.

P.S. there will be a warning with kubeadm v1.24 for paths which are not using a URL scheme. (xref: kubernetes/kubeadm#2426 (comment))
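For illustration, the change boils down to adding the unix:// scheme wherever the criSocket field is set (kubeadm v1beta3 field names shown; the exact files under test/ may differ):

```yaml
# Before: bare path, deprecated and warned about by kubeadm starting with v1.24
initConfiguration:
  nodeRegistration:
    criSocket: /var/run/containerd/containerd.sock

# After: the same path with an explicit unix:// scheme
initConfiguration:
  nodeRegistration:
    criSocket: unix:///var/run/containerd/containerd.sock
```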

/kind cleanup
/area provider-docker
/good-first-issue

[One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels]

@k8s-ci-robot
Contributor

@sbueringer: The label(s) area/provider-docker cannot be applied, because the repository doesn't have them.

In response to this:

We are currently using /var/run/containerd/containerd.sock in our CAPD configurations.

We should actually be using unix:///var/run/containerd/containerd.sock. Goal of this issue is to go over test/ and replace all occurrences.

P.S. there will be a warning with kubeadm v1.24 for paths without a scheme.

/kind cleanup
/area provider-docker
/good-first-issue

[One or more /area label. See https://github.com/kubernetes-sigs/cluster-api/labels?q=area for the list of labels]

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. labels Feb 16, 2022
@sbueringer
Member Author

@fabriziopandini I assume we don't have to do anything for upgrades as we don't do in-place upgrades:

during kubeadm upgrade, we should iterate all nodes in the cluster and patch "kubeadm.alpha.kubernetes.io/cri-socket"
kubernetes/kubeadm#2426

@stmcginnis
Contributor

I don't see anywhere that we are referencing this directly in the CAPD code:

$ rg socket test/*
test/go.sum
520:github.com/gorilla/websocket v0.0.0-20170926233335-4201258b820c/go.mod h1:E7qHFY5m1UJ88s3WnNqhKjPHQ0heANvMoAMk2YaljkQ=
521:github.com/gorilla/websocket v1.4.0/go.mod h1:E7qHFY5m1UJ88s3WnNqhKjPHQ0heANvMoAMk2YaljkQ=
522:github.com/gorilla/websocket v1.4.2/go.mod h1:YR8l580nyteQvAITg2hZ9XVh4b55+EU/adAjf1fMHhE=
883:github.com/tmc/grpc-websocket-proxy v0.0.0-20170815181823-89b8d40f7ca8/go.mod h1:ncp9v5uamzpCO7NfCPTXjqaC+bZgJeR0sMTm6dMHP7U=
884:github.com/tmc/grpc-websocket-proxy v0.0.0-20190109142713-0ad062ec5ee5/go.mod h1:ncp9v5uamzpCO7NfCPTXjqaC+bZgJeR0sMTm6dMHP7U=
885:github.com/tmc/grpc-websocket-proxy v0.0.0-20201229170055-e5319fda7802/go.mod h1:ncp9v5uamzpCO7NfCPTXjqaC+bZgJeR0sMTm6dMHP7U=

test/infrastructure/docker/Dockerfile
60:# NOTE: CAPD can't use non-root because docker requires access to the docker socket

I think with our use of the docker API, any interaction with the socket is all encapsulated inside that library.

@sbueringer
Member Author

sbueringer commented Feb 16, 2022

That is correct. I was referring to the places where we configure the criSocket in YAML: https://grep.app/search?q=/var/run/containerd/containerd.sock&filter[repo.pattern][0]=kubernetes-sigs/cluster-api&filter[lang][0]=YAML

P.S. This config is used by kubeadm "inside" the nodes, afaik to configure Kubelet.
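To illustrate why the scheme matters: kubeadm and the kubelet treat a bare path as a legacy form and effectively normalize it to a unix:// URL while emitting a warning. A minimal, self-contained sketch of that normalization (an illustration of the behavior, not kubeadm's actual code):

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeCRISocket mimics, in spirit, how kubeadm treats a CRI socket
// value on Linux: a bare filesystem path is assumed to be a unix socket,
// gets the unix:// scheme prepended, and triggers a deprecation warning.
// This is an illustrative sketch, not kubeadm's actual implementation.
func normalizeCRISocket(endpoint string) (normalized string, warned bool) {
	if strings.Contains(endpoint, "://") {
		return endpoint, false // already has a URL scheme, nothing to do
	}
	return "unix://" + endpoint, true // bare path: prepend scheme and warn
}

func main() {
	normalized, warned := normalizeCRISocket("/var/run/containerd/containerd.sock")
	fmt.Println(normalized, warned)
	// prints: unix:///var/run/containerd/containerd.sock true
}
```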

@fabriziopandini
Member

@sbueringer we should tell users to update their KubeadmConfig, but I'm -1 on upgrading settings automatically because this will trigger rollouts
@vincepri @CecileRobertMichon @enxebre opinions ^^

@sbueringer
Member Author

sbueringer commented Feb 16, 2022

@sbueringer we should tell users to update their KubeadmConfig, but I'm -1 on upgrading settings automatically because this will trigger rollouts @vincepri @CecileRobertMichon @enxebre opinions ^^

+1. I think we should have a section about 1.24 in our versions.md and link it in the next releases that add support for 1.24. Maybe it's a good/sustainable pattern to document there the really breaking changes for CAPI users in Kubernetes releases. (I'm referring to the parts that are somewhat abstracted away by CAPI, like kubeadm, not generic Kubernetes changes like dropping support for old apiGroups upstream.)

@CecileRobertMichon
Contributor

Just to confirm, the scope here is only CAPD? If so, considering CAPD is not meant for production and only for development, I don't think we need to be as strict with not causing rollouts. IMO the term "user" doesn't really apply here.

@sbueringer
Member Author

Just to confirm, the scope here is only CAPD? If so, considering CAPD is not meant for production and only for development, I don't think we need to be as strict with not causing rollouts. IMO the term "user" doesn't really apply here.

The scope of what I want to fix is CAPD.

Fabrizio was referring to the fact that we don't want to automatically migrate in KCP or CABPK, so as not to trigger rollouts everywhere.

So essentially we should communicate to CAPI users that they should use a path with a URL scheme going forward, given that "the kubelet has long deprecated paths without unix:// and it might stop supporting them in the future" (kubernetes/kubeadm#2426 (comment))

@fabriziopandini
Member

Thanks, Stefan, for clarifying!
WRT CAPD, what about dropping the criSocket setting entirely? (kubeadm should auto-detect containerd, right?)

@sbueringer
Member Author

Thanks, Stefan, for clarifying! WRT CAPD, what about dropping the criSocket setting entirely? (kubeadm should auto-detect containerd, right?)

Yes, let's see if that works!
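Concretely, the experiment would be to drop the field and rely on kubeadm's CRI auto-detection (an illustrative sketch using kubeadm v1beta3 field names; the actual CAPD templates may differ):

```yaml
# Instead of pinning the CRI socket explicitly ...
initConfiguration:
  nodeRegistration:
    criSocket: unix:///var/run/containerd/containerd.sock

# ... omit criSocket entirely and let kubeadm auto-detect containerd:
initConfiguration:
  nodeRegistration: {}
```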

@vincepri
Member

Agree with @CecileRobertMichon's message above; if the scope is only CAPD, we should probably not bother and just make the breaking change. For general KCP consumption, we could consider only migrating the configuration when we detect an upgrade to 1.24?

@sbueringer
Member Author

For general KCP consumption, we could consider only migrating the configuration when we detect an upgrade to 1.24?

But I think we can't do it for KubeadmConfig(Template), which means we would produce different configs for KCP / worker nodes. What do you think about trying to make some progress on webhook warnings in controller-runtime (not sure what the current state is there) and then adding a warning for it?
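For reference, the kind of check such a webhook warning could perform can be sketched independently of the controller-runtime plumbing (the helper name and message are hypothetical, and whether webhooks could surface warnings was still an open question at this point):

```go
package main

import (
	"fmt"
	"strings"
)

// warnOnLegacyCRISocket is a hypothetical helper a validating webhook could
// run over a KubeadmConfig's nodeRegistration.criSocket value: it returns a
// warning (but no error, so nothing is rejected) when the value is set and
// lacks a URL scheme. The name and message are illustrative only.
func warnOnLegacyCRISocket(criSocket string) []string {
	if criSocket == "" || strings.Contains(criSocket, "://") {
		return nil // unset or already using a URL scheme: no warning
	}
	return []string{fmt.Sprintf(
		"criSocket %q has no URL scheme; use %q instead, as paths without unix:// are deprecated",
		criSocket, "unix://"+criSocket)}
}

func main() {
	for _, w := range warnOnLegacyCRISocket("/var/run/containerd/containerd.sock") {
		fmt.Println(w)
	}
}
```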

@sbueringer
Member Author

sbueringer commented Feb 17, 2022

@ good first issue readers :)

Concrete task that can be already done:

  • Drop the criSocket: /var/run/containerd/containerd.sock configurations in all YAMLs below test/*
    • If the tests are still green, that should be it.
    • Otherwise, let's take a look at the error together; we might have to add the unix:// prefix instead.

Meanwhile we're further discussing on this issue what we can/should do in KCP/CABPK.

@shubham14bajpai
Contributor

/assign

@sbueringer
Member Author

Thx @shubham14bajpai
We'll continue here to figure out what/if we have to do in KCP/CABPK
/reopen

@k8s-ci-robot
Contributor

@sbueringer: Reopened this issue.

In response to this:

Thx @shubham14bajpai
We'll continue here to figure out what/if we have to do in KCP/CABPK
/reopen


@k8s-ci-robot k8s-ci-robot reopened this Feb 18, 2022
@sbueringer
Member Author

sbueringer commented Feb 23, 2022

/remove-good-first-issue

CAPD YAMLs have been changed accordingly.

Let's pivot back to what we want to do for KCP / CABPK. Trying to summarize:

The old criSocket format (without URL scheme) is deprecated on multiple levels now (kubeadm (with 1.24) / kubelet)

Our options are (listing all for completeness):

  1. Document that folks should migrate to the new format in a Kubernetes 1.24-specific section in versions.md. We can link to that section when we announce Kubernetes 1.24 support
  2. Extend our validating webhooks to return a warning when the old format is found
  3. Try to automatically migrate criSocket fields:
    • This is not an option as it would trigger rollouts and probably lead to problems with GitOps tools and potentially ClusterClass too

My take is, let's implement 1. and open a new focused issue for 2.

Opinions? @CecileRobertMichon @enxebre @fabriziopandini @vincepri

@k8s-ci-robot k8s-ci-robot removed the good first issue Denotes an issue ready for a new contributor, according to the "help wanted" guidelines. label Feb 23, 2022
@fabriziopandini
Member

I'm +1 to option 1 because AFAIK most users rely on kubeadm CRI auto-detection (and it is just a warning for now).
WRT supporting warnings in webhooks, I'm +100, but I would consider that separate work, not an alternative to 1.

@fabriziopandini
Member

/milestone v1.2

@k8s-ci-robot k8s-ci-robot added this to the v1.2 milestone Mar 7, 2022
@sbueringer
Member Author

sbueringer commented Mar 8, 2022

I'm tracking this as part of #5968, so we can document that folks should migrate when we have support for Kubernetes 1.24. (currently versions.md doesn't include v1.24 which makes sense as it hasn't been released yet)
