
kubeadm join uses wrong ip #2418

Closed
mrksngl opened this issue Mar 25, 2021 · 6 comments
Labels
kind/support Categorizes issue or PR as a support question.

Comments

mrksngl commented Mar 25, 2021

Is this a BUG REPORT or FEATURE REQUEST?

FEATURE REQUEST

Versions

kubeadm version (use kubeadm version): 1.20

Environment:

What happened?

My machines have two interfaces, enp0s3 and enp0s8. The first provides the default route; the second is meant for communication between the machines, and should also be the one Kubernetes uses.
Note: my later deployment will be similar, with the machines communicating internally over a VLAN while the default route is provided by another interface. Thus, I don't think my scenario is too artificial.
Additionally, there is a virtual IP for load-balancing the control-plane endpoint, within the network of, and reachable via, enp0s8.

Using a proper configuration file for kubeadm init, it generates all certificates and configurations on the intended internal IP, i.e. that of enp0s8.
However, kubeadm join on a second node (supposed to join the control plane) fails to do the same:

  • all certificates are issued on the external address of enp0s3
  • the kube-apiserver.yaml manifest has the external address as --advertise-address and also on all probes
  • all /etc/kubernetes/*.conf files use the external address
  • when bringing up etcd, it also advertises the wrong address, bringing the etcd on my first node into a failed state

What you expected to happen?

That I could somehow specify the addresses to be used for join, much as I can for init.

Although I would rather expect kubeadm join to use the address of the interface on which the API server endpoint (in my case, the virtual IP) can be reached, rather than falling back to the interface with the default route.

How to reproduce it (as minimally and precisely as possible)?

  • Create a new master using kubeadm init and set its localAPIEndpoint.advertiseAddress to an address which is not on the network interface having the default route
  • Try to join a second node using kubeadm join
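
For illustration, the init side of this setup can be sketched as follows. This is a minimal fragment, not taken from the thread: 192.168.3.11 is a hypothetical internal address for the first node, while 192.168.3.100:6443 is the virtual IP mentioned above, used as the cluster-wide control-plane endpoint.

```yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: InitConfiguration
localAPIEndpoint:
  # advertise on the internal interface (enp0s8), not the one with the default route
  advertiseAddress: 192.168.3.11   # hypothetical internal address of the first node
  bindPort: 6443
---
apiVersion: kubeadm.k8s.io/v1beta2
kind: ClusterConfiguration
# virtual IP load-balancing the control-plane endpoint
controlPlaneEndpoint: "192.168.3.100:6443"
```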

Below is my configuration for joining. There is also a section on etcd which I omitted here.
The node's internal ip (which should be used) is 192.168.3.12, the virtual ip where the first master can be found is 192.168.3.100.

apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
nodeRegistration:
  criSocket: "unix:///var/run/crio/crio.sock"
  kubeletExtraArgs:
    node-ip: 192.168.3.12
  taints: []
controlPlane:
  localApiEndpoint:
    advertiseAddress: 192.168.3.12
    bindPort: 6443
  certificateKey: "..."
discovery:
  bootstrapToken:
    apiServerEndpoint: 192.168.3.100:6443
    token: ...
    caCertHashes:
      - ...
    unsafeSkipCAVerification: false
  timeout: 5m0s

Anything else we need to know?

neolit123 (Member) commented Mar 25, 2021

hi, while trying to figure out what may be the cause for this, i noticed:

  localApiEndpoint:

sadly, the sigs.k8s.io/yaml library that kubeadm and a number of other k8s components use has a bug (or a lack of a feature) where it ignores case sensitivity of fields even if strict unmarshaling mode is enabled (kubeadm has warnings, not errors in strict mode).

kubernetes-sigs/yaml#15

the correct field casing for both Init and JoinConfiguration is localAPIEndpoint:
https://pkg.go.dev/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#JoinControlPlane
https://pkg.go.dev/k8s.io/kubernetes/cmd/kubeadm/app/apis/kubeadm/v1beta2#InitConfiguration

i suspect that is the problem here. let me know if it works for you.
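
for reference, a minimal JoinConfiguration fragment with the correct casing might look like this (addresses taken from the config above):

```yaml
apiVersion: kubeadm.k8s.io/v1beta2
kind: JoinConfiguration
controlPlane:
  localAPIEndpoint:        # note the casing: "API", not "Api"
    advertiseAddress: 192.168.3.12
    bindPort: 6443
```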

you may ask why we are not fixing that bug - the response is "it's complicated".

/kind support

@k8s-ci-robot k8s-ci-robot added the kind/support Categorizes issue or PR as a support question. label Mar 25, 2021
mrksngl (Author) commented Mar 25, 2021

Ah ok, thanks, I didn't know it was case-sensitive.
I actually used the links you posted as a reference, but since each key in the YAML file begins lowercase (whereas they don't in the Go structs), I somehow assumed it was case-insensitive all the way.

Now that actually kind-of did the trick:

  • certificates are created as they should
  • all files point to the internal ip address

The only thing left is the manifest for the stacked etcd: it seems that my etcd configuration (provided by a kind: ClusterConfiguration in the same YAML file) is completely ignored by the join command.
Instead, it comes up with a manifest that is pretty much the same as on my first machine, i.e. it even tries to listen on that machine's address. I changed the manifest by hand and waited for etcd to come up; then everything worked so far.

@neolit123 (Member)

Now that actually kind-of did the trick:

glad it worked.

The only thing left is the manifest for the stacked etcd: it seems that my etcd configuration (provided by a kind: ClusterConfiguration) inside the same yaml file is completely ignored by the join cmdlet.
Instead, it comes up with manifest which is pretty much the same as on my first machine, i.e. it even tries to listen on the address of that machine. I changed the manifest by hand, waiting for etcd to come up, then everything worked so far.

clusterconfiguration is cluster wide. joinconfiguration does not support modifying the local etcd member flags, but if you'd like to see that you can create a separate issue explaining the details of the request. if we support this it becomes difficult to do kubeadm upgrade, because the only object we persist is the clusterconfiguration and we do not persist a joinconfiguration / kubeletconfiguration per node. so we might have to defer this proposal to a much later API version - e.g. not in v1beta3.

in the meantime you could use patches, see the --experimental-patches flag for init/join/upgrade.
it supports the same patch formats as kubectl, and you can have an etcd-foo.json patch file that would patch the kubeadm-generated etcd manifest on that node. you can then pass the same patch on upgrade.
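
as a sketch (not from this thread): a patch directory could contain a file following kubeadm's target[suffix][+patchtype].extension naming, e.g. etcd+json.yaml with an RFC 6902 patch, and be passed via kubeadm join --experimental-patches /path/to/patchdir. the command-list index and the addresses below are illustrative assumptions; they depend on the manifest kubeadm actually generates on that node.

```yaml
# etcd+json.yaml — "etcd" targets the etcd static-pod manifest,
# "+json" selects an RFC 6902 (JSON) patch
- op: replace
  # index into the etcd container's command list; check the generated
  # /etc/kubernetes/manifests/etcd.yaml for the actual position of this flag
  path: /spec/containers/0/command/3
  value: "--listen-client-urls=https://127.0.0.1:2379,https://192.168.3.12:2379"
```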

i will close this, but let me know if you have more Qs.

/close

@k8s-ci-robot (Contributor)
@neolit123: Closing this issue.


mrksngl (Author) commented Mar 25, 2021

Works with a patch, thanks for the hint.

I'm even a bit grateful it didn't work out of the box (for me), because it forced me to go through a lot of painful lessons ;)

@neolit123 (Member)

kubeadm can do a lot at the expense of not being a turn-key solution.
as far as the YAML bug goes, that's just bad, but fixing it requires breaking changes for a lot of consumers of that library, so there are trade-offs.
