Adding a new static worker node results in a preflight check failure on existing nodes #2802

Open
xmudrii opened this issue Jun 13, 2023 · 6 comments

@xmudrii (Member) commented Jun 13, 2023

What happened?

Trying to add a new static worker node results in the following error:

+ sudo kubeadm init phase preflight --config=./kubeone/cfg/master_0.yaml
W0613 19:21:47.950292   27890 initconfiguration.go:331] [config] WARNING: Ignored YAML document with GroupVersionKind kubeadm.k8s.io/v1beta3, Kind=JoinConfiguration
W0613 19:21:47.958412   27890 initconfiguration.go:119] Usage of CRI endpoints without URL scheme is deprecated and can cause kubelet errors in the future. Automatically prepending scheme "unix" to the "criSocket" with value "/run/containerd/containerd.sock". Please update your configuration!
W0613 19:21:47.958515   27890 utils.go:69] The recommended value for "clusterDNS" in "KubeletConfiguration" is: [10.96.0.10]; the provided value is: [169.254.20.10]
	[WARNING DirAvailable--var-lib-etcd]: /var/lib/etcd is not empty
error execution phase preflight: [preflight] Some fatal errors occurred:
	[ERROR Port-6443]: Port 6443 is in use
	[ERROR Port-10259]: Port 10259 is in use
	[ERROR Port-10257]: Port 10257 is in use
	[ERROR FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml]: /etc/kubernetes/manifests/kube-apiserver.yaml already exists
	[ERROR FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml]: /etc/kubernetes/manifests/kube-controller-manager.yaml already exists
	[ERROR FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml]: /etc/kubernetes/manifests/kube-scheduler.yaml already exists
	[ERROR FileAvailable--etc-kubernetes-manifests-etcd.yaml]: /etc/kubernetes/manifests/etcd.yaml already exists
	[ERROR Port-10250]: Port 10250 is in use
	[ERROR Port-2379]: Port 2379 is in use
	[ERROR Port-2380]: Port 2380 is in use
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

What happens is that joining a new static worker node triggers the WithFullInstall workflow, the same workflow that's used to provision the cluster from scratch. There we run kubeadm preflight checks on each node to verify that the VMs satisfy the requirements to be a Kubernetes node.

That works the first time we provision the cluster, but subsequent runs (e.g. when adding a new static worker node) fail on the existing nodes because the cluster is already provisioned, so the files already exist and the ports are taken by Kubernetes components.
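
One possible direction (just a sketch of the idea, not what KubeOne does today) would be to detect nodes that are already part of the cluster, e.g. by checking for /etc/kubernetes/kubelet.conf, and skip the preflight phase on those:

# hypothetical: only run preflight on nodes that haven't joined the cluster yet
if [ ! -f /etc/kubernetes/kubelet.conf ]; then
  sudo kubeadm init phase preflight --config=./kubeone/cfg/master_0.yaml
else
  echo "node already provisioned, skipping kubeadm preflight"
fi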

Expected behavior

  • Adding a new static worker node works as expected

How to reproduce the issue?

  • Provision the cluster
  • Try to add a new static worker node after the cluster is provisioned
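
Roughly (manifest file name and flags are just an example, adjust to your setup):

kubeone apply --manifest kubeone.yaml    # initial provisioning
# add the new node under staticWorkers.hosts in kubeone.yaml, then re-run:
kubeone apply --manifest kubeone.yaml    # fails with the preflight errors shown above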

What KubeOne version are you using?

Provide your KubeOneCluster manifest here (if applicable)

{
  "kubeone": {
    "major": "1",
    "minor": "6",
    "gitVersion": "v1.6.0-rc.2-36-g0536063a",
    "gitCommit": "0536063ab064601ba217c2abd41abd4c80a02477",
    "gitTreeState": "",
    "buildDate": "2023-06-13T21:16:41+02:00",
    "goVersion": "go1.20.4",
    "compiler": "gc",
    "platform": "darwin/arm64"
  },
  "machine_controller": {
    "major": "",
    "minor": "",
    "gitVersion": "8e5884837711fb0fc6b568d734f09a7b809fc28e",
    "gitCommit": "",
    "gitTreeState": "",
    "buildDate": "",
    "goVersion": "",
    "compiler": "",
    "platform": "linux/amd64"
  }
}

What cloud provider are you running on?

Baremetal

What operating system are you running in your cluster?

Ubuntu 20.04.6

Additional information

We can mitigate this issue by ignoring those failures, but in some cases those failures can point to real problems that will prevent the cluster from being provisioned.
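
If we keep running preflight and just ignore failures, it's probably safer to ignore only the checks that are expected to fail on an already-provisioned node, rather than using --ignore-preflight-errors=all, so that genuine problems still surface. Done manually, that would look something like this (check names taken from the error output above):

sudo kubeadm init phase preflight --config=./kubeone/cfg/master_0.yaml \
  --ignore-preflight-errors=Port-6443,Port-10259,Port-10257,Port-10250,Port-2379,Port-2380,FileAvailable--etc-kubernetes-manifests-kube-apiserver.yaml,FileAvailable--etc-kubernetes-manifests-kube-controller-manager.yaml,FileAvailable--etc-kubernetes-manifests-kube-scheduler.yaml,FileAvailable--etc-kubernetes-manifests-etcd.yaml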

@xmudrii xmudrii added kind/bug Categorizes issue or PR as related to a bug. sig/cluster-management Denotes a PR or issue as being assigned to SIG Cluster Management. labels Jun 13, 2023
@xmudrii xmudrii self-assigned this Jun 13, 2023
@kubermatic-bot (Contributor)

Issues go stale after 90d of inactivity.
After a further 30 days, they will turn rotten.
Mark the issue as fresh with /remove-lifecycle stale.

If this issue is safe to close now, please do so with /close.

/lifecycle stale

@kubermatic-bot kubermatic-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 11, 2023
@xmudrii (Member, Author) commented Sep 12, 2023

/remove-lifecycle stale

@kubermatic-bot kubermatic-bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 12, 2023
@kubermatic-bot (Contributor)

Issues go stale after 90d of inactivity.
After a further 30 days, they will turn rotten.
Mark the issue as fresh with /remove-lifecycle stale.

If this issue is safe to close now, please do so with /close.

/lifecycle stale

@kubermatic-bot kubermatic-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 11, 2023
@xmudrii (Member, Author) commented Dec 11, 2023

/remove-lifecycle stale

@kubermatic-bot kubermatic-bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 11, 2023
@kubermatic-bot (Contributor)

Issues go stale after 90d of inactivity.
After a further 30 days, they will turn rotten.
Mark the issue as fresh with /remove-lifecycle stale.

If this issue is safe to close now, please do so with /close.

/lifecycle stale

@kubermatic-bot kubermatic-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 8, 2024
@xmudrii (Member, Author) commented Apr 8, 2024

/remove-lifecycle stale

@kubermatic-bot kubermatic-bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 8, 2024
@xmudrii xmudrii added the priority/low Not that important. label Jun 24, 2024
@kron4eg kron4eg removed the priority/low Not that important. label Aug 14, 2024