BUG: `sealos cert` generates wrong DNS altnames for HA etcd clusters #3887

dinoallo · 2023-09-11T03:54:21Z

Sealos Version

v4.3.3 / main

How to reproduce the bug?

Steps to reproduce:

1. Create a cluster with multiple master nodes.
2. Run `sealos cert` on one master node.
3. Log into any other master node.
4. Run `openssl x509 -text -in /etc/kubernetes/pki/etcd/peer.crt -noout | less`. The DNS altnames, specifically the hostname and the IP, are those of the node that ran `sealos cert`, not those of the node being inspected.
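
The check in the last step can be made less manual by pulling the SAN entries out of the `openssl` text output and testing for the local hostname. This is a minimal sketch, assuming the usual `openssl x509 -text` layout in which the SAN list sits on the line right after the `X509v3 Subject Alternative Name` header. `extract_sans` and `has_san` are hypothetical helper names and are not part of sealos.

```shell
#!/usr/bin/env bash

# Read `openssl x509 -text` output on stdin and print the SAN entries,
# one per line (for example "DNS:node-2" or "IP Address:10.0.0.2").
# Assumption: the SAN list is the single line following the header line.
extract_sans() {
  grep -A1 'X509v3 Subject Alternative Name' \
    | tail -n 1 \
    | tr ',' '\n' \
    | sed 's/^[[:space:]]*//'
}

# Read a SAN list on stdin (one entry per line) and succeed only if
# the exact entry given as $1 is present.
has_san() {
  grep -Fxq -- "$1"
}

# Usage on a master node (certificate path taken from the report):
#   openssl x509 -text -noout -in /etc/kubernetes/pki/etcd/peer.crt \
#     | extract_sans | has_san "DNS:$(hostname)" \
#     || echo "peer.crt does not list this node's hostname"
```

On an affected node the final `has_san` check fails, because the certificate lists the hostname of the node where `sealos cert` was run.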

What is the expected behavior?

The certificates for etcd are generated correctly with DNS altnames consisting of each node's hostname and IP.

What do you see instead?

On every master node, the DNS altnames in the etcd certificates, specifically the hostname and the IP, are those of the node that ran the command rather than the node's own.

Operating environment

- Sealos version: v4.3.3 / main
- Docker version:
- Kubernetes version:
- Operating system:
- Runtime environment:
- Cluster size:
- Additional information:

Additional information

When this happens, the first etcd service complains that the other etcd services have wrong DNS altnames in their certificates, and it restarts frequently as a result. The cluster can be recovered by re-issuing the wrong etcd certificates (peer.crt, server.crt) on each node.

Steps to recover the cluster (do the following on each affected node):

1. Back up the old certificates.
2. Remove the wrong etcd certificates, i.e. peer.crt and server.crt.
3. Create a ClusterConfiguration like this:

# config.yaml
apiVersion: "kubeadm.k8s.io/v1beta3"
kind: ClusterConfiguration
etcd:
    local:
        serverCertSANs:
        - "<your-node-ip>"
        - "<your-node-hostname>"
        peerCertSANs:
        - "<your-node-ip>"
        - "<your-node-hostname>"

4. Run `kubeadm init phase certs etcd-peer --config config.yaml` and `kubeadm init phase certs etcd-server --config config.yaml` to generate new certificates.
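
The recovery steps above could be scripted per node roughly as follows. This is a sketch, not a tested procedure. `render_etcd_config` and `reissue_etcd_certs` are hypothetical helper names, and the backup location is an arbitrary choice. The sketch also removes the matching .key files, on the assumption that kubeadm regenerates a certificate and its key as a pair; the report itself only names the .crt files.

```shell
#!/usr/bin/env bash

# Print the ClusterConfiguration from the report for one node.
# $1 = node IP, $2 = node hostname
render_etcd_config() {
  local ip="$1" host="$2"
  cat <<EOF
apiVersion: "kubeadm.k8s.io/v1beta3"
kind: ClusterConfiguration
etcd:
    local:
        serverCertSANs:
        - "${ip}"
        - "${host}"
        peerCertSANs:
        - "${ip}"
        - "${host}"
EOF
}

# Back up, remove, and re-issue the etcd peer/server certificates.
# Must run as root on the affected node. $1 = node IP, $2 = node hostname
reissue_etcd_certs() {
  local ip="$1" host="$2"
  local pki=/etc/kubernetes/pki/etcd
  local backup cfg
  backup="${pki}.bak.$(date +%Y%m%d%H%M%S)"  # hypothetical backup location
  cfg="$(mktemp)" || return 1

  # Step 1: back up the whole etcd PKI directory.
  cp -a "$pki" "$backup" || return 1

  # Step 2: remove the wrong certificates (keys too: an assumption, see above).
  rm -f "$pki/peer.crt" "$pki/peer.key" "$pki/server.crt" "$pki/server.key"

  # Step 3: write the per-node ClusterConfiguration.
  render_etcd_config "$ip" "$host" > "$cfg" || return 1

  # Step 4: let kubeadm generate new certificates with the right SANs.
  kubeadm init phase certs etcd-peer --config "$cfg" &&
    kubeadm init phase certs etcd-server --config "$cfg"
}

# Usage (values are placeholders for the node being repaired):
#   reissue_etcd_certs "<your-node-ip>" "<your-node-hostname>"
```

Nothing runs until `reissue_etcd_certs` is called, so `render_etcd_config` can be used on its own first to review the generated configuration.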


dinoallo · 2023-09-11T04:17:23Z

cross reference: #3708


* fix: dnsDomain does not take effect in kubelet (#3834) (#3835)
  Signed-off-by: yangxg <yangxggo@163.com>
  Co-authored-by: yangxg <yangxggo@163.com>
  (cherry picked from commit c60b2fd)
* fix: ignore http server close error (#3854) (#3857)
  (cherry picked from commit 2d4d78b)
* fix: skip same path (#3898) (#3899)
  Co-authored-by: 榴莲榴莲 <78798447@qq.com>
  (cherry picked from commit a256283)
* fix: disable scp checksum by default (#3913) (#3919)
  Co-authored-by: fengxsong <fengxsong@outlook.com>
  (cherry picked from commit 96cb79d)
* feat: support timeout setting for lvscare http prober (#3901) (#3905)
  Co-authored-by: fengxsong <fengxsong@outlook.com>
  (cherry picked from commit 6bd5c0a)
* feature: kubefile CMD support ENV variable format (#3921) (#3942)
  Co-authored-by: Zihan Li <eden.zh.li@outlook.com>
  (cherry picked from commit 4b5f3fe)
* delete cr build for buildah (#3953) (#3954)
  Co-authored-by: yy <56745951+lingdie@users.noreply.github.com>
  (cherry picked from commit 865803c)
* delete: controller part and useless service. (#3950)
  * delete controllers and useless service.
  * delete buildah image cr part.
  * delete ci.
  * roll back
  (cherry picked from commit 076c7c7)
  Signed-off-by: cuisongliu <cuisongliu@qq.com>
* fix: using extra valid status codes when response status code greater than 400 (#3986) (#3988)
  Co-authored-by: fengxsong <fengxsong@outlook.com>
  (cherry picked from commit 7be765f)
* feature(main): add lvscare gomod (#3995)
  Signed-off-by: cuisongliu <cuisongliu@qq.com>
  (cherry picked from commit 050d70b)
* fix(main): sync cert for cert cmd
  Signed-off-by: cuisongliu <cuisongliu@qq.com>
  #3708 #3887

---------

Co-authored-by: sealos-ci-robot <109538726+sealos-ci-robot@users.noreply.github.com>
Co-authored-by: yy <56745951+lingdie@users.noreply.github.com>

dinoallo added the kind/bug Something isn't working label Sep 11, 2023

dinoallo mentioned this issue Sep 11, 2023

Is there something wrong with etcd certs #3708

Closed

cuisongliu linked a pull request Sep 11, 2023 that will close this issue

fix(main): sync cert for cert cmd #3891

Merged

zzjin closed this as completed in #3891 Sep 18, 2023

cuisongliu pushed a commit to cuisongliu/sealos that referenced this issue Sep 29, 2023

fix(main): sync cert for cert cmd

a6e3d03

Signed-off-by: cuisongliu <cuisongliu@qq.com> labring#3708 labring#3887
