
An error is reported when you run the k3s etcd-snapshot save command. #11411

Closed
moseszane168 opened this issue Dec 4, 2024 · 9 comments

moseszane168 commented Dec 4, 2024

Environmental Info:
K3s Version:

[root@ecs-6a0f ~]# k3s --version
k3s version v1.30.6+k3s1 (1829eaa)
go version go1.22.8

Node(s) CPU architecture, OS, and Version:

[root@ecs-6a0f ~]# uname -m
aarch64

[root@ecs-6a0f ~]# uname -s
Linux

[root@ecs-6a0f ~]# lsb_release -a
LSB Version: n/a
Distributor ID: HuaweiCloudEulerOS
Description: Huawei Cloud EulerOS release 2.0 (West Lake)
Release: 2.0
Codename: WestLake

[root@ecs-6a0f ~]# uname -r
5.10.0-182.0.0.95.r1941_123.hce2.aarch64

[root@ecs-6a0f ~]# lscpu
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 8
On-line CPU(s) list: 0-7
Vendor ID: HiSilicon
BIOS Vendor ID: QEMU
BIOS Model name: virt-6.2
Model: 0
Thread(s) per core: 1
Core(s) per socket: 8
Socket(s): 1
Stepping: 0x0
Frequency boost: disabled
CPU max MHz: 2900.0000
CPU min MHz: 2900.0000
BogoMIPS: 200.00
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm ssbs sb dcpodp flagm2 frint svei8mm svef32mm svef64mm svebf16 i8mm bf16 dgh rng ecv
Caches (sum of all):
L1d: 512 KiB (8 instances)
L1i: 512 KiB (8 instances)
L2: 4 MiB (8 instances)
L3: 32 MiB (1 instance)
NUMA:
NUMA node(s): 1
NUMA node0 CPU(s): 0-7
Vulnerabilities:
Gather data sampling: Not affected
Itlb multihit: Not affected
L1tf: Not affected
Mds: Not affected
Meltdown: Not affected
Mmio stale data: Not affected
Reg file data sampling: Not affected
Retbleed: Not affected
Spec rstack overflow: Not affected
Spec store bypass: Vulnerable
Spectre v1: Mitigation; __user pointer sanitization
Spectre v2: Not affected
Srbds: Not affected
Tsx async abort: Not affected

Cluster Configuration:

1 server

Describe the bug:

[root@ecs-6a0f ~]# sudo k3s etcd-snapshot save --etcd-token=K107fc70c4b633ebb32eeb30daad73d8b5a50153405a16945a0da7bf13c7c3991ad::server:12345
FATA[0000] see server log for details: Unauthorized

Steps To Reproduce:

  • Installed K3s:
    curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | \
      INSTALL_K3S_MIRROR=cn \
      INSTALL_K3S_SKIP_SELINUX_RPM=true \
      INSTALL_K3S_SELINUX_WARN=true \
      K3S_TOKEN=12345 sh -s - \
      --system-default-registry=registry.cn-hangzhou.aliyuncs.com
  • Save Snapshot:
    sudo k3s etcd-snapshot save --etcd-token=K107fc70c4b633ebb32eeb30daad73d8b5a50153405a16945a0da7bf13c7c3991ad::server:12345

Expected behavior:

The k3s etcd-snapshot save command should create an etcd snapshot without encountering errors.

Actual behavior:

As described above

Additional context / logs:

[root@ecs-6a0f ~]# sudo k3s etcd-snapshot save --etcd-token=K107fc70c4b633ebb32eeb30daad73d8b5a50153405a16945a0da7bf13c7c3991ad::server:12345
FATA[0000] see server log for details: Unauthorized

[root@ecs-6a0f ~]# sudo k3s etcd-snapshot save
FATA[0000] see server log for details: Unauthorized

@dereknola
Member

Don't use the full server token when taking the snapshot. You defined the token as 12345, so use that value when calling snapshot save: --token 12345.
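As a sketch of the suggested invocation (assuming the token was set with K3S_TOKEN=12345 at install time, as in the reproduction steps above):

```
# Pass the short token value chosen at install time,
# not the full K10...::server:... token string:
sudo k3s etcd-snapshot save --token 12345
```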

@brandond
Member

brandond commented Dec 4, 2024

The issue here is that you didn't start the server with --cluster-init, so etcd is not enabled, which means you can't take a snapshot.

This is a duplicate of
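For anyone else hitting this: a minimal config sketch for enabling the embedded etcd datastore via the standard k3s config file (the file must exist before the server first starts; alternatively, pass --cluster-init on the server command line):

```
# /etc/rancher/k3s/config.yaml
cluster-init: true   # start the embedded etcd datastore instead of the default sqlite
```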

@brandond brandond closed this as completed Dec 4, 2024
@github-project-automation github-project-automation bot moved this from New to Done Issue in K3s Development Dec 4, 2024
@moseszane168
Author

curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | \
  INSTALL_K3S_MIRROR=cn \
  INSTALL_K3S_SKIP_SELINUX_RPM=true \
  INSTALL_K3S_SELINUX_WARN=true \
  K3S_TOKEN=12345 sh -s - \
  --system-default-registry=registry.cn-hangzhou.aliyuncs.com
[INFO] Finding release for channel stable
[INFO] Using v1.30.6+k3s1 as release
[INFO] Downloading hash rancher-mirror.rancher.cn/k3s/v1.30.6-k3s1/sha256sum-arm64.txt
[INFO] Downloading binary rancher-mirror.rancher.cn/k3s/v1.30.6-k3s1/k3s-arm64
[INFO] Verifying binary download
[INFO] Installing k3s to /usr/local/bin/k3s
[INFO] Skipping installation of SELinux RPM
[INFO] Creating /usr/local/bin/kubectl symlink to k3s
[INFO] Creating /usr/local/bin/crictl symlink to k3s
[INFO] Skipping /usr/local/bin/ctr symlink to k3s, command exists in PATH at /usr/bin/ctr
[INFO] Creating killall script /usr/local/bin/k3s-killall.sh
[INFO] Creating uninstall script /usr/local/bin/k3s-uninstall.sh
[INFO] env: Creating environment file /etc/systemd/system/k3s.service.env
[INFO] systemd: Creating service file /etc/systemd/system/k3s.service
[INFO] systemd: Enabling k3s unit
Created symlink /etc/systemd/system/multi-user.target.wants/k3s.service → /etc/systemd/system/k3s.service.
[INFO] systemd: Starting k3s
Job for k3s.service failed because the control process exited with error code.
See "systemctl status k3s.service" and "journalctl -xeu k3s.service" for details.

@moseszane168
Author

mkdir /etc/rancher/k3s
vim /etc/rancher/k3s/config.yaml
cluster-init: true
bind-address: "127.0.0.l" # replace with your valid IPv4 address
advertise-address: "127.0.0.1" # replace with your valid IPv4 address

@moseszane168
Author

The issue here is that you didn't start the server with --cluster-init, so etcd is not enabled, which means you can't take a snapshot.

This is a duplicate of


@moseszane168
Author

The issue here is that you didn't start the server with --cluster-init, so etcd is not enabled, which means you can't take a snapshot.

This is a duplicate of

curl -sfL https://rancher-mirror.rancher.cn/k3s/k3s-install.sh | \
  INSTALL_K3S_MIRROR=cn \
  INSTALL_K3S_SKIP_SELINUX_RPM=true \
  INSTALL_K3S_SELINUX_WARN=true \
  INSTALL_K3S_EXEC="--bind-address 192.168.0.199 --advertise-address 192.168.0.199 --cluster-init" \
  K3S_TOKEN=12345 sh -s - \
  --system-default-registry=registry.cn-hangzhou.aliyuncs.com

@moseszane168
Author

Job for k3s.service failed because the control process exited with error code.

The issue here is that you didn't start the server with --cluster-init, so etcd is not enabled, which means you can't take a snapshot.

This is a duplicate of

[root@ecs-6a0f ~]# journalctl -xeu k3s.service
Dec 05 10:09:44 ecs-6a0f k3s[396318]: time="2024-12-05T10:09:44+08:00" level=info msg="etcd temporary data store connection OK"
Dec 05 10:09:44 ecs-6a0f k3s[396318]: time="2024-12-05T10:09:44+08:00" level=info msg="Reconciling bootstrap data between datastore and disk"
Dec 05 10:09:44 ecs-6a0f k3s[396318]: time="2024-12-05T10:09:44+08:00" level=info msg="stopping etcd"
Dec 05 10:09:44 ecs-6a0f k3s[396318]: {"level":"info","ts":"2024-12-05T10:09:44.445411+0800","caller":"embed/etcd.go:375","msg":"closing etcd server">
Dec 05 10:09:44 ecs-6a0f k3s[396318]: {"level":"info","ts":"2024-12-05T10:09:44.446639+0800","caller":"etcdserver/server.go:1513","msg":"skipped lead>
Dec 05 10:09:44 ecs-6a0f k3s[396318]: {"level":"info","ts":"2024-12-05T10:09:44.449657+0800","caller":"embed/etcd.go:579","msg":"stopping serving pee>
Dec 05 10:09:44 ecs-6a0f k3s[396318]: {"level":"info","ts":"2024-12-05T10:09:44.450298+0800","caller":"embed/etcd.go:584","msg":"stopped serving peer>
Dec 05 10:09:44 ecs-6a0f k3s[396318]: {"level":"info","ts":"2024-12-05T10:09:44.450313+0800","caller":"embed/etcd.go:377","msg":"closed etcd server",>
Dec 05 10:09:44 ecs-6a0f k3s[396318]: time="2024-12-05T10:09:44+08:00" level=info msg="Starting etcd for existing cluster member"
Dec 05 10:09:44 ecs-6a0f k3s[396318]: {"level":"info","ts":"2024-12-05T10:09:44.456783+0800","caller":"embed/etcd.go:127","msg":"configuring peer lis>
Dec 05 10:09:44 ecs-6a0f k3s[396318]: {"level":"info","ts":"2024-12-05T10:09:44.456815+0800","caller":"embed/etcd.go:494","msg":"starting with peer T>
Dec 05 10:09:44 ecs-6a0f k3s[396318]: time="2024-12-05T10:09:44+08:00" level=info msg=start
Dec 05 10:09:44 ecs-6a0f k3s[396318]: time="2024-12-05T10:09:44+08:00" level=info msg="schedule, now=2024-12-05T10:09:44+08:00, entry=1, next=2024-12>
Dec 05 10:09:44 ecs-6a0f k3s[396318]: {"level":"error","ts":"2024-12-05T10:09:44.456845+0800","caller":"embed/etcd.go:536","msg":"creating peer liste>
Dec 05 10:09:44 ecs-6a0f k3s[396318]: {"level":"info","ts":"2024-12-05T10:09:44.45964+0800","caller":"embed/etcd.go:375","msg":"closing etcd server",>
Dec 05 10:09:44 ecs-6a0f k3s[396318]: {"level":"info","ts":"2024-12-05T10:09:44.459649+0800","caller":"embed/etcd.go:377","msg":"closed etcd server",>
Dec 05 10:09:44 ecs-6a0f k3s[396318]: time="2024-12-05T10:09:44+08:00" level=fatal msg="starting kubernetes: preparing server: start managed database>
Dec 05 10:09:44 ecs-6a0f systemd[1]: k3s.service: Main process exited, code=exited, status=1/FAILURE
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ An ExecStart= process belonging to unit k3s.service has exited.
░░
░░ The process' exit code is 'exited' and its exit status is 1.
Dec 05 10:09:44 ecs-6a0f systemd[1]: k3s.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ The unit k3s.service has entered the 'failed' state with result 'exit-code'.
Dec 05 10:09:44 ecs-6a0f systemd[1]: k3s.service: Unit process 396083 (containerd-shim) remains running after unit stopped.
Dec 05 10:09:44 ecs-6a0f systemd[1]: k3s.service: Unit process 396084 (containerd-shim) remains running after unit stopped.
Dec 05 10:09:44 ecs-6a0f systemd[1]: k3s.service: Unit process 396085 (containerd-shim) remains running after unit stopped.
Dec 05 10:09:44 ecs-6a0f systemd[1]: k3s.service: Unit process 396109 (containerd-shim) remains running after unit stopped.
Dec 05 10:09:44 ecs-6a0f systemd[1]: Failed to start Lightweight Kubernetes.
░░ Subject: A start job for unit k3s.service has failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░
░░ A start job for unit k3s.service has finished with a failure.
░░
░░ The job identifier is 28791 and the job result is failed.

@moseszane168
Author

The issue here is that you didn't start the server with --cluster-init, so etcd is not enabled, which means you can't take a snapshot.

This is a duplicate of

How Do I Disable IPv6 Binding?

A scenario that doesn't work:
[root@ecs-6a0f ~]# lsof -i :2380
COMMAND   PID    USER FD TYPE DEVICE   SIZE/OFF NODE NAME
pd-server 409409 tidb 8u IPv6 41654496 0t0      TCP *:etcd-server (LISTEN)

A scenario that works:
[root@ecs-6a0f ~]# lsof -i :2380
COMMAND   PID    USER FD  TYPE DEVICE   SIZE/OFF NODE NAME
k3s-serve 409584 root 9u  IPv4 41735231 0t0      TCP localhost:etcd-server (LISTEN)
k3s-serve 409584 root 10u IPv4 41735232 0t0      TCP ecs-6a0f:etcd-server (LISTEN)
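A port conflict like the pd-server one above (another process holding etcd's peer port) can be detected before starting k3s. A small sketch, not part of k3s itself, that simply tries to bind the standard etcd ports:

```python
import socket

def port_in_use(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if something is already listening on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            s.bind((host, port))
        except OSError:
            return True  # bind failed: another process holds the port
    return False

if __name__ == "__main__":
    # 2379 = etcd client port, 2380 = etcd peer port
    for port in (2379, 2380):
        print(port, "in use" if port_in_use(port) else "free")
```

If 2380 reports "in use" before k3s starts, stop or reconfigure the conflicting service (pd-server here) first.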

@brandond
Member

brandond commented Dec 5, 2024

You've trimmed all your log lines to terminal width so I have no idea what the errors are here.

Why can't it bind to the wildcard address? I don't think ipv6 is your problem. Are you trying to run multiple copies of k3s on the same host? It looks like the service is failing to start because you already have k3s running in a terminal somewhere.
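One way to capture full, untruncated log lines is to bypass the pager and redirect to a file, e.g.:

```
journalctl -u k3s.service --no-pager > /tmp/k3s.log
```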
