Replies: 14 comments
-
It happened again when I got an
Which I don't understand why or how? Updated logs:
-
I had similar problems when running etcd on (industrial) SD cards; they can't really handle etcd because it is write-intensive. After switching to eMMC, etcd was happy.
-
It's not running on an SD card; it's running off an external SSD.
-
OK. What I also had to do to make it stable was cordon the master nodes.
-
That seems weird... I also have only one master. :)
-
I'm running it stably on 4 Fedora RPi4s and 4 ODROID-N2+ boards with 3 master nodes. But I just found the following, and will give Raspberry Pi OS another try: Unfortunately, I don't have any other ideas.
-
If you have only a single server, there's not really any point in using etcd - especially on Raspberry Pi, where CPU and IO are already somewhat constrained. You can't go back to SQLite from etcd, but you might consider rebuilding the cluster at some point without etcd. The logs show that your storage (even if it is an SSD) is not able to keep up, and it is frequently taking several seconds for etcd to sync your changes to disk - to the point where leader elections are timing out. This is almost exclusively caused by high storage If you're on a node with older iptables, you might take a look at the
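
For anyone who wants a rough number for how long the disk takes to sync small writes, here is a minimal Python sketch. It is not how etcd or K3s measure anything. It appends small chunks to a scratch file and times `os.fsync` after each one, which only approximates the pattern of a write-ahead log. The function names, the `PROBE_DIR` environment variable, and the write count and payload size are all made up for illustration. etcd itself uses `fdatasync`, so treat the result as a ballpark figure.

```python
import os
import statistics
import tempfile
import time


def measure_fsync_latency(directory, writes=200, payload_size=2048):
    """Append small chunks to a scratch file in `directory`, timing each fsync.

    Returns a list of per-write fsync latencies in seconds.
    """
    payload = os.urandom(payload_size)
    latencies = []
    # The scratch file lives on the filesystem being probed and is removed afterwards.
    fd, path = tempfile.mkstemp(dir=directory, prefix="fsync-probe-")
    try:
        for _ in range(writes):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # block until the data has been handed to the device
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return latencies


def summarize(latencies):
    """Return median, approximate 99th percentile, and max latency in milliseconds."""
    ordered = sorted(latencies)
    p99_index = min(len(ordered) - 1, int(len(ordered) * 0.99))
    return {
        "median_ms": statistics.median(ordered) * 1000,
        "p99_ms": ordered[p99_index] * 1000,
        "max_ms": ordered[-1] * 1000,
    }


if __name__ == "__main__":
    # PROBE_DIR is a hypothetical variable for this sketch; point it at a
    # directory on the same disk as the datastore. Defaults to the temp dir.
    target = os.environ.get("PROBE_DIR", tempfile.gettempdir())
    print(target, summarize(measure_fsync_latency(target)))
```

As a rule of thumb, the etcd documentation suggests the 99th percentile of WAL sync duration should stay below roughly 10 ms. Results in the hundreds of milliseconds or in seconds while other workloads are running would be consistent with the timeouts described above.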
-
The crashing was already happening when running SQLite (not sure why, though), and since SQLite doesn't keep DB backups, I thought etcd would be the better choice for that reason. It seems that I should move the master off the Pi, or rather run it on a CM4 with eMMC storage?
-
I have personally run K3s on a Pi 4B with SSD using etcd with no issues. I have also used SQLite on SDHC without issues. However, in both cases I made sure that IO-intensive workloads were not using the same disk as the datastore - I put everything on NFS PVCs and minimized large image pull operations. The key is just to make sure that there's not a lot of other IO that will need to be flushed before the datastore write can be completed.
-
It seems that Longhorn was scheduled on the master, which is probably a bad thing, so I evicted it via toleration. Let's see if that helps.
-
Oh yeah, that would do it. If you're going to run Longhorn, try to put it on a separate physical disk from the datastore to avoid competing with it for IOPS.
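
A quick way to sanity-check this is to compare the device IDs of the two directories. This is a minimal Python sketch, with assumptions: the two paths are what I believe are the default K3s datastore and Longhorn data locations, and they may differ on your setup. The helper name `same_filesystem` is made up for illustration. Note that a matching `st_dev` means the paths share a filesystem, but a differing `st_dev` does not prove separate physical disks, since two partitions on one SSD also differ.

```python
import os


def same_filesystem(path_a, path_b):
    """Return True if both paths live on the same mounted filesystem."""
    return os.stat(path_a).st_dev == os.stat(path_b).st_dev


if __name__ == "__main__":
    # Assumed default locations; adjust to match your installation.
    datastore_dir = "/var/lib/rancher/k3s/server/db"
    longhorn_dir = "/var/lib/longhorn"
    try:
        if same_filesystem(datastore_dir, longhorn_dir):
            print("Datastore and Longhorn share a filesystem; expect IO contention.")
        else:
            print("Different filesystems; check they are also different physical disks.")
    except OSError as err:
        # Raised when a path is missing or unreadable on this node.
        print(f"Could not compare paths: {err}")
```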
-
Two USB enclosures with SSDs would probably end up competing for USB bandwidth (at least on a Pi 4). :)
-
I got myself some RasPiKeys, which are eMMC storage keys that go in the SD card slot.
-
I'm going to convert this to a discussion just in case someone runs into the same thing. Doesn't appear to be a clear K3s bug though.
-
Environmental Info:
K3s Version: v1.24.4+k3s1 (c3f830e)
go version go1.18.1
Node(s) CPU architecture, OS, and Version:
arm64, Ubuntu 22.04 (all except two)
Linux k8s-master1 5.15.0-1021-raspi #23-Ubuntu SMP PREEMPT Fri Nov 25 15:27:43 UTC 2022 aarch64 aarch64 aarch64 GNU/Linux
amd64, Ubuntu 22.04 x 2
Linux k8s-worker-amd64-0 5.15.0-56-generic #62-Ubuntu SMP Tue Nov 22 19:54:14 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
Cluster Configuration:
1 server, 5 agents
Describe the bug:
My k3s apiserver seems to frequently crash / auto-restart
Steps To Reproduce:
Expected behavior:
I would expect it to not keep frequently crashing.
Actual behavior:
Frequent crashes / auto-restarts of service
Additional context / logs:
k3s.log