etcd snapshot cleanup fails if node name changes #3714

Closed
thirdeyenick opened this issue Dec 15, 2022 · 8 comments

@thirdeyenick

thirdeyenick commented Dec 15, 2022

Environmental Info:

RKE2 Version:
rke2 version v1.21.14+rke2r1 (514ae51)
go version go1.16.14b7

Node(s) CPU architecture, OS, and Version:

Linux testmachine 5.15.70-flatcar #1 SMP Thu Oct 27 12:53:14 -00 2022 x86_64 Intel(R) Xeon(R) Silver 4210R CPU @ 2.40GHz GenuineIntel GNU/Linux

Cluster Configuration:
We have multiple rke2 clusters; all of them have at least 3 control plane nodes and multiple workers.

Describe the bug:
We have multiple rke2 clusters and all of them have automatic etcd snapshots enabled (taken every 5 hours). We also configured s3 uploading of those snapshots. Recently, we found that no s3 snapshots are uploaded anymore. We investigated the issue and found the following rke2-server output:

Dec 14 05:27:01 testmachine[1665]: time="2022-12-14T05:27:01Z" level=error msg="failed to save local snapshot data to configmap: ConfigMap \"rke2-etcd-snapshots\" is invalid: []: Too long: must have at most 1048576 bytes"

I checked the code and found that rke2 leverages the etcd snapshot capabilities from k3s for this. A function is executed periodically on all control plane nodes: it takes local snapshots, uploads them to s3 (if configured), and reconciles a configmap containing all snapshots and their metadata. Looking at the code, the reconciliation of that "sync" configmap is keyed on the name of the node which takes the etcd snapshot. The same goes for the s3 retention functions (only old objects whose names contain the current node name are cleaned up). As we replace all nodes in our clusters whenever there is a new flatcar version, the node names change quite often. This leads to orphaned entries in the configmap and orphaned objects in the s3 buckets (although the latter could be worked around with a lifecycle policy).
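
To make the failure mode concrete, here is a minimal Go sketch (not the actual k3s/rke2 code; all names are made up) of retention that only considers snapshots whose names contain the current node name. Snapshots taken under a previous hostname are never even looked at, so they are never pruned:

package main

import (
    "fmt"
    "sort"
    "strings"
)

// pruneByNodeName keeps at most `retention` of the snapshots whose name
// contains nodeName; snapshots taken under an old node name are skipped
// entirely, so they are never deleted.
func pruneByNodeName(snapshots []string, nodeName string, retention int) (keep, deleted, orphaned []string) {
    var mine []string
    for _, s := range snapshots {
        if strings.Contains(s, nodeName) {
            mine = append(mine, s)
        } else {
            orphaned = append(orphaned, s) // never considered for cleanup
        }
    }
    // Names share the same prefix and end in a unix timestamp, so a lexical
    // sort orders them oldest first.
    sort.Strings(mine)
    if len(mine) > retention {
        deleted = mine[:len(mine)-retention]
        mine = mine[len(mine)-retention:]
    }
    return mine, deleted, orphaned
}

func main() {
    snaps := []string{
        "etcd-snapshot-old-node-1691777403",
        "etcd-snapshot-old-node-1691777464",
        "etcd-snapshot-new-node-1691778420",
        "etcd-snapshot-new-node-1691778483",
        "etcd-snapshot-new-node-1691779380",
    }
    keep, deleted, orphaned := pruneByNodeName(snaps, "new-node", 2)
    fmt.Println("kept:    ", keep)
    fmt.Println("deleted: ", deleted)
    fmt.Println("orphaned:", orphaned)
}

With a retention of 2, this keeps the two newest "new-node" snapshots, deletes the oldest one, and leaves both "old-node" snapshots untouched, which is exactly the orphaning we see in the configmap and in s3.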

Are there any ideas what could be done to fix this?

I found this bug report in the rancher repo which describes the too-large configmap issue.

Steps To Reproduce:

Enable etcd snapshots and s3 uploading. After replacing the control plane nodes with new machines (and thus new node names), there will be orphaned entries in the 'rke2-etcd-snapshots' configmap. Once the configmap grows too large, no new snapshots will be uploaded to s3 anymore.
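
For reference, a minimal config.yaml sketch for the servers (all values are placeholders; the snapshot/s3 flags are the same ones used in the validation comment further down, and the cron expression assumes a 5-hourly schedule like ours):

etcd-snapshot-schedule-cron: "0 */5 * * *"
etcd-snapshot-retention: 5
etcd-s3: true
etcd-s3-access-key: <access-key>
etcd-s3-secret-key: <secret-key>
etcd-s3-bucket: <bucket-name>
etcd-s3-region: <region>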

Expected behavior:
The sync configmap should only contain snapshots from the current nodes of the cluster; entries for node names that no longer exist should be removed.

@brandond
Contributor

I'll talk this over with the team. On the S3 side, the correct behavior is probably to retain N snapshots regardless of whether or not they match the current node name.
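
For illustration only (not the actual implementation), a rough Go sketch of count-based retention that ignores the node name, assuming snapshot names end in a unix timestamp as in the listings further down:

package main

import (
    "fmt"
    "sort"
    "strconv"
    "strings"
)

// pruneByCount keeps the newest `retention` snapshots by their trailing unix
// timestamp, regardless of which node name appears in the snapshot name.
func pruneByCount(snapshots []string, retention int) (keep, deleted []string) {
    ts := func(name string) int64 {
        n, _ := strconv.ParseInt(name[strings.LastIndex(name, "-")+1:], 10, 64)
        return n
    }
    sorted := append([]string(nil), snapshots...)
    sort.Slice(sorted, func(i, j int) bool { return ts(sorted[i]) > ts(sorted[j]) }) // newest first
    if len(sorted) <= retention {
        return sorted, nil
    }
    return sorted[:retention], sorted[retention:]
}

func main() {
    keep, deleted := pruneByCount([]string{
        "etcd-snapshot-old-node-1691777403",
        "etcd-snapshot-new-node-1691778420",
        "etcd-snapshot-new-node-1691778483",
    }, 2)
    fmt.Println("keep:  ", keep)
    fmt.Println("delete:", deleted)
}

Sorting by the trailing timestamp instead of filtering by node name means snapshots from renamed or removed nodes still get aged out.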

cc @briandowns @cwayne18

@brandond
Contributor

brandond commented Apr 3, 2023

Still needs to be worked on.

@riuvshyn

It not only fails to upload new backups, but also fills up the masters' disk space with local snapshots that are not cleaned up once the configmap grows too large and fails to apply. This leads to an incident, as it puts the master nodes under disk pressure.

@brandond
Contributor

@riuvshyn we are working on that separately from the snapshot list configmap issue. This issue will serve only to track snapshot cleanup handling only those snapshots whose names contain the current node's hostname.

@vitorsavian
Member

/backport v1.26.8+rke2r1

@vitorsavian
Member

/backport v1.25.13+rke2r1

@vitorsavian
Member

/backport v1.24.17+rke2r1

@aganesh-suse

aganesh-suse commented Aug 14, 2023

Validated on master branch with commit c3ec545

Environment Details

Infrastructure

  • Cloud
  • Hosted

Node(s) CPU architecture, OS, and Version:

cat /etc/os-release | grep PRETTY
PRETTY_NAME="Ubuntu 22.04.2 LTS"

Cluster Configuration:

Server config: 3 servers (etcd + control plane) / 1 agent

Config.yaml:

Main ETCD SERVER (+CONTROL PLANE) CONFIG:

token: blah
node-name: "server1"
etcd-snapshot-retention: 2
etcd-snapshot-schedule-cron: "* * * * *"
etcd-s3: true
etcd-s3-access-key: xxx
etcd-s3-secret-key: xxx
etcd-s3-bucket: s3-bucket-name
etcd-s3-folder: rke2snap/commit-setup
etcd-s3-region: us-east-2
write-kubeconfig-mode: "0644"

Sample Secondary Etcd, control plane config.yaml:

token: blah
server: https://x.x.x.x:9345
node-name: "server3"
write-kubeconfig-mode: "0644"

AGENT CONFIG:

token: blah
server: https://x.x.x.x:9345
node-name: "agent1"

Additional files

Testing Steps

  1. Create the config dir and place the config.yaml file on the server/agent nodes:
$ sudo mkdir -p /etc/rancher/rke2 && sudo cp config.yaml /etc/rancher/rke2

    Note: First round node-names:
    <version|commit>-server1
    server2
    server3
    agent1
  2. Install RKE2.
    Using Commit:

curl -sfL https://get.rke2.io | sudo INSTALL_RKE2_COMMIT='c3ec545e153916bee18b2ce0fc000eb538a0790d' INSTALL_RKE2_TYPE='server' INSTALL_RKE2_METHOD=tar sh -

    Using Version:

curl -sfL https://get.rke2.io | sudo INSTALL_RKE2_VERSION='v1.27.4+rke2r1' INSTALL_RKE2_TYPE='server' INSTALL_RKE2_METHOD=tar sh -

  3. Wait for 2 minutes.
    Note: A snapshot gets created every 1 minute (etcd-snapshot-schedule-cron: "* * * * *") and retention is 2 snapshots (etcd-snapshot-retention: 2).
    Reference for the cron job format: https://cloud.google.com/scheduler/docs/configuring/cron-job-schedules
    After 2 minutes, 2 snapshots are created with names like etcd-snapshot-<node-name>-xxxx (e.g. etcd-snapshot-server1-2-xxxx if node-name: server1-2 is set in config.yaml).
  4. Check the outputs of:
sudo ls -lrt /var/lib/rancher/rke2/server/db/snapshots
sudo rke2 etcd-snapshot list

    4a. Also check the s3 bucket/folder in AWS to see the snapshots listed.
  5. Update the node-name in the config.yaml. Second round node-names:
    <version|commit>-server1-<suffix1>
    server2-<suffix1>
    server3-<suffix1>
    agent1-<suffix1>
  6. Restart the rke2 service on all nodes:

sudo systemctl restart rke2-server

  7. Wait for 2 more minutes and check the snapshot list:
sudo ls -lrt /var/lib/rancher/rke2/server/db/snapshots
sudo rke2 etcd-snapshot list

    7a. Also check the s3 bucket/folder in AWS to see the snapshots listed.

  8. Repeat steps 5 through 7 once more. Third round node-names:
    <version|commit>-server1-<suffix1>-<suffix2>
    server2-<suffix1>-<suffix2>
    server3-<suffix1>-<suffix2>
    agent1-<suffix1>-<suffix2>

Replication Results:

  • rke2 version used for replication:

SETUP:

$ rke2 -v
rke2 version v1.27.4+rke2r1 (3aaa57a9608206d95eeb9ce3f79c0ec2ea912b20)
go version go1.20.5 X:boringcrypto

Node-names in order of update for the main etcd server:

version-setup-server1               
version-setup-server1-25477
version-setup-server1-24232-25477           

Final output of snapshot list - after multiple node name changes:

$ sudo ls -lrt /var/lib/rancher/rke2/server/db/snapshots 
total 64320
-rw------- 1 root root  8151072 Aug 11 18:10 etcd-snapshot-version-setup-server1-1691777403
-rw------- 1 root root  8364064 Aug 11 18:11 etcd-snapshot-version-setup-server1-1691777464
-rw------- 1 root root 14946336 Aug 11 18:27 etcd-snapshot-version-setup-server1-25477-1691778420
-rw------- 1 root root 14946336 Aug 11 18:28 etcd-snapshot-version-setup-server1-25477-1691778483
-rw------- 1 root root  9715744 Aug 11 18:43 etcd-snapshot-version-setup-server1-24232-25477-1691779380
-rw------- 1 root root  9715744 Aug 11 18:44 etcd-snapshot-version-setup-server1-24232-25477-1691779444
$ sudo rke2 etcd-snapshot list 
time="2023-08-11T18:44:54Z" level=warning msg="Unknown flag --token found in config.yaml, skipping\n"
time="2023-08-11T18:44:54Z" level=warning msg="Unknown flag --etcd-snapshot-retention found in config.yaml, skipping\n"
time="2023-08-11T18:44:54Z" level=warning msg="Unknown flag --etcd-snapshot-schedule-cron found in config.yaml, skipping\n"
time="2023-08-11T18:44:54Z" level=warning msg="Unknown flag --write-kubeconfig-mode found in config.yaml, skipping\n"
time="2023-08-11T18:44:54Z" level=info msg="Checking if S3 bucket xxx exists"
time="2023-08-11T18:44:54Z" level=info msg="S3 bucket xxx exists"
Name                                                       Size     Created
etcd-snapshot-version-setup-server1-1691777403             8151072  2023-08-11T18:10:05Z
etcd-snapshot-version-setup-server1-1691777464             8364064  2023-08-11T18:11:05Z
etcd-snapshot-version-setup-server1-24232-25477-1691779380 9715744  2023-08-11T18:43:02Z
etcd-snapshot-version-setup-server1-24232-25477-1691779444 9715744  2023-08-11T18:44:05Z
etcd-snapshot-version-setup-server1-25477-1691778420       14946336 2023-08-11T18:27:01Z
etcd-snapshot-version-setup-server1-25477-1691778483       14946336 2023-08-11T18:28:05Z

As we can see above, previous snapshots with different node-names are still listed and not cleaned up.

Validation Results:

  • rke2 version used for validation:
rke2 -v
rke2 version v1.27.4+dev.c3ec545e (c3ec545e153916bee18b2ce0fc000eb538a0790d)
go version go1.20.5 X:boringcrypto

Node names in order of update for the main etcd server:

commit-setup-server1              
commit-setup-server1-23678        
commit-setup-server1-6695-23678   

After updating node-names 2 times, the snapshots listed are:

$ sudo ls -lrt /var/lib/rancher/rke2/server/db/snapshots
total 31928
-rw------- 1 root root 16343072 Aug 14 23:26 etcd-snapshot-commit-setup-server1-6695-23678-1692055563
-rw------- 1 root root 16343072 Aug 14 23:27 etcd-snapshot-commit-setup-server1-6695-23678-1692055620
$ sudo rke2 etcd-snapshot list
WARN[0000] Unknown flag --token found in config.yaml, skipping
WARN[0000] Unknown flag --etcd-snapshot-retention found in config.yaml, skipping
WARN[0000] Unknown flag --etcd-snapshot-schedule-cron found in config.yaml, skipping
WARN[0000] Unknown flag --write-kubeconfig-mode found in config.yaml, skipping
WARN[0000] Unknown flag --cni found in config.yaml, skipping
INFO[0000] Checking if S3 bucket xxx exists
INFO[0000] S3 bucket xxx exists
Name                                                     Size     Created
etcd-snapshot-commit-setup-server1-6695-23678-1692055681 16343072 2023-08-14T23:28:03Z
etcd-snapshot-commit-setup-server1-6695-23678-1692055620 16343072 2023-08-14T23:27:02Z

As we can see, the previous snapshots with old node-names are no longer retained and get cleaned up.
