Wait for the cluster version to reach a non-default value to avoid errors while the cluster is bootstrapping. #19068

Merged
1 commit merged on Dec 16, 2024

Conversation

serathius
Member

I don't think those errors are useful or understandable. The function UpdateStorageVersionIfNeeded was supposed to wait until the cluster version is set; however, it missed the case where the cluster version is temporarily set to "3.0.0" before all members join.

{"level":"warn","ts":"2024-12-16T13:21:15.690381+0100","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"fd422379fda50e48","rtt":"0s","error":"dial tcp 127.0.0.1:32380: connect: connection refused"}
{"level":"warn","ts":"2024-12-16T13:21:15.690398+0100","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"fd422379fda50e48","rtt":"0s","error":"dial tcp 127.0.0.1:32380: connect: connection refused"}
{"level":"info","ts":"2024-12-16T13:21:16.082165+0100","caller":"version/monitor.go:116","msg":"cluster version differs from storage version.","cluster-version":"3.0.0","storage-version":"3.5.0"}
{"level":"error","ts":"2024-12-16T13:21:16.082221+0100","caller":"version/monitor.go:120","msg":"failed to update storage version","cluster-version":"3.0.0","error":"cannot create migration plan: version \"3.5.0\" is not supported","stacktrace":"go.etcd.io/etcd/server/v3/etcdserver/version.(*Monitor).UpdateStorageVersionIfNeeded\n\tgo.etcd.io/etcd/server/v3/etcdserver/version/monitor.go:120\ngo.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).monitorStorageVersion\n\tgo.etcd.io/etcd/server/v3/etcdserver/server.go:2295\ngo.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).GoAttach.func1\n\tgo.etcd.io/etcd/server/v3/etcdserver/server.go:2476"}
{"level":"info","ts":"2024-12-16T13:21:20.083213+0100","caller":"version/monitor.go:116","msg":"cluster version differs from storage version.","cluster-version":"3.0.0","storage-version":"3.5.0"}
{"level":"error","ts":"2024-12-16T13:21:20.083282+0100","caller":"version/monitor.go:120","msg":"failed to update storage version","cluster-version":"3.0.0","error":"cannot create migration plan: version \"3.5.0\" is not supported","stacktrace":"go.etcd.io/etcd/server/v3/etcdserver/version.(*Monitor).UpdateStorageVersionIfNeeded\n\tgo.etcd.io/etcd/server/v3/etcdserver/version/monitor.go:120\ngo.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).monitorStorageVersion\n\tgo.etcd.io/etcd/server/v3/etcdserver/server.go:2295\ngo.etcd.io/etcd/server/v3/etcdserver.(*EtcdServer).GoAttach.func1\n\tgo.etcd.io/etcd/server/v3/etcdserver/server.go:2476"}
{"level":"warn","ts":"2024-12-16T13:21:20.691121+0100","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_RAFT_MESSAGE","remote-peer-id":"fd422379fda50e48","rtt":"0s","error":"dial tcp 127.0.0.1:32380: connect: connection refused"}
{"level":"warn","ts":"2024-12-16T13:21:20.691137+0100","caller":"rafthttp/probing_status.go:68","msg":"prober detected unhealthy status","round-tripper-name":"ROUND_TRIPPER_SNAPSHOT","remote-peer-id":"fd422379fda50e48","rtt":"0s","error":"dial tcp 127.0.0.1:32380: connect: connection refused"}
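
The guard described above can be sketched roughly as follows. This is an illustrative sketch only, not the actual patch: `Version`, `defaultClusterVersion`, and `shouldUpdateStorageVersion` are hypothetical names, and the only assumption taken from the description is that "3.0.0" is the temporary value reported before all members join.

```go
package main

import "fmt"

// Version is a minimal stand-in for a semantic version. It is a hypothetical
// type used to keep this sketch self-contained (no external semver library).
type Version struct {
	Major, Minor, Patch int
}

// String renders the version as "major.minor.patch".
func (v Version) String() string {
	return fmt.Sprintf("%d.%d.%d", v.Major, v.Minor, v.Patch)
}

// defaultClusterVersion is the placeholder value that, per the description,
// the cluster reports temporarily before all members have joined.
// The variable name is an assumption made for this sketch.
var defaultClusterVersion = Version{Major: 3, Minor: 0, Patch: 0}

// shouldUpdateStorageVersion reports whether a storage version update should
// be attempted. It returns false while the cluster version is unset (nil) or
// still equal to the temporary default, so no migration plan is attempted
// (and no error is logged) during bootstrap.
func shouldUpdateStorageVersion(clusterVersion *Version) bool {
	if clusterVersion == nil {
		return false // cluster version not decided yet
	}
	if *clusterVersion == defaultClusterVersion {
		return false // still the temporary bootstrap value
	}
	return true
}
```

With a check like this, a caller would simply return early and retry on the next monitoring tick instead of logging "failed to update storage version" while the cluster version is still "3.0.0".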

/cc @ahrtr

…luster is bootstrapping.

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>

codecov bot commented Dec 16, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 68.79%. Comparing base (3241803) to head (e7df5bf).

Additional details and impacted files
Files with missing lines Coverage Δ
server/etcdserver/version/monitor.go 96.69% <100.00%> (ø)

... and 24 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #19068      +/-   ##
==========================================
+ Coverage   68.75%   68.79%   +0.03%     
==========================================
  Files         420      420              
  Lines       35626    35626              
==========================================
+ Hits        24495    24508      +13     
+ Misses       9712     9694      -18     
- Partials     1419     1424       +5     


Member

@ahrtr ahrtr left a comment


LGTM

Then we can revert #19060

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ahrtr, serathius


@serathius
Member Author

> Then we can revert #19060

Are we sure those errors come from setting the storage version? I wanted to reproduce it and confirm.

@serathius serathius merged commit 3acf3e5 into etcd-io:main Dec 16, 2024
35 checks passed
@ahrtr
Member

ahrtr commented Dec 16, 2024

> Are we sure those errors are coming from setting storage version?

Yes, pretty sure. Let's merge #19069 and keep watching the test failures/error messages.
