Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bump golang version to 1.19.1 #14463

Merged
merged 4 commits into from
Sep 22, 2022
Merged

Bump golang version to 1.19.1 #14463

merged 4 commits into from
Sep 22, 2022

Conversation

ahrtr
Copy link
Member

@ahrtr ahrtr commented Sep 14, 2022

Kubernetes already upgraded to 1.19 since release-1.25, so etcd should upgrade to 1.19 as well.

Signed-off-by: Benjamin Wang wachao@vmware.com

Please read https://github.com/etcd-io/etcd/blob/main/CONTRIBUTING.md#contribution-flow.

@ahrtr ahrtr marked this pull request as draft September 14, 2022 11:00
@ahrtr
Copy link
Member Author

ahrtr commented Sep 14, 2022

Related to #14449

@ahrtr ahrtr force-pushed the bump_go_1.19 branch 3 times, most recently from 868466b to 7578ecc Compare September 15, 2022 03:10
@codecov-commenter
Copy link

codecov-commenter commented Sep 15, 2022

Codecov Report

Merging #14463 (e398e46) into main (6333f37) will decrease coverage by 0.15%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main   #14463      +/-   ##
==========================================
- Coverage   75.35%   75.20%   -0.16%     
==========================================
  Files         457      457              
  Lines       37258    37258              
==========================================
- Hits        28077    28019      -58     
- Misses       7413     7459      +46     
- Partials     1768     1780      +12     
Flag Coverage Δ
all 75.20% <ø> (-0.16%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
client/v3/internal/endpoint/endpoint.go 87.50% <ø> (ø)
client/v3/txn.go 100.00% <ø> (ø)
etcdctl/main.go 40.00% <ø> (ø)
pkg/adt/interval_tree.go 87.21% <ø> (ø)
raft/confchange/confchange.go 82.06% <ø> (ø)
server/etcdserver/raft.go 88.29% <ø> (ø)
server/main.go 0.00% <ø> (ø)
server/storage/mvcc/key_index.go 64.16% <ø> (ø)
server/storage/mvcc/watchable_store.go 88.76% <ø> (-1.09%) ⬇️
client/pkg/v3/tlsutil/tlsutil.go 83.33% <0.00%> (-8.34%) ⬇️
... and 19 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@ahrtr ahrtr marked this pull request as ready for review September 15, 2022 08:21
@ahrtr
Copy link
Member Author

ahrtr commented Sep 15, 2022

This PR is ready for review. The Release failure is because it always run Release on main branch, which still requires go 1.17.8. I believe the issue should disappear after merging this PR.

I suggest to review this PR commit by commit. cc @ptabor @serathius @spzala @mitake

@ahrtr ahrtr force-pushed the bump_go_1.19 branch 2 times, most recently from 9192a96 to 29bb9a3 Compare September 18, 2022 05:27
@chaochn47
Copy link
Member

Kubernetes 1.24 bumped version of golang it is compiled with to go1.18, which introduced significant changes to its garbage collection algorithm. As a result, we observed an increase in memory usage for kube-apiserver in larger an heavily loaded clusters up to ~25% (with the benefit of API call latencies drop by up to 10x on 99th percentiles). If the memory increase is not acceptable for you you can mitigate by setting GOGC env variable (for our tests using GOGC=63 brings memory usage back to original value, although the exact value may depend on usage patterns on your cluster). (kubernetes/kubernetes#108870, @dims)

ref. https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.24.md

This is not blocker for upgrading to 1.19.

However, I think it's something we should have some visibility into, e.g. comparing memory used between latest stale release and main branch. @ahrtr

For example, looks like etcdctl check datascale is a good candidate for evaluating the impact but it was added 4 years ago. /cc @spzala

// NewCheckDatascaleCommand returns the cobra command for "check datascale".
func NewCheckDatascaleCommand() *cobra.Command {
cmd := &cobra.Command{
Use: "datascale [options]",
Short: "Check the memory usage of holding data for different workloads on a given server endpoint.",
Long: "If no endpoint is provided, localhost will be used. If multiple endpoints are provided, first endpoint will be used.",
Run: newCheckDatascaleCommand,
}
cmd.Flags().StringVar(&checkDatascaleLoad, "load", "s", "The datascale check's workload model. Accepted workloads: s(small), m(medium), l(large), xl(xLarge)")
cmd.Flags().StringVar(&checkDatascalePrefix, "prefix", "/etcdctl-check-datascale/", "The prefix for writing the datascale check's keys.")
cmd.Flags().BoolVar(&autoCompact, "auto-compact", false, "Compact storage with last revision after test is finished.")
cmd.Flags().BoolVar(&autoDefrag, "auto-defrag", false, "Defragment storage after test is finished.")
return cmd
}

@ahrtr
Copy link
Member Author

ahrtr commented Sep 19, 2022

There are some memory model changes in go 1.18 & 1.19.

Go 1.18 includes non-heap sources of garbage collector work, so there may be some big change on the memory usage if there are lots of stack memory relative to heap memory. The workaround is to adjust GOGC. I don't think etcd has such issue, but of course, it's worthwhile to verify it.

Go 1.19 supports soft memory limit, and it should be a very good feature to resolve the OOM issue out of the box. Previously some companies resolves the OOM of go program built with go version < 1.19 by some workaround solutions, which are either non-portable or hard to understand & maintain. It's exactly the reason why go 1.19 introduce the soft memory limit. I believe it's also a good benefit for etcd, and there is no any action from etcd side because users limit the memory usage via env GOMEMLIMIT.

In summary, we need to bump go 1.19 anyway. But it's worthwhile to do some benchmark test, so that we can provide some guide & reference to users.

@ahrtr
Copy link
Member Author

ahrtr commented Sep 19, 2022

FYI. Kubernetes did similar test kubernetes/kubernetes#108357 (comment)

@ahrtr
Copy link
Member Author

ahrtr commented Sep 21, 2022

The performance result shows that etcd built with go1.19.1 is a little better than etcd built with 1.17.9.

Command

./etcd  --quota-backend-bytes=4300000000  
./benchmark txn-put --endpoints="http://127.0.0.1:2379" --clients=200 --conns=200 --key-space-size=4000000000 --key-size=128 --val-size=10240  --total=200000 --rate=40000

Linux server configuration

MemTotal:        8008956 kB

4 CPUs, each is Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz 

Simple summary

  1. The same CPU usage; both have about 20% idle CPU during the test;
  2. 2.32 GB resident_memory (etcd built with 1.17.9) vs 2.14 GB resident_memory (etcd built with 1.19.1)
  3. etcd built with 1.19.1 has a little better throughput, 5119 vs 5752

Result for etcd built with 1.17.9 on main branch

Summary:
  Total:	39.0681 secs.
  Slowest:	0.1801 secs.
  Fastest:	0.0041 secs.
  Average:	0.0386 secs.
  Stddev:	0.0200 secs.
  Requests/sec:	5119.2674

Response time histogram:
  0.0041 [1]	|
  0.0217 [32015]	|∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0393 [94517]	|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0569 [38175]	|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0745 [21987]	|∎∎∎∎∎∎∎∎∎
  0.0921 [9167]	|∎∎∎
  0.1097 [3185]	|∎
  0.1273 [739]	|
  0.1449 [117]	|
  0.1625 [50]	|
  0.1801 [47]	|

Latency distribution:
  10% in 0.0197 secs.
  25% in 0.0241 secs.
  50% in 0.0315 secs.
  75% in 0.0498 secs.
  90% in 0.0665 secs.
  95% in 0.0802 secs.
  99% in 0.1002 secs.
  99.9% in 0.1283 secs.
 # curl http://127.0.0.1:2379/metrics | grep memory
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  155k    0  155k    0     0  12.0M      0 --:--:-- --:--:-- --:--:-- 12.6M
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 2.323267584e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 8.417144832e+09
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19

Result for etcd built with 1.19.1 on bump_go_1.19 branch

Summary:
  Total:	34.7683 secs.
  Slowest:	0.1630 secs.
  Fastest:	0.0031 secs.
  Average:	0.0343 secs.
  Stddev:	0.0163 secs.
  Requests/sec:	5752.3596

Response time histogram:
  0.0031 [1]	|
  0.0191 [22397]	|∎∎∎∎∎∎∎∎
  0.0351 [108348]	|∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎∎
  0.0511 [34954]	|∎∎∎∎∎∎∎∎∎∎∎∎
  0.0671 [25549]	|∎∎∎∎∎∎∎∎∎
  0.0831 [6477]	|∎∎
  0.0990 [1683]	|
  0.1150 [495]	|
  0.1310 [70]	|
  0.1470 [5]	|
  0.1630 [21]	|

Latency distribution:
  10% in 0.0188 secs.
  25% in 0.0225 secs.
  50% in 0.0287 secs.
  75% in 0.0439 secs.
  90% in 0.0585 secs.
  95% in 0.0658 secs.
  99% in 0.0849 secs.
  99.9% in 0.1082 secs.
# curl http://127.0.0.1:2379/metrics | grep memory
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  155k    0  155k    0     0  12.5M      0 --:--:-- --:--:-- --:--:-- 13.7M
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 2.142928896e+09
# HELP process_virtual_memory_bytes Virtual memory size in bytes.
# TYPE process_virtual_memory_bytes gauge
process_virtual_memory_bytes 8.49461248e+09
# HELP process_virtual_memory_max_bytes Maximum amount of virtual memory available in bytes.
# TYPE process_virtual_memory_max_bytes gauge
process_virtual_memory_max_bytes 1.8446744073709552e+19

@ahrtr
Copy link
Member Author

ahrtr commented Sep 21, 2022

@serathius @mitake @spzala @ptabor This PR should be ready to go. Please take a look.

Just as I mentioned in #14463 (comment), the benefit of go 1.19.1 is that users can resolve the OOM issue by setting the env GOMEMLIMIT, and there is no any action from etcd side. Of course, we might need to update the doc or changelog.

Copy link
Member

@serathius serathius left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, any expected breaking changes we should inform users in changelog?

For example there were some problematic changes between go 1.16 and 1.17

@ahrtr
Copy link
Member Author

ahrtr commented Sep 21, 2022

LGTM, any expected breaking changes we should inform users in changelog?

For example there were some problematic changes between go 1.16 and 1.17

No any breaking change for users, including applications which depend on the etcd packages and the etcd binaries or image users. Go 1.19 still keeps the Go 1 promise of compatibility

Signed-off-by: Benjamin Wang <wachao@vmware.com>
Signed-off-by: Benjamin Wang <wachao@vmware.com>
1. run ./scripts/fix.sh;
2. cd tools/mod; gofmt -w . & go mod tidy;

Signed-off-by: Benjamin Wang <wachao@vmware.com>
Some comments in the file are formatted automatically into ugly style,
because the hierarchical structure is missing. After removing the
leading numbers in the comments, `go fmt` will not format the comments
anymore.

Signed-off-by: Benjamin Wang <wachao@vmware.com>
@ahrtr
Copy link
Member Author

ahrtr commented Sep 22, 2022

Just rebased this PR. Will merge when the test green.

@ahrtr ahrtr merged commit 997260a into etcd-io:main Sep 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants