
Etcd reboot causes RSS to double #7822

Closed
davissp14 opened this issue Apr 26, 2017 · 10 comments

davissp14 commented Apr 26, 2017

Version: 3.1.5

I am doing some failure testing with Etcd and noticed that RSS appears to double in the event of a reboot and seems to settle after 5 minutes or so. My initial assumption was that a process was being forked and the RSS doubling was just due to copy-on-write. Is my assumption correct, or is there something else at play here?

This isn't a great image, I know, but the red line is RSS and orange is cache.
[screenshot: RSS (red) and cache (orange) during the reboot, 2017-04-18]

My second question is: what metrics should we be watching for scaling purposes? I initially thought that RSS would be good enough, but after a lot of testing it seems RSS is too spiky to be a reliable signal. I see quite a few situations where RSS quickly spikes to memory capacity and then drops quickly as pages are pushed out to swap.
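For reference, etcd's /metrics endpoint exposes the Go runtime's view of the heap alongside process RSS, which may be a steadier signal to scale on than raw RSS. Here is a minimal sketch of sampling both (the endpoint address is an assumption, and the metric names are the standard Go/process collector names):

package main

import (
	"bufio"
	"fmt"
	"net/http"
	"strings"
)

func main() {
	// Scrape one member's Prometheus metrics endpoint (address assumed).
	resp, err := http.Get("http://127.0.0.1:2379/metrics")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// OS-level RSS vs. the heap the Go runtime is actually using.
	wanted := map[string]bool{
		"process_resident_memory_bytes": true,
		"go_memstats_heap_inuse_bytes":  true,
	}
	sc := bufio.NewScanner(resp.Body)
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) == 2 && wanted[fields[0]] {
			fmt.Println(fields[0], "=", fields[1])
		}
	}
	if err := sc.Err(); err != nil {
		panic(err)
	}
}

Comparing process_resident_memory_bytes with go_memstats_heap_inuse_bytes makes it easier to tell a real working-set increase from memory the runtime is simply holding on to.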

Thanks

gyuho commented Apr 26, 2017

quite a few situations where RSS quickly spikes to memory capacity

How was etcd being stressed? Any details on the workloads?
Just curious, because I've never seen such spikes (cf. https://github.com/coreos/dbtester/tree/master/test-results/2017Q1-01-etcd-zookeeper-consul#write-1m-keys-256-byte-key-1kb-value-value-clients-1-to-1000)

davissp14 commented Apr 26, 2017

We are using the Etcd benchmark tool.

./benchmark --endpoints="$ENDPOINTS" --conns=100 --clients=1000 put --key-size=100 --key-space-size=10 --sequential-keys --total=2000000 --val-size=20

davissp14 commented Apr 26, 2017

Here's another less extreme example of RSS spiking.
[screenshot: RSS spike over time]

gyuho commented Apr 26, 2017

@davissp14 Thanks for that command. Let me try to profile that workload.
Is that a 3-node cluster?

davissp14 commented Apr 26, 2017

Is that a 3-node cluster?

Yep!

gyuho commented Apr 26, 2017

(pprof) list unsafeRange
Total: 161.04MB
ROUTINE ======================== github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend.unsafeRange in /home/gyuho/go/src/github.com/coreos/etcd/gopath/src/github.com/coreos/etcd/cmd/vendor/github.com/coreos/etcd/mvcc/backend/batch_tx.go
  103.48MB   103.48MB (flat, cum) 64.26% of Total
         .          .    102:	if limit <= 0 {
         .          .    103:		limit = math.MaxInt64
         .          .    104:	}
         .          .    105:	c := bucket.Cursor()
         .          .    106:	for ck, cv := c.Seek(key); ck != nil && bytes.Compare(ck, endKey) < 0; ck, cv = c.Next() {
   51.74MB    51.74MB    107:		vs = append(vs, cv)
   51.74MB    51.74MB    108:		keys = append(keys, ck)
         .          .    109:		if limit == int64(len(keys)) {
         .          .    110:			break
         .          .    111:		}
         .          .    112:	}
         .          .    113:	return keys, vs, nil
(pprof)

Confirmed that most memory usage comes from unsafeRange when restoring the boltdb kvstore (during reboot). I think it's just Go GC? After about 5 minutes this gets reclaimed and memory usage is back to normal (while rebooting, it spikes up to 700MB, then drops back to 400MB after ~5 minutes).
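For anyone reproducing this, the heap profile can be pulled over HTTP from a running member; this rough sketch assumes the member was started with profiling enabled (e.g. --enable-pprof) and is serving client traffic on 127.0.0.1:2379:

package main

import (
	"io"
	"net/http"
	"os"
)

func main() {
	// Fetch the current heap profile from the member's pprof handler
	// (assumes the server was started with --enable-pprof).
	resp, err := http.Get("http://127.0.0.1:2379/debug/pprof/heap")
	if err != nil {
		panic(err)
	}
	defer resp.Body.Close()

	// Save it for offline inspection with `go tool pprof`.
	out, err := os.Create("heap.pprof")
	if err != nil {
		panic(err)
	}
	defer out.Close()
	if _, err := io.Copy(out, resp.Body); err != nil {
		panic(err)
	}
}

Running go tool pprof against the etcd binary and heap.pprof, then `list unsafeRange`, produces a listing like the one above.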

/cc @heyitsanthony @xiang90
Any thoughts?

xiang90 commented Apr 26, 2017

Yeah, it is just Go GC.
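The ~5-minute settling time also lines up with how the runtime returns memory to the OS: after a collection the pages sit in the idle heap and are only released back by the background scavenger once they have gone unused for several minutes. A small standalone sketch (not etcd code) showing the difference:

package main

import (
	"fmt"
	"runtime"
	"runtime/debug"
)

func dump(label string) {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	fmt.Printf("%-20s heap-inuse=%3d MiB  heap-idle=%3d MiB  released=%3d MiB\n",
		label, m.HeapInuse>>20, m.HeapIdle>>20, m.HeapReleased>>20)
}

func main() {
	// Allocate ~200 MiB, then drop the reference.
	bufs := make([][]byte, 200)
	for i := range bufs {
		bufs[i] = make([]byte, 1<<20)
	}
	dump("allocated")

	bufs = nil
	runtime.GC()
	dump("after GC") // memory is free but still held by the runtime, so RSS barely moves

	// Hand idle pages back to the OS now instead of waiting for the scavenger.
	debug.FreeOSMemory()
	dump("after FreeOSMemory")
}

So the RSS that monitoring sees right after a reboot includes memory the GC has already freed but not yet returned to the OS.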

heyitsanthony commented Apr 26, 2017

There's a TODO in kvstore.go's restore code about loading all the keys into memory, which is probably why it's eating so much memory on boot.

xiang90 commented Apr 26, 2017

@heyitsanthony Yeah... I am not sure if this is worth fixing. This is the first time someone has brought it up.

davissp14 commented Apr 26, 2017

For additional context, I am trying to figure out a scaling scheme for Etcd on Compose. Right now in my testing, nodes are being scaled prematurely because of the RSS spikes during reboots, which isn't ideal. If it didn't take 5 minutes to settle it wouldn't be such an issue, but as it stands I don't have a very good scaling story using RSS. Any other suggestions on how to deal with this?

heyitsanthony pushed a commit to heyitsanthony/etcd that referenced this issue May 9, 2017
Loading all keys at once would cause etcd to use twice as much
memory than it would need to serve the keys, causing RSS to spike on
boot. Instead, load the keys into the mvcc by chunk. Uses pipelining
for some concurrency.

Fixes etcd-io#7822
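A rough sketch of the shape of that change, using hypothetical names rather than the real mvcc types: range over the backend a bounded chunk at a time and index each chunk while the next one is being read, so only one or two chunks of keys/values are live on the Go heap at once.

package main

// Hypothetical, simplified sketch of chunked restore with pipelining;
// names here do not match etcd's actual mvcc/backend code.

type revKV struct {
	key, val []byte
}

// rangeChunk stands in for a bounded range call against boltdb: it reads at
// most `limit` key/value pairs starting at `start` and returns the key to
// resume from (nil when the keyspace is exhausted).
func rangeChunk(start []byte, limit int64) (kvs []revKV, next []byte) {
	return nil, nil
}

func restoreChunked(chunkSize int64, index func([]revKV)) {
	kvc := make(chan []revKV, 1) // small buffer gives one chunk of pipelining

	// Producer: walk the keyspace one bounded chunk at a time.
	go func() {
		defer close(kvc)
		start := []byte{0}
		for start != nil {
			kvs, next := rangeChunk(start, chunkSize)
			if len(kvs) > 0 {
				kvc <- kvs
			}
			start = next
		}
	}()

	// Consumer: build the in-memory index chunk by chunk, so only a bounded
	// number of keys/values are held on the heap at any moment.
	for kvs := range kvc {
		index(kvs)
	}
}

func main() {
	restoreChunked(10000, func(kvs []revKV) { /* insert into the key index */ })
}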
heyitsanthony self-assigned this May 9, 2017
heyitsanthony pushed a commit to heyitsanthony/etcd that referenced this issue May 10, 2017

gyuho pushed a commit that referenced this issue Jun 1, 2017

yudai pushed a commit to yudai/etcd that referenced this issue Oct 5, 2017