
Question: Best practices for automating compaction / defrag #7607

Closed
davissp14 opened this issue Mar 27, 2017 · 10 comments

davissp14 commented Mar 27, 2017

Reading through: https://github.com/coreos/etcd/blob/master/Documentation/op-guide/maintenance.md

Looking at compaction:

> Since etcd keeps an exact history of its keyspace, this history should be periodically compacted to avoid performance degradation and eventual storage space exhaustion.

What metrics should we be using to determine when compaction is necessary or at the very least a good idea?

> After compacting the keyspace, the backend database may exhibit internal fragmentation. Any internal fragmentation is space that is free to use by the backend but still consumes storage space. The process of defragmentation releases this storage space back to the file system. Defragmentation is issued on a per-member basis so that cluster-wide latency spikes may be avoided.

What metrics should be used to monitor fragmentation? Initially I assumed that monitoring HeapAlloc and HeapInuse would get me close, but it seems I was mistaken.

Any thoughts or advice?

Thanks in advance.


xiang90 commented Mar 27, 2017

> What metrics should we be using to determine when compaction is necessary or at the very least a good idea?

It depends on your application. If your application is OK with keeping an hour of history, compact every hour; the shorter the retention window, the better. The limiting factor is the total db size, though: do not let it grow beyond 2GB if you use a normal cloud machine.
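A minimal sketch of automating this with `etcdctl` (assuming etcd v3.1+, an `$ENDPOINTS` variable like the one in the benchmark below, and that grepping the JSON status output is good enough for a cron job):

```sh
# Fetch the current revision from the cluster, then compact all
# history older than that revision.
rev=$(ETCDCTL_API=3 etcdctl --endpoints=$ENDPOINTS endpoint status --write-out=json \
  | grep -oE '"revision":[0-9]+' | head -1 | grep -oE '[0-9]+')
ETCDCTL_API=3 etcdctl --endpoints=$ENDPOINTS compact "$rev"
```

Recent releases can also do this internally via the `--auto-compaction-retention` flag (hours of history to retain), which avoids the external cron job.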

> What metrics should be used to monitor fragmentation?

This is actually about disk fragmentation. Unless you suddenly remove a lot of keys and want to reclaim the disk space immediately, you do not need to defrag.
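When a defrag is warranted, the maintenance guide's per-member advice can be scripted along these lines (a sketch, again assuming a comma-separated `$ENDPOINTS` list):

```sh
# Defragment members one at a time: defrag blocks the member while
# it rewrites its db file, so never hit the whole cluster at once.
for ep in $(echo "$ENDPOINTS" | tr ',' ' '); do
  ETCDCTL_API=3 etcdctl --endpoints="$ep" defrag
done
```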


davissp14 commented Mar 27, 2017

Information below pertains to version 3.1.4 using API v3.

I went ahead and ran a few benchmarks to see if I could get a better understanding of what's going on.

After seeding some data, I ran a compact / defrag. I know you mentioned that defrag is a disk fragmentation thing, but it also seems to reclaim the cache along with it.
[screenshot from 2017-03-26 18-58-20]

Normally I wouldn't really care about the cache, since it should be freed under memory pressure, but in these benchmarks we are seeing quite a few failcnts starting at the time the cache fills up. It appears that the cache isn't being freed very efficiently.

Note: This is a separate benchmark.
[screenshot from 2017-03-27 07-10-25]

Failcnts associated with the above benchmark.
[screenshot from 2017-03-27 07-10-53]

It appears we are seeing a negative performance impact due to the aggressive caching. It also doesn't seem like constantly defragging is a great solution to this problem.

Any thoughts?


xiang90 commented Mar 27, 2017

> It appears we are seeing a negative performance impact due to the aggressive caching. It also doesn't seem like constantly defragging is a great solution to this problem.

Can you share the benchmark result?


davissp14 commented Mar 27, 2017

Benchmark used (5M sequential puts over a 10-key space, so nearly all of the resulting db growth is MVCC history):

```sh
./benchmark --endpoints=$ENDPOINTS --conns=100 --clients=1000 \
  put --key-size=100 --key-space-size=10 --sequential-keys \
  --total=5000000 --val-size=20 --user=root:$PASSWORD
```

[screenshot from 2017-03-27 10-43-38]


xiang90 commented Mar 27, 2017

How did you figure out that the cache size has an impact on the benchmark result? I want to see a graph of benchmark performance versus cache size. Also note that the more keys you put into etcd, the less throughput you might get if you run etcd on a slow HDD: the B-tree grows deeper, so more I/O will be needed.


davissp14 commented Mar 27, 2017

I guess I should clarify. etcd seems to quickly gobble up available memory for caching purposes. Once all available memory has been allocated to the cache, we start seeing a lot of cache evictions, which is expected. The result of the constant evictions, however, is failcnts and slower response times.

The results below are from a cluster that has been allocated just enough memory to avoid forced cache evictions when running the benchmark. I can work on creating a better comparison, but it may take a bit due to other obligations.

The AWS instance types I have been testing on are:
i2.4xlarge with ephemeral disks in RAID 0
r3.4xlarge with a single SSD

The only resource limitation I am enforcing via cgroups is memory. There doesn't seem to be any significant performance difference between the two instance types; both have pretty solid I/O, though.
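For reference, this is roughly how the failcnts are being read (a sketch assuming cgroup v1 and a memory cgroup named `etcd`; the path is hypothetical and depends on how the limit is set up):

```sh
# memory.failcnt counts how often the cgroup hit its memory limit;
# memory.stat splits usage into page cache vs. anonymous memory (rss).
cat /sys/fs/cgroup/memory/etcd/memory.failcnt
grep -E '^(cache|rss) ' /sys/fs/cgroup/memory/etcd/memory.stat
```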

[screenshot from 2017-03-27 12-26-28]


xiang90 commented Mar 28, 2017

@davissp14

The average latency is very high for 1.5k throughput. I am more interested in the comparison; I want to see the negative impact of the cache. The evicted memory should be the old tree nodes, and if you do sequential writes, the tree nodes should all be in memory (a few MB should be far more than enough), so I am surprised this has an observable impact.


gyuho commented Apr 20, 2018

We will document how to monitor this in #9438.

Let's move this discussion to #9438.
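In the meantime, a rough way to gauge fragmentation is to compare the db file's total size against its logical in-use size on the /metrics endpoint (a sketch; the `_in_use_` gauge only exists in newer releases, and 3.1/3.2 expose the total size under an `etcd_debugging_` prefix instead):

```sh
# The gap between total size and in-use size approximates the
# reclaimable (fragmented) space that a defrag would release.
curl -s http://127.0.0.1:2379/metrics \
  | grep -E '^etcd_mvcc_db_total_size(_in_use)?_in_bytes'
```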

regardfs commented

> What metrics should be used to monitor fragmentation?

> This is actually about disk fragmentation. Unless you suddenly remove a lot of keys and want to reclaim the disk space immediately, you do not need to defrag.

@xiang90 Does that just mean fragmentation will not influence the performance of the etcd cluster, only that no more disk space is freed?

regardfs commented

@gyuho Hi, what will happen if no defrag is ever run? Is the only effect that disk space is not freed, or does it also influence the performance of the etcd cluster?
