This repository has been archived by the owner on Aug 23, 2023. It is now read-only.

"standard" benchmark #440

Closed
Dieterbe opened this issue Dec 27, 2016 · 2 comments

Comments

@Dieterbe
Contributor

Dieterbe commented Dec 27, 2016

some thoughts:

  • chunkspan of 2 minutes, so that we persist to cassandra every 2 minutes; if we benchmark for a few minutes, it is then "fair" that persisting is always incorporated (but we must check that the queues can drain)
  • per-second data, so that chunks contain 120 points, which is a good number.
  • vegeta must return 100% OKs. MT must not log any errors
  • no aggregations, unless the benchmark needs to verify something aggregation-specific, because aggregates are saved at longer chunkspans, which would make comparing results harder. plus, this makes it easier to run a benchmark with aggregations enabled and see their effect.
  • stats and profiletrigger can be set to 1s, since this is what we recommend for prod.
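concretely, the knobs above might look something like the config fragment below. `chunkspan` and `gc-interval` are the names used in this thread; the stats and profiletrigger key names are assumptions and may not match the actual metrictank config file:

```ini
# hypothetical metrictank config fragment for the "standard" benchmark
chunkspan = 2min      ; persist to cassandra every 2 minutes (120 points/chunk at 1s resolution)
gc-interval = 1min    ; make the GC routine run a fixed number of times per bench
; key names below are assumed, adjust to the real config
stats-interval = 1s
proftrigger-freq = 1s
```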

we should take snapshots of the standard MT dashboard plus a cpu usage graph via collectd/snap,
so we can easily compare cpu, memory, golang GC, and vegeta output

the other remaining factor that can now randomly be part of a bench run, that I'm aware of, is our own GC routine which frees up metrics. we could set gc-interval to 0 to disable it (but that's not realistic). looking at the code, this currently always runs at 1 min past whatever clean interval the setting implies. this means that to incorporate it, we could set the interval to a minute.

if we run benchmarks for exactly 2 minutes, then the number of persist runs as well as runs of our GC routine should be fixed, though they may happen at different times during the benchmark depending on when we started it; I don't think this should be a big problem.
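a quick sketch of why this holds: if the benchmark window is a whole multiple of a task's interval, the number of ticks falling inside the window is fixed no matter what the start phase is. `ticksInWindow` below is a hypothetical helper for checking this, not metrictank code:

```go
package main

import "fmt"

// ticksInWindow counts how many ticks of a periodic task (firing at
// t = 0, interval, 2*interval, ... seconds) fall inside the half-open
// benchmark window [start, start+window).
func ticksInWindow(interval, start, window int) int {
	count := 0
	// first tick at or after the benchmark start
	first := ((start + interval - 1) / interval) * interval
	for t := first; t < start+window; t += interval {
		count++
	}
	return count
}

func main() {
	// a 120s benchmark window: a 60s gc-interval always gives 2 runs,
	// a 120s chunkspan-driven persist always gives 1, whatever the phase.
	for _, start := range []int{0, 17, 59, 61, 119} {
		fmt.Printf("start=%3ds  gc runs=%d  persist runs=%d\n",
			start, ticksInWindow(60, start, 120), ticksInWindow(120, start, 120))
	}
}
```

this also shows that a window that is *not* a multiple of the interval would make the run count phase-dependent, which is exactly the randomness we want to avoid.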

@replay
Contributor

replay commented Dec 27, 2016

Having a standardized benchmark sounds great, and that all makes sense. What concerns me at the moment is that, looking at the chunk cache, there are quite a few more factors that matter.
For example: is the queried data bigger than the chunk cache size? If so, metrictank needs to evict entries from the cache, which might slow the whole cache down because of locks.
Or: for a realistic workload, shouldn't we from time to time query metrics that have not been queried (and cached) yet? I'm not sure what's the best way to add this factor to a standardized test.

@Dieterbe Dieterbe added the perf label Jun 28, 2017
@Dieterbe Dieterbe mentioned this issue Oct 1, 2017
@stale

stale bot commented Apr 4, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the stale label Apr 4, 2020
@stale stale bot closed this as completed Apr 11, 2020