Rounding of percentiles (premature optimization?) #255
Hi! It's not that sending the integer 58000 takes up less space than 58459. We construct a dict whose keys are response times and whose values are the number of requests completed in that response time. Say we have a test whose response times range between 200 ms and 20k ms. If you run that test long enough without the rounding, you end up with a dict containing almost 20k keys, and dicts with at least a couple of thousand keys would be sent from each slave node to the master node every third second (and you might have many hundreds of slave nodes). Compare this to the current version (with rounding), where the dict wouldn't have more than about 310 keys. Also, I can't see any load testing scenario where the precision lost to the rounding would matter.
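The effect on dict size can be sketched like this (my own illustration of the rounding described above, not Locust's actual code; `bucket` is a hypothetical helper that keeps two significant figures):

```python
import random

def bucket(t_ms: int) -> int:
    """Round a response time to two significant figures, e.g.
    58459 -> 58000, 1234 -> 1200. Illustrative sketch only."""
    if t_ms < 100:
        return t_ms
    return int(round(t_ms, -(len(str(t_ms)) - 2)))

# Simulate response times between 200 ms and 20k ms.
times = [random.randint(200, 20000) for _ in range(100_000)]

raw = {}
rounded = {}
for t in times:
    raw[t] = raw.get(t, 0) + 1
    rounded[bucket(t)] = rounded.get(bucket(t), 0) + 1

# The raw dict approaches ~20k keys; the rounded one stays small
# (at most a couple of hundred for this range).
print(len(raw), len(rounded))
```

The rounded dict is what each slave would serialize and send to the master every few seconds, so the key count is the quantity that matters for bandwidth.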
OK, I see how that reduces the size of the dict, though it raises further questions (for another time). Perhaps the reporting of percentiles should "x" out the rounded digits to demonstrate more clearly that the values have been rounded? I first started looking at this when I saw all my stats rounded.
The reasoning behind it is that ±10 ms rounding is OK for requests between 100 ms and 1k ms, ±100 ms is OK for requests between 1k and 10k ms, and so on. There might be cases where you would benefit from knowing that the 95th percentile is exactly 58459 ms rather than somewhere between 57500 and 58499, but if they exist they should be extremely rare. I don't think we should "x" out the zeroes (since that might complicate further processing of the data), but it would probably be a good idea to document this behaviour better.
I am not sure that the histogram bucketing approach is necessarily a problem, although there are probably ways to produce a fully accurate report at the end without streaming all the data to the master node during test execution. I also think the UI should communicate some kind of accuracy estimate, such as "±X" for some value X. Specifically, I think the target of opportunity is the logic in calculate_response_time_percentile (line 77 in bdf5943), which rounds the latency up to the bucket bound rather than interpolating. The idea is that, mathematically speaking, there's a difference between the rank of the request at the percentile and the cumulative number of requests in the buckets. If the difference is 0% of the last bucket, then the bucket bound is indeed the value. However, if the difference is 99% of the last bucket, the value should be far closer to the previous bucket bound.
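The interpolation idea can be sketched as follows (my own illustration, not Locust's `calculate_response_time_percentile`; the histogram shape `{bucket_upper_bound: count}` is an assumption):

```python
def percentile_interpolated(buckets: dict, percentile: float) -> float:
    """Estimate a percentile from a {bucket_upper_bound: count} histogram,
    interpolating linearly inside the bucket that contains the target rank
    instead of snapping to the bucket's upper bound."""
    total = sum(buckets.values())
    target = percentile / 100.0 * total
    cumulative = 0
    prev_bound = 0
    for bound in sorted(buckets):
        count = buckets[bound]
        if cumulative + count >= target:
            # Fraction of the target rank that falls inside this bucket
            # determines how far we move from the previous bound.
            frac = (target - cumulative) / count
            return prev_bound + frac * (bound - prev_bound)
        cumulative += count
        prev_bound = bound
    return float(prev_bound)
```

For example, with 50 requests in the (0, 100] bucket and 50 in the (100, 200] bucket, the 75th percentile lands halfway through the second bucket and comes out as 150 rather than 200.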
You can see an example of this in Prometheus' code. You may also be interested in Micrometer's approach to choosing the bucket values, in this class: https://github.com/micrometer-metrics/micrometer/blob/main/micrometer-core/src/main/java/io/micrometer/core/instrument/distribution/PercentileHistogramBuckets.java

However, Locust has no particular need to use hard-coded histogram buckets. That being the case, it might make more sense to use a more accurate summary mechanism such as http://hdrhistogram.org/
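For comparison, here is a minimal sketch (my own, not taken from Micrometer or HdrHistogram) of generating bucket bounds that grow geometrically, so the worst-case relative error per bucket stays roughly constant across the whole range:

```python
def log_buckets(min_ms: int, max_ms: int, growth: float = 1.1) -> list:
    """Generate strictly increasing bucket upper bounds from min_ms to
    max_ms, each about `growth` times the last, so every bucket has a
    bounded relative width (~10% here) regardless of magnitude."""
    bounds = []
    b = float(min_ms)
    while b < max_ms:
        v = int(round(b))
        if not bounds or v > bounds[-1]:
            bounds.append(v)
        b *= growth
    if not bounds or bounds[-1] != max_ms:
        bounds.append(max_ms)
    return bounds
```

With `log_buckets(100, 20000)` this covers the 100 ms to 20 s range in well under a hundred buckets, versus the ~20k keys an exact per-millisecond dict would need.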
Hi @rehevkor5! That sounds like a nice improvement. I've always felt that the percentiles are not accurate enough. Do you have time to look into implementing it?
Cool. I will put it on my list but I can't make any guarantees :) |
I noticed that all my percentile values appear to be rounded. On investigation, it's deliberate (commit 7f0b2d6) for bandwidth reasons.
Is it true that sending 58000 is more efficient than sending 58459? Is it true enough to warrant discarding this much data?