
Add standard deviation to benchmarking #1604

Closed
wants to merge 8 commits into from

Conversation

@jimon commented Apr 18, 2019

Description

This PR adds standard deviation calculation to the benchmarking feature.
The deviation is useful when benchmarking OS-dependent things, for example how long it takes to create a thread. A normal benchmark only returns a mean value, e.g. 100 us, but the deviation also gives a sigma, e.g. 20 us. Via the 68–95–99.7 rule we can conclude that ~95% of thread creations will complete within [100 - 2 * 20, 100 + 2 * 20] = [60, 140] us.

Please let me know how I can improve it :)

@horenmar (Member)

Quick note before I start going through the commit -- please ensure that when using the C stdlib, the includes and namespacing align. In other words, you should qualify the call as std::sqrt.

Not doing so breaks VxWorks and QNX.

codecov bot commented Apr 18, 2019

Codecov Report

Merging #1604 into master will decrease coverage by 0.42%.
The diff coverage is 0%.

@@            Coverage Diff            @@
##           master   #1604      +/-   ##
=========================================
- Coverage   80.72%   80.3%   -0.42%     
=========================================
  Files         121     121              
  Lines        3424    3442      +18     
=========================================
  Hits         2764    2764              
- Misses        660     678      +18

codecov bot commented Apr 18, 2019

Codecov Report

Merging #1604 into master will decrease coverage by 0.24%.
The diff coverage is 0%.

@@            Coverage Diff             @@
##           master    #1604      +/-   ##
==========================================
- Coverage   80.72%   80.49%   -0.24%     
==========================================
  Files         121      121              
  Lines        3424     3434      +10     
==========================================
  Hits         2764     2764              
- Misses        660      670      +10

@jimon (Author) commented Apr 18, 2019

@horenmar good catch, fixed it.

@horenmar (Member)

Okay, so I went through the changes and I think the PR needs to be done a bit differently.

I think the end state should be that there is still just one BenchmarkLooper type, and one macro to use it, BENCHMARK, with the number of samples to get for the stddev calculation being an optional argument to it.

@horenmar (Member)

I added a commit that shows the desired interface and usage 😃

Also the resulting output will need to be able to differentiate between the number of measurements and the number of iterations in a single measurement, so that will need some extra changes in the reporter and the BenchmarkInfo/BenchmarkStats struct.

@jimon (Author) commented Apr 20, 2019

@horenmar I see your point, since I also started by just changing the base looper :) There are two related points I would like to discuss:

  1. If the precision of the timer is too low and we go into 10x+ loops, then I think the deviation might not make much sense, as we get average smoothing of the sample times. My understanding of the math says that the deviation we get this way will be lower than the actual value, which might not be good.
  2. The current benchmark looper has a very tight loop; if I start adding more branches or measuring the sample time on every iteration, I'm afraid I'll degrade its ability to make precise measurements.

What do you think about these two? How should we solve them?

@horenmar (Member)

The deviation can still be measured properly if it is done right. Let's say there are 10 iterations per measurement, and the time taken for each measurement is T1, T2, ..., Tn. Ta is then the average time of a single iteration, that is, the sum over Ti divided by N * 10. The variance is then the sum over all iterations of (Ti/10 - Ta)^2 / (N * 10).

@jimon (Author) commented Apr 23, 2019

Ah, you already fixed everything :) (long weekend here in Sweden :) )

I still don't think we can call it sigma when the timer resolution is insufficient, because the value we get is smaller than the value we would get with an infinitely precise timer. Let me explain it like this:

  1. Some test code, an infinitely precise timer, 10 samples. We measure 5 ns, 4 ns, 6 ns, ... and calculate sigma from these values. This is as precise as we can get.

  2. The same test code, a less precise timer, 10 samples, 10 iterations per sample. We actually run the code under test 100 times, and the real times that code takes would be 5 ns, 4 ns, 6 ns, ... (100 values), but what we measure is 50 ns, 51 ns, 49 ns, ... (10 values). Each measurement is the time it took to run the code under test 10 times; we don't have the data to calculate the variance of individual runs. So while the sigma numbers we get are "close enough", it would be incorrect to call the result the stddev sigma, as it will be <= the true sigma.

@horenmar (Member)

On further inspection I agree, but I also think I am going to close this PR in favour of #1616, which will bring in fully featured benchmarking and is moving forward faster than I originally expected.

@horenmar closed this May 26, 2019