fzhinkin changed the title from "Native benchmark: executor incorrectly estimates the number of repetitions within each measurement iteration" to "Native & WasmJs benchmark: executor incorrectly estimates the number of repetitions within each measurement iteration" on Mar 28, 2024.
During the warmup phase, the native benchmarks executor estimates the number of repetitions for each measurement iteration (i.e., the number of times the inner loop should invoke a benchmarked function).
Unfortunately, for short-running benchmarks, the machinery required for the warmup adds overhead that skews the average time per benchmark invocation. That causes the estimated repetition count to be too low, and as a result, the executor spends significantly less time in each measurement iteration than the user configured.
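To make the skew concrete, here is a minimal sketch of how warmup-based repetition estimation can work. This is illustrative only: the function name and structure are my own, it is JVM-flavored (`System.nanoTime`), and it does not reproduce the actual kotlinx-benchmark executor internals.

```kotlin
// Hypothetical sketch of warmup-based repetition estimation.
// Not the real kotlinx-benchmark code; names are illustrative.
fun estimateRepetitions(iterationTimeNs: Long, benchmark: () -> Unit): Long {
    var invocations = 0L
    var elapsedNs = 0L
    while (elapsedNs < iterationTimeNs) {
        val start = System.nanoTime()           // clock read before EVERY call...
        benchmark()
        elapsedNs += System.nanoTime() - start  // ...and after it
        invocations++
    }
    // For a very cheap benchmark, the clock reads dominate elapsedNs, so the
    // average is inflated and the returned repetition count comes out too low.
    val avgNs = maxOf(1L, elapsedNs / invocations)
    return iterationTimeNs / avgNs
}

fun main() {
    var sink = 0
    val reps = estimateRepetitions(5_000_000) { sink += 1 }  // 5 ms budget
    println("estimated repetitions: $reps")
}
```

Because every invocation is bracketed by two clock reads, a benchmark whose body is cheaper than a clock read gets an average dominated by timing overhead.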
Consider the following example (https://github.com/fzhinkin/kt-64361-benchmarks/blob/main/kmp-benchmarks/src/commonMain/kotlin/SignumBenchmarks.kt):
As you can see, the average time reported during warmup is about six times lower than what was measured later.
It's hard to convey in textual form, but if you run the benchmark locally and watch the log messages being printed, you'll notice that each measurement iteration takes significantly less time than a warmup iteration. In my case, the iteration duration was configured to 1 second: during warmup, each iteration did take roughly that long, but during measurement, each iteration finished almost instantly.
The problem is that the warmup phase queries the current timestamp after every single call to the benchmarked method. For long-running benchmarks, the difference is hard to notice, but for short-running benchmarks, this adds significant overhead.
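The effect of per-call clock reads can be demonstrated directly by comparing the two timing strategies on a trivially cheap operation. This is an illustrative, JVM-flavored comparison, not code from the actual executor; the function names are mine.

```kotlin
// Illustrative comparison (not actual kotlinx-benchmark code): summing
// per-call measured intervals vs timing one whole batch. For a very cheap
// function, the per-call clock reads dominate what gets measured.
var sink = 0  // keeps the trivial work from being optimized away

fun timedPerCall(n: Int): Long {
    var totalNs = 0L
    repeat(n) {
        val start = System.nanoTime()
        sink += 1
        totalNs += System.nanoTime() - start  // clock read after every call
    }
    return totalNs
}

fun timedBatched(n: Int): Long {
    val start = System.nanoTime()
    repeat(n) { sink += 1 }  // no clock reads inside the loop
    return System.nanoTime() - start
}

fun main() {
    val n = 1_000_000
    println("per-call: ${timedPerCall(n)} ns, batched: ${timedBatched(n)} ns")
}
```

On a typical machine, the per-call total is many times larger than the batched total for the same amount of real work, which is exactly the inflation described above.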
Since it may not be possible on all platforms to run the warmup on a separate thread that a timer could interrupt (or could we simply create a pthread?), perhaps we could adjust the procedure to check the elapsed time only once every N benchmark calls?
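The batched-check idea could be sketched like this. Again, this is a hypothetical, JVM-flavored illustration of the proposal, not the executor's actual code, and the names are mine.

```kotlin
// Sketch of the proposed adjustment (illustrative, not the real executor):
// check the elapsed time only once per batch of N calls, amortizing the
// clock overhead across the whole batch.
fun warmupWithBatchedTimeChecks(
    warmupTimeNs: Long,
    batchSize: Int,
    benchmark: () -> Unit,
): Long {
    val start = System.nanoTime()
    var invocations = 0L
    while (System.nanoTime() - start < warmupTimeNs) {
        repeat(batchSize) { benchmark() }  // no clock reads inside the batch
        invocations += batchSize
    }
    return invocations
}

fun main() {
    var sink = 0
    val calls = warmupWithBatchedTimeChecks(5_000_000, 1_000) { sink += 1 }
    println("benchmark invoked $calls times in ~5 ms of warmup")
}
```

The trade-off is granularity: the warmup can overshoot its time budget by up to one batch, so N would need to be chosen (or adapted) so a batch stays small relative to the configured iteration time.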