Ideas from Jon Gjengset's talk "Towards Impeccable Rust" at Rust Nation UK 2024 on Wednesday 27th March 2024:
Know thyself
Benchmarks should capture the entire performance profile
That means:
pathological cases
micro... and macro
under, at, and over capacity
on all relevant targets (for me: Windows, macOS, local disks, Google Cloud Storage, S3, Azure, etc.)
Are your benchmarks useful?
Don't measure time. Instead, measure instruction counts. The iai-callgrind crate is useful here, although Valgrind can slow things down a lot.
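A minimal sketch of an iai-callgrind library benchmark (the `fibonacci` function and benchmark names are illustrative, and the macro API may differ between crate versions); the harness runs the code under Callgrind and reports instruction counts rather than wall-clock time:

```rust
use iai_callgrind::{library_benchmark, library_benchmark_group, main};
use std::hint::black_box;

// Hypothetical function under test.
fn fibonacci(n: u64) -> u64 {
    match n {
        0 | 1 => 1,
        n => fibonacci(n - 1) + fibonacci(n - 2),
    }
}

#[library_benchmark]
#[bench::short(10)]
#[bench::long(20)]
fn bench_fibonacci(n: u64) -> u64 {
    black_box(fibonacci(n))
}

library_benchmark_group!(name = fib_group; benchmarks = bench_fibonacci);
main!(library_benchmark_groups = fib_group);
```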
Another trick is to run the benchmark against the main branch and then immediately against the PR branch; if the machine happens to be running slowly today, both runs are affected equally, which reduces noise. The tango crate (tango-bench) helps with this paired approach.
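A sketch based on the tango-bench README (the `factorial` function is illustrative, and exact macro names may vary by version); each benchmark is registered once, and tango then runs the baseline and candidate binaries against each other in paired fashion:

```rust
use std::hint::black_box;
use tango_bench::{benchmark_fn, tango_benchmarks, tango_main, IntoBenchmarks};

// Hypothetical function under test.
fn factorial(n: usize) -> usize {
    (1..=n).fold(1usize, |acc, i| acc.wrapping_mul(black_box(i)))
}

fn factorial_benchmarks() -> impl IntoBenchmarks {
    [benchmark_fn("factorial_500", |b| b.iter(|| factorial(500)))]
}

tango_benchmarks!(factorial_benchmarks());
tango_main!();
```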
Minimise measurement noise: ideally, benchmark on a dedicated host when it's not under load.
Don't just measure speed. Measure everything that matters:
"goodput" (the throughput of useful work... e.g. not just returning error messages!)
Memory usage (average and peak; see the allocator sketch after this list)
Latency
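A minimal sketch of one way to capture peak heap usage from inside a Rust program, by wrapping the system allocator (the `TrackingAlloc` type and the statics are my own illustration, not from the talk):

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

static CURRENT: AtomicUsize = AtomicUsize::new(0);
static PEAK: AtomicUsize = AtomicUsize::new(0);

struct TrackingAlloc;

unsafe impl GlobalAlloc for TrackingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Track the running total and record a new high-water mark.
        let now = CURRENT.fetch_add(layout.size(), Ordering::Relaxed) + layout.size();
        PEAK.fetch_max(now, Ordering::Relaxed);
        System.alloc(layout)
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        CURRENT.fetch_sub(layout.size(), Ordering::Relaxed);
        System.dealloc(ptr, layout);
    }
}

#[global_allocator]
static ALLOC: TrackingAlloc = TrackingAlloc;

fn main() {
    let v: Vec<u64> = (0..1_000_000).collect();
    drop(v);
    println!("peak heap usage: {} bytes", PEAK.load(Ordering::Relaxed));
}
```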
Outcomes:
Simulate real-life inputs
Measure system outputs
Compare to ground truth (mostly relevant for ML projects)
Simple benchmarks lie, a.k.a. "I ran it in a loop and got a bigger number".
How you benchmark matters: open vs closed vs partly-open loop load generation, with representative workloads. (A closed-loop client waits for each response before sending the next request; an open-loop client sends on a fixed schedule regardless; see the sketch below.)
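A toy sketch contrasting the two models (`handle_request` and the timings are made up): in a closed loop the offered load shrinks when the system slows down, which can hide latency problems that an open loop would expose:

```rust
use std::thread;
use std::time::{Duration, Instant};

// Stand-in for the system under test.
fn handle_request() {
    thread::sleep(Duration::from_millis(5));
}

// Closed loop: issue the next request only after the previous one
// completes, so offered load adapts to (and can mask) slowdowns.
fn closed_loop(requests: usize) {
    for _ in 0..requests {
        handle_request();
    }
}

// Open loop: issue requests on a fixed schedule whether or not
// earlier ones have finished, like independent real-world clients.
fn open_loop(requests: usize, interval: Duration) {
    let start = Instant::now();
    let mut workers = Vec::new();
    for i in 0..requests {
        let due = start + interval * i as u32;
        if let Some(wait) = due.checked_duration_since(Instant::now()) {
            thread::sleep(wait);
        }
        workers.push(thread::spawn(handle_request));
    }
    for w in workers {
        w.join().unwrap();
    }
}

fn main() {
    closed_loop(10);
    open_loop(10, Duration::from_millis(2));
}
```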
What you record matters: mean, median, histogram, CDF. criterion.rs already does a lot of this for us.
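For reference, a minimal criterion.rs benchmark (`parse_line` is a placeholder for the real code under test); criterion reports mean, median, and confidence intervals, and can plot distributions when a plotting backend is available:

```rust
use criterion::{criterion_group, criterion_main, Criterion};
use std::hint::black_box;

// Placeholder for the real code under test.
fn parse_line(s: &str) -> usize {
    s.split(',').count()
}

fn bench_parse(c: &mut Criterion) {
    c.bench_function("parse_line", |b| {
        b.iter(|| parse_line(black_box("a,b,c,d,e")))
    });
}

criterion_group!(benches, bench_parse);
criterion_main!(benches);
```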