Rigorous benchmarking #119

Open
14 tasks
Tracked by #120 ...
JackKelly opened this issue Mar 27, 2024 · 0 comments
Labels
testing_benchmarking_CI Automatically ensuring the code behaves

JackKelly (Owner) commented Mar 27, 2024

Ideas from Jon Gjengset's talk "Towards Impeccable Rust" at Rust Nation UK 2024 on Wednesday 27th March 2024:

Know thyself
Benchmarks should capture the entire performance profile

That means:

  • pathological cases
  • micro ...
  • and macro
  • under, at, and over capacity (see the sketch after this list)
  • on all relevant targets (for me: Windows, macOS, local disks, Google Cloud Storage, S3, Azure, etc.)
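A minimal sketch of what "under, at, and over capacity" could look like as a Criterion benchmark group parameterised over input size. The `sum_bytes` function and the size values are hypothetical placeholders, not part of this repo; real benchmarks would substitute light-speed-io's actual I/O workloads:

```rust
use criterion::{criterion_group, criterion_main, BenchmarkId, Criterion};
use std::hint::black_box;

// Hypothetical function under test.
fn sum_bytes(buf: &[u8]) -> u64 {
    buf.iter().map(|&b| b as u64).sum()
}

fn bench_capacity(c: &mut Criterion) {
    let mut group = c.benchmark_group("sum_bytes");
    // Arbitrary stand-ins for "under", "at", and "over" capacity workloads.
    for size in [1_000usize, 1_000_000, 64_000_000] {
        let buf = vec![0u8; size];
        group.bench_with_input(BenchmarkId::from_parameter(size), &buf, |b, buf| {
            b.iter(|| sum_bytes(black_box(buf)))
        });
    }
    group.finish();
}

criterion_group!(benches, bench_capacity);
criterion_main!(benches);
```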

Are your benchmarks useful?

  • Don't measure wall-clock time. Instead, measure instruction counts. The iai-callgrind crate is useful here (see the sketch after this list), although running under Valgrind can slow things down a lot.
  • Another trick is to run the benchmark against the main branch and then immediately against the PR branch: if the machine happens to be running slowly that day, both runs are affected equally, so the noise largely cancels out. The tango crate helps with this paired approach.
  • Minimise measurement noise: ideally, benchmark on a dedicated host, when it's not under load.
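A minimal sketch of an instruction-count benchmark with iai-callgrind (run under Valgrind's Callgrind via `cargo bench`), roughly following the crate's documented structure. `parse_chunk` and the input sizes are hypothetical stand-ins, not part of this repo:

```rust
use iai_callgrind::{library_benchmark, library_benchmark_group, main};
use std::hint::black_box;

// Hypothetical function under test.
fn parse_chunk(buf: &[u8]) -> usize {
    buf.iter().filter(|&&b| b == b'\n').count()
}

#[library_benchmark]
#[bench::small(vec![b'\n'; 1_000])]
#[bench::large(vec![b'\n'; 1_000_000])]
fn bench_parse_chunk(buf: Vec<u8>) -> usize {
    black_box(parse_chunk(&buf))
}

library_benchmark_group!(
    name = parse_group;
    benchmarks = bench_parse_chunk
);

main!(library_benchmark_groups = parse_group);
```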

Don't just measure speed. Measure everything that matters. Also measure:

  • "goodput" (the throughput of useful work... e.g. not just returning error messages!)
  • Memory usage (avg and max)
  • Latency
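A rough sketch of the throughput-vs-goodput distinction: total requests per second versus only the requests that returned useful data. `do_read` is a hypothetical operation used purely for illustration:

```rust
use std::time::Instant;

// Hypothetical operation that may fail; only successes count toward goodput.
fn do_read(_i: usize) -> Result<usize, std::io::Error> {
    // Imagine a real I/O call here; simulated as always returning 4 KiB.
    Ok(4096)
}

fn main() {
    let n_requests = 10_000;
    let start = Instant::now();
    let mut ok = 0usize;
    let mut bytes_ok = 0usize;
    for i in 0..n_requests {
        if let Ok(n_bytes) = do_read(i) {
            ok += 1;
            bytes_ok += n_bytes;
        }
    }
    let elapsed = start.elapsed().as_secs_f64();
    // Throughput counts every request; goodput counts only useful (successful) work.
    println!("throughput: {:.0} req/s", n_requests as f64 / elapsed);
    println!("goodput:    {:.0} req/s ({} bytes of useful data)", ok as f64 / elapsed, bytes_ok);
}
```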

Outcomes:

  • Simulate real-life inputs
  • Measure system outputs
  • Compare to ground truth (mostly relevant for ML projects)

Simple benchmarks lie, a.k.a. "I ran it in a loop and got a bigger number".

  • How you benchmark matters: open-loop vs closed-loop vs partly-open workload models; pick whichever is representative of the real workload (see the sketch below).
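A toy sketch of the difference between closed-loop and open-loop load generation (partly-open sits in between: sessions arrive open-loop, then each issues a few closed-loop requests). `send_request` is a hypothetical stand-in for a real I/O call:

```rust
use std::thread;
use std::time::{Duration, Instant};

// Hypothetical request; sleeping stands in for real service time.
fn send_request() {
    thread::sleep(Duration::from_millis(5));
}

// Closed loop: the next request is only issued after the previous one completes,
// so the offered load adapts to how fast the system responds.
fn closed_loop(n: usize) {
    for _ in 0..n {
        send_request();
    }
}

// Open loop: requests arrive on a fixed schedule whether or not earlier ones
// have finished, so a slow system accumulates a backlog, as real users would create.
fn open_loop(n: usize, interarrival: Duration) {
    let start = Instant::now();
    let mut handles = Vec::new();
    for i in 0..n {
        let target = start + interarrival * i as u32;
        if let Some(wait) = target.checked_duration_since(Instant::now()) {
            thread::sleep(wait);
        }
        handles.push(thread::spawn(send_request));
    }
    for h in handles {
        h.join().unwrap();
    }
}

fn main() {
    closed_loop(100);
    open_loop(100, Duration::from_millis(2));
}
```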

What you record matters: mean, median, histogram, CDF. criterion.rs already does a lot of this for us.
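For cases criterion.rs doesn't cover (e.g. recording latencies from a macro-level run), a histogram of raw measurements makes the tail visible where a mean would hide it. A minimal sketch using the hdrhistogram crate; the bounds, units, and loop body are arbitrary choices for illustration:

```rust
use hdrhistogram::Histogram;
use std::time::Instant;

fn main() {
    // Record latencies in microseconds: 1 us .. 60 s, 3 significant digits.
    let mut hist = Histogram::<u64>::new_with_bounds(1, 60_000_000, 3).unwrap();

    for _ in 0..10_000 {
        let start = Instant::now();
        // ... hypothetical operation under test goes here ...
        let us = start.elapsed().as_micros() as u64;
        hist.record(us.max(1)).unwrap();
    }

    // A single mean hides tail latency; report the distribution instead.
    println!("mean:   {:.1} us", hist.mean());
    println!("median: {} us", hist.value_at_quantile(0.5));
    println!("p99:    {} us", hist.value_at_quantile(0.99));
    println!("max:    {} us", hist.max());
}
```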

Related

JackKelly added the testing_benchmarking_CI (Automatically ensuring the code behaves) label Mar 27, 2024
JackKelly moved this to Todo in light-speed-io Mar 27, 2024