proposal: testing: add B method for adding stats #26037
Comments
On a second read, perhaps |
Not a float64? /cc @aclements |
I couldn't think of any non-integral metrics, and an int has the advantage of making it clear that the intended usage is to report a total count, not count/b.N. But I'm not opposed to a float64. We'd also want to make sure that downstream tools (like benchstat) are prepared to parse floats, on pain of ending up in a weird situation similar to testing.AllocsPerRun: |
Seems fine to me and @ianlancetaylor, @aclements, @rsc? |
Yes please! Though I'm not so sure about implicitly dividing by b.N or automatically adding "/op". Many custom benchmarks report metrics that are not per-op. gcbench reports many metrics that are aggregated differently across iterations, such as quantiles and time-rates. x/benchmarks reports things like peak RSS (arguably also a quantile). benchcmd is similar. Users can easily divide by b.N themselves, or we could provide two methods to make the distinction clear. In this case, the type should clearly be float64, though I would argue that it should be float64 even if the framework implicitly divides by b.N. We already made this mistake with the most common metric: "time" is not an integral resource, and we have to multiply it by 10^9 in order to pretend that it is. |
This may be a dup of #16110, though the discussion there got a bit confusing. |
I agree. I'll close that one in favor of this, since this one seems to be moving along.
I'm inclined to add only a single method, and use examples and docs to get the rest of the way there. About out of laptop time, will update this proposal with something concrete soon. |
New proposal:
// ReportMetric adds n/unit to the reported benchmark results.
// If the metric is per-iteration, the caller should divide by b.N,
// and by convention units should end in "/op".
// ReportMetric will panic if unit is the empty string
// or if unit contains any whitespace.
func (b *testing.B) ReportMetric(n float64, unit string)
Examples to include something like a swap count (sorting) and a cache hit rate (sync.Pool? other?). BenchmarkResult should also have a new field:
Open question: What should we do if ReportMetric is called multiple times with the same unit? (Overwrite? Panic? Ignore?) What if ReportMetric is called with a reserved unit like "ns/op"? I'm inclined to answer overwrite and panic, respectively, although documenting that is annoying. Ready for a new round of input, thanks! |
SGTM.
I agree that it should overwrite if called multiple times with the same unit (anything else seems more annoying or needlessly constrained). I worry more about panicking on reserved units. Is "allocs/op" always reserved or only with -benchmem? What if we add more built-in units in the future? How does this interact with parallel benchmarks? |
In the description two messages up, "n/unit" should be "n unit" I think. Otherwise this seems OK. Will leave for Austin to accept the proposal when ready. |
@josharian, @aclements, any ideas about parallel benchmarks or other details needed to accept this proposal? |
I think if we're taking the value and unit from the user at face value, parallel benchmarks don't present any particular complication. We'll just report the value and unit in the benchmark line, whether or not it's parallel. (I'm not sure what, if anything, I had in mind when I questioned how this would interact with parallel benchmarks.) So, tweaking Josh's proposal, how about:
// ReportMetric adds n unit to the reported benchmark results.
// If the metric is per-iteration, the caller should divide by b.N,
// and by convention units should end in "/op".
// If ReportMetric is called multiple times for the same unit,
// later calls override earlier calls.
// If ReportMetric is called for a unit normally reported by the
// benchmark framework itself (such as "ns/op"), n will override
// the value reported by the benchmark framework.
// ReportMetric will panic if unit is the empty string
// or if unit contains any whitespace.
func (b *testing.B) ReportMetric(n float64, unit string)
plus extending
|
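For illustration, here is a minimal sketch of how the sort use case from the original post might look with the ReportMetric signature proposed above. The names countingInts and benchmarkSortCounts, the input size of 1024, and the fixed seed are assumptions made for this example and do not come from the thread; the sketch also assumes a Go release that ships ReportMetric and the Extra field on BenchmarkResult (1.13 or later).

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
	"testing"
)

// countingInts (hypothetical helper) wraps sort.IntSlice and counts
// how many times Less and Swap are called during a sort.
type countingInts struct {
	sort.IntSlice
	less, swaps int64
}

func (c *countingInts) Less(i, j int) bool {
	c.less++
	return c.IntSlice.Less(i, j)
}

func (c *countingInts) Swap(i, j int) {
	c.swaps++
	c.IntSlice.Swap(i, j)
}

// benchmarkSortCounts sorts a fresh pseudo-random permutation on each
// iteration and reports the average number of Less and Swap calls.
func benchmarkSortCounts(b *testing.B) {
	rng := rand.New(rand.NewSource(1)) // fixed seed, chosen arbitrarily
	var less, swaps int64
	for i := 0; i < b.N; i++ {
		data := &countingInts{IntSlice: sort.IntSlice(rng.Perm(1024))}
		sort.Sort(data)
		less += data.less
		swaps += data.swaps
	}
	// The caller divides by b.N; by convention the unit ends in "/op".
	b.ReportMetric(float64(less)/float64(b.N), "less/op")
	b.ReportMetric(float64(swaps)/float64(b.N), "swaps/op")
}

func main() {
	res := testing.Benchmark(benchmarkSortCounts)
	fmt.Println(res) // standard benchmark line, including the custom units
	fmt.Println(res.Extra["less/op"], res.Extra["swaps/op"])
}
```

In a real package the function would be named BenchmarkSortCounts and run with go test -bench; testing.Benchmark is used here only so the sketch runs as a standalone program.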
@aclements LGTM, thanks. |
I started prototyping this and ran into two problems.
I'm not sure how to implement this. The built-in units are computed from fields in Also, what should |
I'm inclined to say that we do what it says on the box: (I was originally drawn to panicking here. You asked, though: 'Is "allocs/op" always reserved or only with -benchmem? What if we add more built-in units in the future?' I think the answer to the first question is probably "always", but I don't see any good answer to the second question.)
Agreed. |
Reminds me of "User-defined counters" in Google Benchmark: https://github.com/google/benchmark#user-defined-counters |
What is the status on this work? |
There are some outstanding questions; see Austin's last post, above. Austin, perhaps you could post your WIP CL so someone could see whether my proposed answers above make sense in context and/or suggest other answers. (That someone might be me, but I can't commit to doing it in a timely way.) |
Ok, I would love to see progress made on this topic. I might be able to spend some cycles on this. |
Change https://golang.org/cl/160097 mentions this issue: |
I found a use case for this. I just wrote a benchmark whose only purpose is to measure STW time. The ns/op of the benchmark itself is irrelevant, so it would be nice to override this metric so the "op" is GC STW. |
Nice. Makes sense. I think that my proposed approach should be able to handle this.
Can you put up your prototype, whatever shape it is in? Maybe we can collectively get it finished. |
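As a rough sketch of the STW use case described above, a benchmark could override the built-in "ns/op" value so that "op" means one GC cycle's stop-the-world pause. This assumes the override semantics from the proposal text (a call for a built-in unit replaces the framework's value); benchmarkGCPause and the use of runtime.MemStats.PauseTotalNs as the pause measurement are assumptions for this example, not the benchmark the commenter wrote.

```go
package main

import (
	"fmt"
	"runtime"
	"testing"
)

// benchmarkGCPause forces b.N garbage collections and reports the mean
// stop-the-world pause per GC cycle in place of the wall-clock ns/op.
func benchmarkGCPause(b *testing.B) {
	var before, after runtime.MemStats
	runtime.ReadMemStats(&before)
	for i := 0; i < b.N; i++ {
		runtime.GC()
	}
	runtime.ReadMemStats(&after)

	cycles := after.NumGC - before.NumGC
	pause := after.PauseTotalNs - before.PauseTotalNs
	if cycles > 0 {
		// Overrides the framework's own ns/op for this benchmark.
		b.ReportMetric(float64(pause)/float64(cycles), "ns/op")
	}
}

func main() {
	res := testing.Benchmark(benchmarkGCPause)
	fmt.Println(res)
}
```

Reporting under a distinct unit such as "STW-ns/GC" would keep the wall-clock ns/op visible as well; which is preferable depends on how downstream tools are expected to compare the results.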
I've been following this issue and noticing many times when it would be helpful in my own work. I am also available to work on or review this feature in the next couple of months. Is it feasible to get this in for 1.13? |
One other detail question: currently, if MB/s is 0 we don't report it (which is probably just an artifact of that being the default value), but we report ns/op even if it's 0. Should we not? That will never happen in a "real" benchmark, and would give a way for users to suppress the ns/op metric if it wasn't meaningful for a particular benchmark. |
Change https://golang.org/cl/166717 mentions this issue: |
Suppressing it when 0 sounds sensible to me. |
One possible flaw with suppressing a metric when zero: Consider what happens if I try to run comparisons between two benchmark result sets, and one of them actually manages to, say, eliminate allocations from a particular use case. I no longer have a reported allocation count for that benchmark, which means the comparison may not work as expected. |
In the discussion in the CL we (tentatively) settled on zero-is-special only for ns/op. |
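To make the zero-is-special rule concrete, the sketch below reports a metric that is not per-op (a cache hit ratio) and sets "ns/op" to 0 to suppress it. It assumes the behaviour tentatively agreed above is what shipped, which matches the documented behaviour of ReportMetric in Go 1.13 and later. The map-based cache, the key range of 64, the "hit-ratio" unit, and the helper mentionsUnit are all hypothetical choices for this example.

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

// benchmarkCacheLookups looks up keys in a small map-backed cache and
// reports the fraction of lookups that hit, rather than a per-op count.
func benchmarkCacheLookups(b *testing.B) {
	cache := make(map[int]int)
	hits := 0
	for i := 0; i < b.N; i++ {
		key := i % 64
		if _, ok := cache[key]; ok {
			hits++
		} else {
			cache[key] = key * key
		}
	}
	// Not divided into a "/op" unit: this is a ratio over the whole run.
	b.ReportMetric(float64(hits)/float64(b.N), "hit-ratio")
	// Wall time is not meaningful here; 0 suppresses the ns/op column.
	b.ReportMetric(0, "ns/op")
}

// mentionsUnit reports whether the formatted result line contains unit.
func mentionsUnit(res testing.BenchmarkResult, unit string) bool {
	return strings.Contains(res.String(), unit)
}

func main() {
	res := testing.Benchmark(benchmarkCacheLookups)
	fmt.Println(res)
	fmt.Println("ns/op shown:", mentionsUnit(res, "ns/op"))
}
```

Because only ns/op is treated specially, a metric such as allocs/op that legitimately drops to zero would still be printed, which addresses the comparison concern raised in the previous comment.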
In which release should we expect to see this cool functionality? |
It's now committed, so it should appear in the next release (Go 1.13). |
Probably 1.13. (There’s always a chance it gets reverted.) |
Thanks, looking forward to using it. |
Package testing has built-in support for three statistics: wall time, allocations, and memory allocated. There is discussion of adding more: #24918.
I propose that we also expose to users the ability to report their own benchmark-specific statistics. An example comes from package sort (proximally via #26031): The number of calls to Swap and Less.
Proposed API:
This will cause benchmark invocations to include
<n/b.N> <units>/op
in the benchmark output. Then the sort of output reported in #26031 could be used as input to benchstat.