
proposal: testing: custom benchmark labels #28398

Closed
mmcloughlin opened this issue Oct 25, 2018 · 21 comments

Comments

@mmcloughlin
Contributor

Currently, Go benchmark output reports some labels:

$ go version
go version go1.11.1 linux/amd64
$ go test -run NONE -bench . crypto/aes
goos: linux
goarch: amd64
pkg: crypto/aes
BenchmarkEncrypt-8   	100000000	        10.5 ns/op	1516.95 MB/s
BenchmarkDecrypt-8   	100000000	        10.2 ns/op	1566.61 MB/s
BenchmarkExpand-8    	20000000	        57.3 ns/op
PASS
ok  	crypto/aes	3.310s

These are added by

go/src/testing/benchmark.go

Lines 277 to 283 in dd78955

labelsOnce.Do(func() {
	fmt.Fprintf(b.w, "goos: %s\n", runtime.GOOS)
	fmt.Fprintf(b.w, "goarch: %s\n", runtime.GOARCH)
	if b.importPath != "" {
		fmt.Fprintf(b.w, "pkg: %s\n", b.importPath)
	}
})

It would be great if the user could add custom labels relevant to the performance of their code. For example:

  • CPU type/Microarchitecture/CPUID bits
  • Software version (perhaps git tag)
  • Presumably many others

I have noticed that the perf.golang.org service appears to support this, but I haven't looked into exactly how. Apologies if I've overlooked some existing feature.

@josharian
Contributor

Dup of #26037? I’d love to see that go in for 1.12 but I don’t think either @aclements or I have the bandwidth to push it through.

@josharian
Contributor

Oh, reading closer, not a dup. Sorry. At least I’ve now cc’d Austin. :)

@mmcloughlin mmcloughlin changed the title testing: custom benchmark labels proposal: testing: custom benchmark labels Oct 26, 2018
@gopherbot gopherbot added this to the Proposal milestone Oct 26, 2018
@mmcloughlin
Contributor Author

Seemed this should have the proposal: prefix. Regarding bandwidth, I would be happy to put together a CL if the proposal is accepted.

@ianlancetaylor
Member

What exactly are you proposing? Adding an option to go test which prints labels? It seems that you could print those yourself just as easily.

@mmcloughlin
Contributor Author

An option to go test seems less useful to me. This feature is more important when it comes to labels that require code to derive (such as hardware parameters).

Proposal

My concrete proposal is an additional function in the testing package.
Perhaps something like:

// SetBenchmarkLabel records a value relevant to benchmark performance. This
// will be included in benchmark output ahead of individual benchmark results.
// Labels "goos", "goarch" and "pkg" are set by default.
SetBenchmarkLabel(key, value string)

Not sure about naming, but hopefully this gives the idea.
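A minimal sketch of how such a function could behave, outside the real testing package. This is hypothetical and not the prototype CL: it records labels under a mutex and prints them once inside a sync.Once, mirroring the existing labelsOnce block in benchmark.go.

```go
package main

import (
	"fmt"
	"os"
	"sync"
)

// Hypothetical state for a SetBenchmarkLabel sketch. The real testing
// package prints goos/goarch/pkg from inside a sync.Once in benchmark.go;
// this mirrors that shape for custom labels.
var (
	labelMu    sync.Mutex
	labels     = map[string]string{}
	labelOrder []string // preserve insertion order for stable output
	labelsOnce sync.Once
)

// SetBenchmarkLabel records a key/value pair to be printed ahead of
// benchmark results.
func SetBenchmarkLabel(key, value string) {
	labelMu.Lock()
	defer labelMu.Unlock()
	if _, ok := labels[key]; !ok {
		labelOrder = append(labelOrder, key)
	}
	labels[key] = value
}

// printLabels emits the labels once, in "key: value" configuration-line
// syntax. After the first call, later SetBenchmarkLabel calls have no
// visible effect, matching the once-printed semantics discussed below.
func printLabels() {
	labelsOnce.Do(func() {
		labelMu.Lock()
		defer labelMu.Unlock()
		for _, k := range labelOrder {
			fmt.Fprintf(os.Stdout, "%s: %s\n", k, labels[k])
		}
	})
}

func main() {
	SetBenchmarkLabel("backend", "aes-ni")
	SetBenchmarkLabel("hasaesni", "true")
	printLabels()
	SetBenchmarkLabel("late", "ignored") // too late: labels already printed
	printLabels()                        // no-op, sync.Once already fired
}
```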

"Case Study"

I recently implemented Meow hash in Go. This hash is designed to take advantage of hardware AES instructions, so the implementation dispatches to one of three backends based on CPUID flags at runtime (pure Go, AES-NI, and VAES-256/512).

To provide useful context in the test/benchmark output, I have two test functions that exist
purely to dump internal parameters:

https://github.com/mmcloughlin/meow/blob/7358a9c1d22772fbddc8c47fd74cf501c0136ec1/dispatch_amd64_test.go#L10-L17
https://github.com/mmcloughlin/meow/blob/7358a9c1d22772fbddc8c47fd74cf501c0136ec1/meow_test.go#L23-L25

I would prefer to include these as "official" labels:

testing.SetBenchmarkLabel("backend", implementation)
testing.SetBenchmarkLabel("hasaesni", cpu.HasAES)
testing.SetBenchmarkLabel("hasavx", cpu.HasAVX)
...

Comparison: Google Benchmark

Note that Google benchmark also provides context with benchmark runs. For
example:

$ ./benchmark
2018-10-31 12:31:20
Running ./benchmark
Run on (4 X 3500 MHz CPU s)
CPU Caches:
  L1 Data 32K (x2)
  L1 Instruction 32K (x2)
  L2 Unified 262K (x2)
  L3 Unified 4194K (x1)
--------------------------------------------------------------------
Benchmark                             Time           CPU Iterations
--------------------------------------------------------------------
meow_hash_benchmark/1                14 ns         14 ns   45173239   66.5225MB/s
meow_hash_benchmark/2                15 ns         15 ns   46804897   125.632MB/s
meow_hash_benchmark/4                16 ns         16 ns   44735009   240.535MB/s
...

It would be great if it were also possible to include CPU/cache information in Go benchmark output.

Benchmark Format

Note that this output has been formalized in the following document:

https://github.com/golang/proposal/blob/master/design/14313-benchmark-format.md#configuration-lines

In particular these "labels" are called "Configuration Lines".

Moreover, this document makes go test -bench output the official format for
representing benchmark results. While it is always possible to emit this
kind of metadata from a separate program, that makes it harder to integrate with
automated tooling for running and storing benchmarks.

@gopherbot
Contributor

Change https://golang.org/cl/146897 mentions this issue: testing: add SetBenchmarkLabel

@mmcloughlin
Contributor Author

Added a CL to concretely demonstrate the proposal (and get something on the record before the freeze).

@seebs
Contributor

seebs commented Nov 14, 2018

I'd use it if it existed, which I like to think is an argument in favor.

@rsc
Contributor

rsc commented Nov 28, 2018

Does SetBenchmarkLabel keep track of what's already been printed and not repeat prints? In theory you can change the labels between runs.

@mmcloughlin
Contributor Author

The existing implementation (and prototype CL) print the labels inside a sync.Once. Therefore any labels added or modified after the first print would be ignored. My expectation was that this would be used in init or other setup functions. This behavior seems reasonable to me but should be documented for clarity, perhaps:

// SetBenchmarkLabel records a value relevant to benchmark performance. This
// will be included in benchmark output ahead of individual benchmark results.
// Any SetBenchmarkLabel calls after the first benchmark is executed will be
// ignored.  Labels "goos", "goarch" and "pkg" are set by default.
SetBenchmarkLabel(key, value string)

Did you have some other behavior in mind?

@rsc
Contributor

rsc commented Dec 12, 2018

SetBenchmarkLabel should track what it has printed (a map[string]string) and silence duplicate prints but not throw away prints that change the value associated with a given key.

Otherwise this seems OK to me. @aclements?

@rsc
Contributor

rsc commented Jan 23, 2019

Ping @aclements

@mmcloughlin
Contributor Author

Yes it would be great to get @aclements input.

SetBenchmarkLabel should track what it has printed (a map[string]string) and silence duplicate prints but not throw away prints that change the value associated with a given key.

This seems easy enough, but I'd like to understand the use case you have in mind. Are you thinking about SetBenchmarkLabel calls from inside a benchmark function?

@ianlancetaylor
Member

Is this addressed by the recent fix for #26037?

CC @aclements

@josharian
Contributor

Nope, this is not a dup of #26037. That was my initial reaction as well though. :) See the beginning of the thread.

This is about the labels that are printed prior to the benchmark lines, not the information that is included in the benchmark lines.

@andybons
Member

Adding to the ping party for @aclements

@aclements
Member

SetBenchmarkLabel should track what it has printed (a map[string]string) and silence duplicate prints but not throw away prints that change the value associated with a given key.

I'd like to understand what @rsc is proposing here a bit better, too.

It does seem a bit odd that SetBenchmarkLabel is only meaningful before the first benchmark run, but I also don't see how it's useful to call between benchmarks (or even when you would be able to do that). If a benchmark has multiple variations, I would expect it to use sub-benchmarks and the line-oriented "BenchmarkName/k=v/..." syntax.

SetBenchmarkLabel should specify that it constrains the syntax of the key and the value to legal configuration lines according to the benchmark format spec. (We did this in #26037 for the metric names.)

@rsc
Contributor

rsc commented May 28, 2019

I was thinking that this would be something done inside a Benchmark function, but Austin points out that at that point the "BenchmarkName " prefix has been printed, making it inconvenient to print new key:value pairs. Also, the examples in the initial request are all global to the process, so maybe it makes sense to scope down to something global to the process that gets printed before any benchmarks start.

Then the only question is how do we know benchmarks are about to start. TestMain could print them unconditionally even today. If we give TestMain more access to the list of intended tests and benchmarks (see #28592) then it could avoid the print if no benchmarks will run. But maybe the lines in question would not be so bad to print always, so maybe we don't need to do anything special at all. For that matter, func init would work too.

As far as the case study, that could be done today with:

func init() {
    fmt.Printf("backend: %s\n", implementation)
    fmt.Printf("hasaesni: %v\n", cpu.HasAES)
    fmt.Printf("hasavx: %v\n", cpu.HasAVX)
}

@rsc
Contributor

rsc commented Jun 4, 2019

Based on my comment from last week, I'm leaning toward declining this proposal. The answer to the original question is just "print statements".

@mmcloughlin
Contributor Author

In my opinion there's value in making this a first-class citizen. Printing works but feels hacky.

Here's a hypothetical (but, I believe, not far-fetched) reason why this could matter down the line. Google Benchmark offers --benchmark_out_format=<json|console|csv>. What if go test wanted to offer a similar feature? (The behavior of the -json option to go test is truly perplexing to me, but that's a topic for another time.) So if we wanted a go test -bench . -benchfmt json option, the output might look like this:

{
	"labels": {
		"goos": "linux",
		"goarch": "amd64",
		"pkg": "crypto/aes"
	},
	"benchmarks": [
		{
			"full_name": "BenchmarkFib-4",
			"n": 5000000,
			"name": "Fib",
			"ns_per_op": 313,
			"procs": 4
		},
		...
		{
			"full_name": "BenchmarkFrobInPlace-4",
			"mb_per_sec": 2155.61,
			"n": 3000000,
			"name": "FrobInPlace",
			"ns_per_op": 475,
			"procs": 4
		}
	]
}

Now my print statements give me:

backend: aes-ni
hasaesni: true
hasavx: true
{
	"labels": {
		"goos": "linux",
		"goarch": "amd64",
		"pkg": "crypto/aes"
	},
	...
}

I'm not arguing for this specific feature right now. My point is just that if people are printing labels, you've lost the semantics, and are unable to manipulate the labels in any structured way.

Finally, I think adding this as a function encourages people to use it, which I think is a positive too.

@rsc
Contributor

rsc commented Jun 25, 2019

The JSON output is converted by a different program by parsing the actual benchmark output. Printing will not affect that. Let's let people do print statements for now and revisit once we can see more usage.

@rsc rsc closed this as completed Jun 25, 2019
@golang golang locked and limited conversation to collaborators Jun 24, 2020