Complete overhaul of the difference reporter #124

dsnet · 2019-02-27T02:41:59Z

The previous implementation of the reporter simply listed all differences,
each qualified by the full path to the difference.
This method of reporting is exact, but difficult for humans to parse.
It is one of the more common sources of complaints by users and a significant
reason why cmp is not preferred over competing libraries.

This change reimplements the reporter to format the output as a
structured literal in pseudo-Go syntax. The output resembles literals that
the user would likely have in their test code. Differences between the
x and y values are denoted by a '-' or '+' prefix at the start of the line.

An overview of the new implementation is as follows:

report.go: The defaultReporter type implements the Reporter interface.
report_value: Through the PushStep/PopStep API, the defaultReporter is able
to contruct an in-memory valueNode tree representing the comparison of
x and y as cmp.Equal walks the sub-values.
report_compare.go: After report_value.go constructs an AST-representation
of the compared values, report_compare.go formats the valueNode tree as a
textNode tree, which is the textual output in a tree form.
Some relevant design decisions include:
- The format logic goes through effort to avoid printing ignored nodes.
- Some number of surrounding equal (but not ignored) struct fields,
  slice elements, or map entries are printed for context.
- cmp.Equal may declare two sub-reflect.Values to be equal, but are
  different values when printed. In order to present a unified view on
  this "equal" node, the logic formats both values and arbitrarily choses
  the one with the shorter string.
- Transformed nodes are formatted with the pseudo-Go syntax of:
  Inverse(TransformerName, OutputType{...})
  where Inverse is some magical pseudo-function that inverts the
  transformation referred to by TransformerName. The OutputType literal
  is the output of the transformation.
report_reflect.go: This contains logic to pretty-print reflect.Values and
is relied upon by report_compare.go to format the leaves of the tree.
Note that the leaves of the tree can be any arbitrary Go type and value
(including cyclic data structures).
report_text.go: This contains logic for purely lexicographical formatting
and is depended upon by the other report_*.go files.

Advantages:

The output is more familiar as it uses pseudo-Go syntax for literals
It provides context about surrounding struct fields, slice elements, or
map entries that were equal
Inserted and removed elements in a slice are easier to visualize
Related diffs lie on the same indentation
For diffs in a deeply nested value, the output is easier to visualize
than having a list of all the full paths to the diff.

Disadvantages:

The implementation is drastically more complex.
In most cases, the output is longer (though more sparse)

dsnet · 2019-02-27T02:42:48Z

\cc @neild
\cc @rogpeppe (since you mentioned in the past that you found the reporter output hard to grok)

Take a look at compare_test.go for how the new report diff looks like.

cybrcodr

Really cool diff output. Just a first passed on this. I've not reviewed the report*.go files yet.

I'm curious why use the pseudo-function name Inverse? Why not just use the transform name as the pseudo-func name?

cmp/internal/value/zero_test.go

cybrcodr · 2019-02-27T07:08:24Z

cmp/compare_test.go

-{[16]uint8}[15]:
-	-: 0x61
-	+: 0x8f`,
+  [16]uint8{


Just a thought. The nice thing with the previous version is that it has the index, this new one doesn't have it.

Yea. That's one downside of the new output. In theory, we could put a comment on the right that keeps track of the index.

cybrcodr · 2019-02-27T08:08:57Z

cmp/compare_test.go

-***{***int}:
-	-: 0
-	+: 1`,
+  &&&int(


At first glance, I'd expect that this should be ***int instead since the comparison is on the dereferenced value. And the example that uses intPtr() should be &int instead of *int.

But then I realized on types like map[*X]Y, the diff shows the key with pointer value instead of the dereferenced value and also it is not syntactically correct to have map[&X]Y. But perhaps this is an exception, similar to []*T, map[S]*T, but the actual value for *T here would have &T{ ... } anyways unless it is a map key.

So, I think I have a minor preference for ***int on this one.

In Go syntax, * means dereference, while & means to take the address of.

In the previous reporter output, the path was an expression from the root variable to the specific value. In that situation, you want to dereference pointers (hence the * operator).

In the new reporter output, we're constructing a pseudo Go-literal. There is no existing value to de-reference. Instead we are taking the address of the inner-most literal.

dsnet · 2019-02-27T10:15:39Z

I'm curious why use the pseudo-function name Inverse? Why not just use the transform name as the pseudo-func name?

If the transform name itself is used directly, it's subtly lying. I need someway to indicate that its the opposite of the transformation. I opted for a magic Inverse function rather than appending an "Inverse" prefix to the transform name since the transform name itself may possible have an "Inverse" prefix.

rogpeppe

This looks like a great improvement in general, thanks.
I haven't had time to review most of the core logic - I mostly just looked at the output.

There's one thing I'd like to bring up. We use cmp a lot in tests, and thus I end up looking at a lot of its output. Almost every time I look at cmp output, I have re-read the key to remind myself of which is + and which is -.

In my head, what you're expecting is old (it's already in the source code) and what you've got is new (it just occurred in the test), so intuitively I expect to see cmp print a description of what things aren't in old (subtracted) and are in new (added).

I understand why cmp chooses the sense that it does (we always write if got != want {) but it still seems wrong to me every time I see it, because diffs are always phrased with - meaning "old" and + meaning "new".

Is there any chance of changing the sense?

cmp/compare_test.go

rogpeppe · 2019-02-27T07:48:34Z

cmp/report_reflect.go

+	if testflags.Deterministic {
+		p = 0xdeadf00f // Only used for stable testing purposes
+	}
+	return fmt.Sprintf("⟪0x%x⟫", p)


FWIW the Go font (which I use) doesn't seem to contain these bracket characters. How about using the more commonly available "«" and "»" (00AB and 00BB) ?

rogpeppe · 2019-02-27T09:48:04Z

cmp/compare_test.go

-λ(λ(λ({uint8}))):
-	-: 0x00
-	+: 0x01`,
+  uint8(Inverse(λ, uint16(Inverse(λ, uint32(Inverse(λ, uint64(


I think it would be good to document what's going with Inverse here. Also, what if you have a type named "Inverse"? ISTM that could be a bit confusing.

Not-sure-if-I'm-serious:

uint8(λ⁻¹(uint16(λ⁻¹(uint32(λ⁻¹(uint64(

Any type named Inverse will at least be qualified by the package name (e.g., mypkg.Inverse).

The λ⁻¹ is interesting. I'll see how it renders on a number of different terminals.

Nice idea, but FWIW that superscript minus doesn't render correctly on the browser on my phone or in my terminal/editor (using the Go font).

rogpeppe · 2019-02-27T09:52:01Z

cmp/compare_test.go

-	-: map[*testprotos.Stringer]*testprotos.Stringer{s"hello": s"world"}
-	+: map[*testprotos.Stringer]*testprotos.Stringer(nil)`,
+  map[*testprotos.Stringer]*testprotos.Stringer(
+- 	{⟪0xdeadf00f⟫: s"world"},


Assuming this s" notation implies "this string is the output of a String method", it could probably use some documentation as that's not necessarily clear.

cmp/compare_test.go

rogpeppe · 2019-02-27T10:01:19Z

cmp/internal/value/zero.go

+// IsZero reports whether v is the zero value.
+// This does not rely on Interface and so can be used on unexported fields.
+func IsZero(v reflect.Value) bool {
+	switch v.Kind() {


should this cope with reflect.Invalid too (e.g. when IsZero is called in reflect.ValueOf(nil))?

rogpeppe · 2019-02-27T10:01:40Z

cmp/internal/value/zero_test.go

+		in   interface{}
+		want bool
+	}{
+		{0, true},


how about a test case or two for structs?

cmp/report_reflect.go

cmp/report_value.go

cmp/report_text.go

dsnet · 2019-02-27T18:19:08Z

@rogpeppe

I understand why cmp chooses the sense that it does (we always write if got != want {) but it still seems wrong to me every time I see it, because diffs are always phrased with - meaning "old" and + meaning "new".

Is there any chance of changing the sense?

This was debated very heavily inside Google. The current stylistic recommendation is to manually provide the (-want +got) legend in the test printout.

That said, whether to use - or + or got or want is kind of beyond the scope of this PR. Want to file an issue for it?

The previous implementation of the reporter simply listed all differences, each qualified by the full path to the difference. This method of reporting is exact, but difficult for humans to parse. It is one of the more common sources of complaints by users and a significant reason why cmp is not preferred over competing libraries. This change reimplements the reporter to format the output as a structured literal in pseudo-Go syntax. The output resembles literals that the user would likely have in their test code. Differences between the x and y values are denoted by a '-' or '+' prefix at the start of the line. An overview of the new implementation is as follows: * report.go: The defaultReporter type implements the Reporter interface. * report_value: Through the PushStep/PopStep API, the defaultReporter is able to contruct an in-memory valueNode tree representing the comparison of x and y as cmp.Equal walks the sub-values. * report_compare.go: After report_value.go constructs an AST-representation of the compared values, report_compare.go formats the valueNode tree as a textNode tree, which is the textual output in a tree form. Some relevant design decisions include: * The format logic goes through effort to avoid printing ignored nodes. * Some number of surrounding equal (but not ignored) struct fields, slice elements, or map entries are printed for context. * cmp.Equal may declare two sub-reflect.Values to be equal, but are different values when printed. In order to present a unified view on this "equal" node, the logic formats both values and arbitrarily choses the one with the shorter string. * Transformed nodes are formatted with the pseudo-Go syntax of: Inverse(TransformerName, OutputType{...}) where Inverse is some magical pseudo-function that inverts the transformation referred to by TransformerName. The OutputType literal is the output of the transformation. * report_reflect.go: This contains logic to pretty-print reflect.Values and is relied upon by report_compare.go to format the leaves of the tree. Note that the leaves of the tree can be any arbitrary Go type and value (including cyclic data structures). * report_text.go: This contains logic for purely lexicographical formatting and is depended upon by the other report_*.go files. Advantages: * The output is more familiar as it uses pseudo-Go syntax for literals * It provides context about surrounding struct fields, slice elements, or map entries that were equal * Inserted and removed elements in a slice are easier to visualize * Related diffs lie on the same indentation * For diffs in a deeply nested value, the output is easier to visualize than having a list of all the full paths to the diff. Disadvantages: * The implementation is drastically more complex. * In most cases, the output is longer (though more sparse)

dsnet · 2019-03-12T00:36:11Z

Merging this. The exact way inverse and pointers and stuff are represented can be addressed in a future PR.

tcard · 2019-06-06T15:58:33Z

cmp/report_text.go

+type indentMode int
+
+func (n indentMode) appendIndent(b []byte, d diffMode) []byte {
+	if flags.Deterministic || randBool {


Why?

I was using the output of cmp.Diff in a Go runnable example, and it was failing intermittently. I'd like to understand why do this.

(Sorry, not sure whether it's OK to comment in an old pull request... I got here looking for rationale.)

The output of cmp.Diff is not defined. It can and will change over time. For example, this CL completely changed the output; the new output is better, but the change breaks any tests which depended on the old one.

Making the output trivially nondeterministic ensures that anyone depending on stable output from cmp.Diff will discover the problem immediately rather than at some future point in time when the output changes in a more significant way.

@dneil is correct. I mailed out #145 to document this.

dsnet requested a review from cybrcodr February 27, 2019 02:42

cybrcodr reviewed Feb 27, 2019

View reviewed changes

dsnet force-pushed the report-diff branch 2 times, most recently from 6e3e363 to 11e6ded Compare February 27, 2019 10:22

rogpeppe reviewed Feb 27, 2019

View reviewed changes

neild reviewed Feb 27, 2019

View reviewed changes

cmp/report_text.go Show resolved Hide resolved

cybrcodr approved these changes Mar 11, 2019

View reviewed changes

dsnet force-pushed the report-diff branch from 11e6ded to 0ea5a36 Compare March 12, 2019 00:33

dsnet merged commit 2940eda into master Mar 12, 2019

dsnet deleted the report-diff branch March 12, 2019 00:36

dsnet mentioned this pull request Mar 12, 2019

Allow configure diff reporter to output all diffs #99

Closed

tcard reviewed Jun 6, 2019

View reviewed changes

This was referenced Aug 6, 2019

TestDeepEqualFailure fails w/ go 1.12.2 gotestyourself/gotest.tools#155

Closed

Fix tests with latest go cmp gotestyourself/gotest.tools#158

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Complete overhaul of the difference reporter #124

Complete overhaul of the difference reporter #124

dsnet commented Feb 27, 2019

dsnet commented Feb 27, 2019 •

edited

Loading

cybrcodr left a comment

cybrcodr Feb 27, 2019

dsnet Feb 27, 2019

cybrcodr Feb 27, 2019

dsnet Feb 27, 2019

dsnet commented Feb 27, 2019

rogpeppe left a comment

rogpeppe Feb 27, 2019

rogpeppe Feb 27, 2019

neild Feb 27, 2019

dsnet Feb 27, 2019

rogpeppe Feb 28, 2019

rogpeppe Feb 27, 2019

dsnet Mar 12, 2019

rogpeppe Feb 27, 2019

rogpeppe Feb 27, 2019

dsnet commented Feb 27, 2019

dsnet commented Mar 12, 2019

tcard Jun 6, 2019

neild Jun 6, 2019

dsnet Jun 6, 2019

Complete overhaul of the difference reporter #124

Complete overhaul of the difference reporter #124

Conversation

dsnet commented Feb 27, 2019

dsnet commented Feb 27, 2019 • edited Loading

cybrcodr left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dsnet commented Feb 27, 2019

rogpeppe left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dsnet commented Feb 27, 2019

dsnet commented Mar 12, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

dsnet commented Feb 27, 2019 •

edited

Loading