-
Notifications
You must be signed in to change notification settings - Fork 210
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Diff output is sometimes confusing and/or hard to read #274
Comments
When it comes to free-form text, the reporter logic is just a bunch of heuristics to try and choose between several different ways to present the data. We can adjust the heuristics, but it's a hard problem. Bug reports like this help us tune the heuristics. That said, have you considered doing a more structured diff of the data? https://play.golang.org/p/29x4dsDurlQ |
Hi @dsnet, thanks for the reply. I have tried approaches similar to your example some, but didn't necessarily find them more useful due to times like below where the rendered representation shows integer data in scientific notation and/or as a float:
I saw another approach that splits the input string on newlines (e.g., https://play.golang.org/p/opdEWiJjXCJ). The output is more consistent, but not as nice as the triple-quote format. However, I've had some difficultly getting the package to generate the triple-quote format. That said, I understand that this is a hard problem. I found my examples particularly confusing because the inputs are fairly similar and could only generate the first example with the |
For this case, the cause is because the logic determined that it was "more efficient" to use that representation due to this heuristic: Lines 148 to 150 in 395a0ac
Technically, the heuristic is correct since this representation only has a single insertion operation, while the triple-quote representation would have had a insertion+deletion.
For the second case in your example, the cause is because we avoid special string formatting for anything shorter than 64 bytes: Lines 83 to 84 in 395a0ac
we can decrease this 32 or something. |
Thanks for the information. Another issue that I've noticed is that running the same test with the same input/output will sometimes output the |
Without looking into it, that's probably due to the deliberate non-determinism with the internal diffing algorithm where one way of doing it returns results that are slightly more optimal than another. It's been on my TODO list to make that algorithm more optimal. The current algorithm favors performance over being optimal. It is guaranteed to be O(n), while an optimal algorithm is O(n^2). It'd be nice if we can get closer to optimal without being O(n^2). |
Understood re: non-determinism. Outside of a magical way to solve this complex problem for all cases, It seems there isn't really anything concrete to do here, so I'll close the case. Thanks for the discussion. |
I sent out #275 to fix the one that looks the worst. |
I've been working with a test apparatus for comparing JSON data. For ease-of-use and readability, I'd like to be able to compare prettified/indented JSON strings and print the differences in an easy-to-read way.
I've found that the
cmp.Diff
output can be quite varied even on similar inputs and can be either easy-to-read, somewhat hard to read, or hard to read and confusing depending on the exact input for non-obvious reasons.See https://play.golang.org/p/mu-TsXoGEEL (pasted below):
Code:
Output:
The text was updated successfully, but these errors were encountered: