Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simplified reporting for string Diff? #195

Closed
karlkfi opened this issue Apr 5, 2020 · 6 comments · Fixed by #212
Closed

Simplified reporting for string Diff? #195

karlkfi opened this issue Apr 5, 2020 · 6 comments · Fixed by #212

Comments

@karlkfi
Copy link

karlkfi commented Apr 5, 2020

So, I'm trying to compare two multi-lined strings with cmp.Diff and it returns a diff that looks pretty bulky and kind of confusing given the string is not part of another object.

Example:

  strings.Join({
  	"{",
- 	` "accountType": "self",`,
  	` "name": "account",`,
  	` "environment": "self",`,
  	` "providerVersion": "v2",`,
  	` "type": "kubernetes",`,
  	` "skin": "v2",`,
+ 	` "accountType": "self",`,
  	` "cloudProvider": "kubernetes"`,
  	"}",
  }, "\n")

I get that the objectification of strings is there to make it obvious that it's a string inside an object, but is there a way to pass in an option or reporter to Diff so that it has simplified output?

Example:

  {
-  "accountType": "self",
   "name": "account",
   "environment": "self",
   "providerVersion": "v2",
   "type": "kubernetes",
   "skin": "v2",
+ "accountType": "self",
   "cloudProvider": "kubernetes"
  }

If this was possible, it would be a boon for comparison of json/yaml strings without having to Unmarshal them first (especially in cases where unmarshalling might fail). It would make Diff more useful for comparing file contents.

@karlkfi karlkfi changed the title Simplified reporting for known types? Simplified reporting for string DIff? Apr 5, 2020
@dsnet
Copy link
Collaborator

dsnet commented Apr 7, 2020

I get that the objectification of strings is there to make it obvious that it's a string inside an object,

You are correct that is the rationale for why. The reporter output prioritizes accurate reporting over readability (though both are important goals).

there a way to pass in an option or reporter to Diff so that it has simplified output?

Philosophically, we're generally adverse to adding lots of options to control how things look. I'd prefer pursuing an approach that (by default) produces output that is never ambiguous, but decently readable.

@karlkfi
Copy link
Author

karlkfi commented May 3, 2020

I understand the reasoning. It just means I'll have to use a different tool to get the clean output I want for json/yaml file/string comparison without having to unmarshal them first (which can fail on bad syntax).

@dsnet
Copy link
Collaborator

dsnet commented May 13, 2020

How would something like the following look.

For a standalone string:

  string(
      """
      one
-     two
+     TWO
      three
      """
  )

For a struct string field:

  Foo{
    StringField: string(
        """
        ... // 23 identical lines
        zero
        one
-       two
+       TWO
        three
        four
        ... // 534 identical lines
        """
    ),
    IntField: 5,
  }

For a slice of strings:

  []string{
    "equal single line",
-   "inequal single line",
+   "INEQUAL single line",
    string(
        """
        {
-        "accountType": "self",
         "name": "account",
         "environment": "self",
         "providerVersion": "v2",
         "type": "kubernetes",
         "skin": "v2",
+        "accountType": "self",
         "cloudProvider": "kubernetes"
        }
        """
    ),
    "equal\nmultiple\nlines",
    "another string",
+   "added string",
  }

Specifics of how it works:

  • The syntax uses a custom triple-quoted string syntax to represent multi-line strings.
  • We use the nicer syntax if the following conditions hold:
    • The string has a sufficient number of lines, is valid UTF-8, and consists of printable characters.
    • """ does not appear at the start of any line in the diff output
    • ... does not appear at the start of any line in the diff output
    • For lines that change, the differences cannot be due to whitespace (otherwise it becomes really hard to tell the difference between things like spaces and non-breaking spaces or due to a trailing carriage return).
  • Otherwise it falls back on the strings.Join({...}) syntax, which is unambigious, but ugly.

@dsnet dsnet changed the title Simplified reporting for string DIff? Simplified reporting for string Diff? May 13, 2020
@karlkfi
Copy link
Author

karlkfi commented May 13, 2020

I think the triple quotes is a great compromise for this, as long as it's still clear where the indentation starts. Whitespace changes are common when comparing formatted json/yaml for linting.

Is the string() wrapper necessary? Won't triple quotes indicate it's unambiguously a string?

@dsnet
Copy link
Collaborator

dsnet commented May 13, 2020

it's still clear where the indentation starts.

The starting """ and ending """ will always be aligned on the same column, so you can tell any additional indentation after that point.

Is the string() wrapper necessary? Won't triple quotes indicate it's unambiguously a string?

Not strictly. It was to give some degree of indention to help things read nicer. Consider the []string example without it:

  []string{
    "equal single line",
-   "inequal single line",
+   "INEQUAL single line",
    """
    {
-    "accountType": "self",
     "name": "account",
     "environment": "self",
     "providerVersion": "v2",
     "type": "kubernetes",
     "skin": "v2",
+    "accountType": "self",
     "cloudProvider": "kubernetes"
    }
    """,
    "equal\nmultiple\nlines",
    "another string",
+   "added string",
  }

Visually, it's hard to distinguish between the individual strings that belong to the []string and the individual lines that belong to the triple-quoted string.

dsnet added a commit that referenced this issue Jun 10, 2020
Using strings.Join to denote differences in a multi-line string is
visually noisy due to extensive use of quotes and escape sequences.
Add a custom triple-quote syntax that unambiguously shows line
differences with less visual noise.

If the triple-quote syntax cannot unmabiguously show differences,
then the reporter falls back on using the strings.Join format,
which is never ambiguous.

Fixes #195
@dsnet
Copy link
Collaborator

dsnet commented Jun 10, 2020

See #212.

dsnet added a commit that referenced this issue Jun 10, 2020
Using strings.Join to denote differences in a multi-line string is
visually noisy due to extensive use of quotes and escape sequences.
Add a custom triple-quote syntax that unambiguously shows line
differences with less visual noise.

If the triple-quote syntax cannot unmabiguously show differences,
then the reporter falls back on using the strings.Join format,
which is never ambiguous.

Fixes #195
dsnet added a commit that referenced this issue Jun 10, 2020
Using strings.Join to denote differences in a multi-line string is
visually noisy due to extensive use of quotes and escape sequences.
Add a custom triple-quote syntax that unambiguously shows line
differences with less visual noise.

If the triple-quote syntax cannot unmabiguously show differences,
then the reporter falls back on using the strings.Join format,
which is never ambiguous.

Fixes #195
dsnet added a commit that referenced this issue Jun 10, 2020
Using strings.Join to denote differences in a multi-line string is
visually noisy due to extensive use of quotes and escape sequences.
Add a custom triple-quote syntax that unambiguously shows line
differences with less visual noise.

If the triple-quote syntax cannot unmabiguously show differences,
then the reporter falls back on using the strings.Join format,
which is never ambiguous.

Fixes #195
dsnet added a commit that referenced this issue Jun 10, 2020
Using strings.Join to denote differences in a multi-line string is
visually noisy due to extensive use of quotes and escape sequences.
Add a custom triple-quote syntax that unambiguously shows line
differences with less visual noise.

If the triple-quote syntax cannot unmabiguously show differences,
then the reporter falls back on using the strings.Join format,
which is never ambiguous.

Fixes #195
dsnet added a commit that referenced this issue Jun 10, 2020
Using strings.Join to denote differences in a multi-line string is
visually noisy due to extensive use of quotes and escape sequences.
Add a custom triple-quote syntax that unambiguously shows line
differences with less visual noise.

If the triple-quote syntax cannot unmabiguously show differences,
then the reporter falls back on using the strings.Join format,
which is never ambiguous.

Fixes #195
dsnet added a commit that referenced this issue Jun 10, 2020
Using strings.Join to denote differences in a multi-line string is
visually noisy due to extensive use of quotes and escape sequences.
Add a custom triple-quote syntax that unambiguously shows line
differences with less visual noise.

If the triple-quote syntax cannot unmabiguously show differences,
then the reporter falls back on using the strings.Join format,
which is never ambiguous.

Fixes #195
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants