Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

syntax: Fix hex and octal int and errors highlight #2584

Merged
merged 13 commits into from
Nov 18, 2019
Merged

Conversation

k33nice
Copy link
Contributor

@k33nice k33nice commented Nov 13, 2019

According to language specification it is allowed to use underscore "_"
for readability in integer literals. Hence syntax matches for
goHexadecimalInt and goOctalInt must allow it. Also goOctalError regexp
mustn't match on accessing slice or array element by index.

@bhcleek
Copy link
Collaborator

bhcleek commented Nov 14, 2019

This highlights valid hexadecimals goOctalError.

@bhcleek
Copy link
Collaborator

bhcleek commented Nov 14, 2019

FWIW, I added some tests for verifying the syntax highlight groups of the numeric literals: #2587. You can rebase your branch off of master and use the tests to make sure your changes match as expected.

@bhcleek
Copy link
Collaborator

bhcleek commented Nov 14, 2019

This does not highlight 07a as goOctalError.

@k33nice
Copy link
Contributor Author

k33nice commented Nov 14, 2019

Hey @bcarrell! I'm currently trying to run the tests with docker but get some errors from gopls :(
About your question on 07a - no, it doesn't highlight it as an error right now. I tried to explain it in the commit message:

    The regular expression was changed that way to match all wrong sequences
    of characters. To do not make a false matching on other language structures
    the regexp should detect only falsy octal literals based on
    the allowed symbols: [0-7], "o", "O", "_". The exclusion is "8" and "9"
    because it might be very easy to make that kind of mistake.

    The regexp breakdown is described below:

    0                  |<-- "0" digit
    \(                 |
        [0-7oO_]*      |<-- 0+ [0-7] or "o" or "O" or "_" -----+
        \(             |                                       |
            [89]\+     |<-- 1+ "8" or "9" digit ----+          |
                \|     |                            |   +---+  |   +---+
            [oO]\{2,\} |<-- 2+ "o" or "O" character |<--| 1 |  |<--| 2 |
                \|     |                            |   +---+  |   +---+
            _\{2,\}    |<-- 2+ "_" character -------+          |
        \)             |                                       |
        [0-7oO_]*      |<-- 0+ [0-7] or "o" or "O" or "_" -----+
    \)\+               |
    \S*                |<-- 0+ "non space" characters

    1. One of the rules in this group should appear exactly once.
    2. All of the rules in this group should appear once or more.

TBO, as a user of vim-go I'm not sure what kind of sequences I want to be treated as an octal error. What do you think about it?

@k33nice
Copy link
Contributor Author

k33nice commented Nov 14, 2019

I can change [89]\+ to [89A-Fa-f]\+ so that kind of sequences like 07a will be highlighted as goOctalError. But I'm wondering is that should be goOctalError or goHexadecimalError 🤔

@k33nice
Copy link
Contributor Author

k33nice commented Nov 14, 2019

I've also added some changes to other regexp here 809c94c related to #2588

@bhcleek
Copy link
Collaborator

bhcleek commented Nov 14, 2019

@k33nice you can ignore the gopls errors; they're due to some kind of regression in gopls that is already being addressed by the Go tools team.

Generally speaking, the regular expression changes you're making are going to fail to highlight some patterns that are correctly highlighted on master. Here's a few examples:

syn match       goDecimalInt        "\<-\=\d\+\%([Ee][-+]\=\d\+\)\=\>"

This is now going to regress and match as a decimal instead of an octdal when the number begins with a 0.

syn match       goOctalError        "\<-\=0\([0-7oO_]*\([89]\+\|[oO]\{2,\}\|_\{2,\}\)[0-7oO_]*\)\+\S*\>"

This treats 8 and 9 specially for some reason when there's really no need to. The scope of matching invalid characters should be much wider - as it already is on master.

It also introduces an inconsistency in how goOctalError handle more than one _ vs how goBinaryError and goHexadecimalError handle them.

I'd also strongly recommend that you rebase your branch onto master, because I added tests that will be useful for changing these highlight groups.

Try hard not to break existing behaviors. You're right that ideally _ would be handled in all numerics and that octals should also allow 0, 0o, and 0O as leading characters, but if we can't get those changes done in a way that doesn't negatively impact performance and that is relatively easy to read, then I'm ok with leaving master as-is.

@k33nice
Copy link
Contributor Author

k33nice commented Nov 14, 2019

Thanks for your clarification! A quick note on goDecimalInt: that change https://github.com/fatih/vim-go/pull/2587/files#diff-6bbebb95ce25cd40d0072544e2dad91eR165 was broke highlighting single zero as a number.

@bhcleek
Copy link
Collaborator

bhcleek commented Nov 14, 2019

Good catch. I'll fix that in a separate PR.

@k33nice
Copy link
Contributor Author

k33nice commented Nov 14, 2019

I'll try to rework this pr a little bit. Also just my point of view, but I'm pretty convinced that much better to receive false negatives than positives for error highlighting. Especially for the current case where we have these syntax rules are enabled by default.

@k33nice k33nice changed the title syntax: Fix hex and octal int and errors highlight [WIP] syntax: Fix hex and octal int and errors highlight Nov 14, 2019
@bhcleek
Copy link
Collaborator

bhcleek commented Nov 14, 2019

I agree that false negatives for errors is better than false positives. However, I see no reason to introduce false negatives where they don't exist now.

@k33nice
Copy link
Contributor Author

k33nice commented Nov 15, 2019

@bhcleek I added support of "_" for decimal, hex, octal and binary and also some more test cases for it. I tried don't regress previous behavior. I'd be glad to get any suggestions or thoughts.

@k33nice k33nice changed the title [WIP] syntax: Fix hex and octal int and errors highlight syntax: Fix hex and octal int and errors highlight Nov 15, 2019
Copy link
Collaborator

@bhcleek bhcleek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do the *DoublePrefix tests need to exist? The existing *ErrorLeading tests should cover that case fine.

This is getting closer to being mergeable. I have some questions and requests for changes in-line.

Can you also squash these change so that your merge commits aren't in the branch?

Finally, as an aside and a quick tip to help you get PRs merged more quickly in the future: try to make your commits tell a story of adding incremental and easily reviewable value. For instance, a commit that adds support for 0o and 0O for octal numbers and tests for those, then a commit that adds support and tests for the underscores. When the set of changes taken together is large, complex, or otherwise noisy, it's useful for reviewers to be able to review commit by commit. Reviewing regular expressions is hard. Difficult to review PRs are less likely to be merged because it's hard to assess their risk vs value.

autoload/go/highlight_test.vim Outdated Show resolved Hide resolved
autoload/go/highlight_test.vim Outdated Show resolved Hide resolved
autoload/go/highlight_test.vim Outdated Show resolved Hide resolved
autoload/go/highlight_test.vim Outdated Show resolved Hide resolved
syntax/go.vim Outdated Show resolved Hide resolved
syntax/go.vim Outdated Show resolved Hide resolved
syntax/go.vim Outdated
syn match goOctalInt "\<-\=0[oO]\?_\?\(\o\+_\?\)\+\>"
syn match goOctalError "\<-\=0\([0-7oO_]*\([^ \t0-7oOxX\]\}\_]\+\|[oO]\{2,\}\|_\{2,\}\)[0-7oO_]*\)\+\S*\>"
syn match goBinaryInt "\<-\=0[bB]_\?\([01]\+_\?\)\+\>"
syn match goBinaryError "\<-\=0[bB]_\?\([01]\+_\?\)*\([^ \t01_]\S*\|_\{2,\}\S*\|_\)\>"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for both goBinaryInt and goBinaryError, wouldn't [01_]\+ suffic instead of [01]\+_\??

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[01_]\+ matches more than one sequential underscore, and that is syntaxly incorrect.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's syntactically in incorrect, but that's what goBinaryError is trying to catch. Can goBinaryError be simplified to \<-\=0[bB]\([01_]*[^ \t01_]\S*\|[01]*__\S*\)\>

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Technically no, cuz it doesn't match on 0b_01__01 or 0b_01_01_ but I'll try to simplify it.

syntax/go.vim Outdated
syn match goHexadecimalInt "\<-\=0[xX]_\?\(\x\+_\?\)\+\>"
syn match goHexadecimalError "\<-\=0[xX]_\?\(\x\+_\?\)*\(\([^ \t0-9A-Fa-f_]\|_\{2,\}\)\S*\|_\)\>"
syn match goOctalInt "\<-\=0[oO]\?_\?\(\o\+_\?\)\+\>"
syn match goOctalError "\<-\=0\([0-7oO_]*\([^ \t0-7oOxX\]\}\_]\+\|[oO]\{2,\}\|_\{2,\}\)[0-7oO_]*\)\+\S*\>"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this might end up regressing 69d5dad.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed it a bit already. But I cannot see any sequences that the previous regexp covered and the new does not. Of course, if not count those that it should not match (e.g. 0o1, 0_1 0123/5 , 0123+5and so on). Maybe I missed smth. Could you please elaborate.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

69d5dad handles list indexes as reported in the issue that commit fixed.

Copy link
Contributor Author

@k33nice k33nice Nov 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sigh. You're right, thanks! I've checked it on [0] and []int{0} but forgot about [0][0] and [][]int{{0},{0}}. I'll expand it.

According to language specification it is allowed to use underscore "_"
for readability in integer literals. Hence syntax matches for
goHexadecimalInt and goOctalInt must allow it. Also goOctalError regexp
mustn't match on accessing slice or array element by index.
The regular expression was changed that way to match all wrong sequences
of characters. To do not make a false matching on other language
structures the regexp should detect only falsy octal literals based on
the allowed symbols: [0-7], "o", "O", "_". The exclusion is "8" and "9"
because it might be very easy to make that kind of mistake.

The regexp breakdown is described below:

0                  |<-- "0" digit
\(                 |
    [0-7oO_]*      |<-- 0+ [0-7] or "o" or "O" or "_" -----+
    \(             |                                       |
        [89]\+     |<-- 1+ "8" or "9" digit ----+          |
            \|     |                            |   +---+  |   +---+
        [oO]\{2,\} |<-- 2+ "o" or "O" character |<--| 1 |  |<--| 2 |
            \|     |                            |   +---+  |   +---+
        _\{2,\}    |<-- 2+ "_" character -------+          |
    \)             |                                       |
    [0-7oO_]*      |<-- 0+ [0-7] or "o" or "O" or "_" -----+
\)\+               |
\S*                |<-- 0+ "non space" characters

1. One of the rules in this group should appear exaclty once.
2. All of the rules in this group should appears once or more.
*Changes in regexp*
- Added [A-Fa-f] to goOctalError to sequences like `07a` considered as
errors
- Added "_" to goHexadecimalError to be matched on wrong characters
after correct groups (e.g. `0x_123_123_1yy`)
- Added "_" to goBinaryInt and goBinaryError to correctly handle groups in
binary integers
*Changes*
- Add new goDecimalErrorInt hi
- Add support of "_" in goDecimalInt
- Fix goOctalError to support more wide patterns of errors
- Correct support of goups by "_" in goBinaryError
*Fix*
- goDecimalInt: add support for sequneces like `1_234567890`
- goBinaryInt: add support for sequneces like `0b_01`
*Changes*
- removed redundant characters from exclusion pattern
- fixed tests so that test include only one testing error
- added test case for explicitly test trailing "o" or "O"
@k33nice
Copy link
Contributor Author

k33nice commented Nov 15, 2019

@bhcleek

Why do the *DoublePrefix tests need to exist? The existing *ErrorLeading tests should cover that case fine.

Yup, *ErrorLeading should cover that. I just added them as a common typo and since we need to explicitly check it in goOctalError. I got rid of them though.

I really appreciate your advice! I know that the review is the hard work especially for something tedious as regexp. I just decided to help little cuz errors started to annoyed during my work. I'll try to do it better and not so dreary for reviewers next time. Again, thanks a lot for all of your work and generally for maintaining this project!

syntax/go.vim Outdated
syn match goOctalInt "\<-\=0[oO]\?_\?\(\o\+_\?\)\+\>"
syn match goOctalError "\<-\=0\([0-7oO_]*\([^ \t0-7oOxX\]\}\_]\+\|[oO]\{2,\}\|_\{2,\}\)[0-7oO_]*\)\+\S*\>"
syn match goBinaryInt "\<-\=0[bB]_\?\([01]\+_\?\)\+\>"
syn match goBinaryError "\<-\=0[bB]_\?\([01]\+_\?\)*\([^ \t01_]\S*\|_\{2,\}\S*\|_\)\>"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's syntactically in incorrect, but that's what goBinaryError is trying to catch. Can goBinaryError be simplified to \<-\=0[bB]\([01_]*[^ \t01_]\S*\|[01]*__\S*\)\>

\ 'octalErrorDoubleUnderscore': {'group': 'goOctalError', 'value': '0o__1234567'},
\ 'octalErrorDoubleTrailingUnderscore': {'group': 'goOctalError', 'value': '0o_1234567__'},
\ 'octalErrorTrailingUnderscore': {'group': 'goOctalError', 'value': '0o_123456_7_'},
\ 'octalErrorTrailingO': {'group': 'goOctalError', 'value': '0o_123456_7o'},
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What value does this add over octalErrorTrailing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For checking a specific subgroup that matches on "oO". We need this group cuz we allow 'oO' in this subgroup [^ \t0-7oOxX_/]. And just realized that we need to expand that specific subgroup to [oOxX] to match on 0123x.

syntax/go.vim Outdated
syn match goHexadecimalInt "\<-\=0[xX]_\?\(\x\+_\?\)\+\>"
syn match goHexadecimalError "\<-\=0[xX]_\?\(\x\+_\?\)*\(\([^ \t0-9A-Fa-f_]\|_\{2,\}\)\S*\|_\)\>"
syn match goOctalInt "\<-\=0[oO]\?_\?\(\o\+_\?\)\+\>"
syn match goOctalError "\<-\=0\([0-7oO_]*\([^ \t0-7oOxX\]\}\_]\+\|[oO]\{2,\}\|_\{2,\}\)[0-7oO_]*\)\+\S*\>"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

69d5dad handles list indexes as reported in the issue that commit fixed.

Copy link
Collaborator

@bhcleek bhcleek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are there tests for double trailing underscores? Shouldn't those be tested already by virtue of the *TrailingUnderscore tests?

I really appreciate your advice! I know that the review is the hard work especially for something tedious as regexp. I just decided to help little cuz errors started to annoyed during my work. I'll try to do it better and not so dreary for reviewers next time. Again, thanks a lot for all of your work and generally for maintaining this project!

I appreciate your help on this. Most open source work happens because someone gets started resolving things that bother or annoy them.

I'll admit that part of me thinks we should just drop the error groups for numerics entirely. They're complicated and arguably don't add too much value.

@bhcleek
Copy link
Collaborator

bhcleek commented Nov 16, 2019

Thanks for sticking with this. I know it's tedious to get through this review; you managed to pick up a piece of work for your first contribution that complex, high visibility, affects performance quite a bit, and that will be noticed by the community very quickly if it introduces false positives.

**Changes**
- Added "]" and "}" to whitelist subgroup. To doesn't match on [0][0]
and [][]int{{0},{0}}.
- Expanded subgroup for trailing [oO] to [oOxX].
- Added test case for trailing 'x'.
@k33nice
Copy link
Contributor Author

k33nice commented Nov 16, 2019

Why are there tests for double trailing underscores? Shouldn't those be tested already by virtue of the *TrailingUnderscore tests?

Agree. Them seems to be useless. Added them cuz was keeping in mind that case all the way.

Yup, can't disagree. That might be the best solution. I got the same thoughts when they started to falsy emerge during my working. Handling by regexp these grouped integers is way too tricky. Furthermore, AFAIK VSCode doesn't support them.

syntax/go.vim Show resolved Hide resolved
@bhcleek
Copy link
Collaborator

bhcleek commented Nov 16, 2019

Is that the last of the changes?

@k33nice
Copy link
Contributor Author

k33nice commented Nov 17, 2019

Is that the last of the changes?

Yup, I think that's it. Not sure that you saw all of my responses but it seems like all your reviews were address. One more thing that I've found out that we have the same goDecimalInt, goHexadecimalInt and others in syntax/gotexttmpl.vim. I'm not sure that we want to change them here as well. It might be better to do in separate PR. What do you think?

@bhcleek
Copy link
Collaborator

bhcleek commented Nov 17, 2019

Let's leave the gotexttmpl for another PR.

I'll merge this later today.

@bhcleek bhcleek merged commit 9a3808d into fatih:master Nov 18, 2019
bhcleek added a commit that referenced this pull request Nov 18, 2019
clrpackages pushed a commit to clearlinux-pkgs/vim-go that referenced this pull request Feb 1, 2020
v1.22 - (January 30, 2020)

BACKWARDS INCOMPATIBILITIES:
* Drop support for Vim 7.4. The minimum required version of Vim is now 8.0.1453.
  [[GH-2495]](fatih/vim-go#2495)
  [[GH-2497]](fatih/vim-go#2497)
* Drop support for `gometalinter`
  [[GH-2494]](fatih/vim-go#2494)

IMPROVEMENTS:
* Highlight the `go` keyword in go.mod files.
  [[GH-2473]](fatih/vim-go#2473)
* Use echo functions consistently.
  [[GH-2458]](fatih/vim-go#2458)
* Add support for managing goroutines in debugger.
  [[GH-2463]](fatih/vim-go#2463)
  [[GH-2527]](fatih/vim-go#2527)
* Document `g:go_doc_popup_window`.
  [[GH-2506]](fatih/vim-go#2506)
* Make `g:go_doc_popup_window=1` work for Neovim, too.
  [[GH-2451]](fatih/vim-go#2451)
  [[GH-2512]](fatih/vim-go#2512)
* Handle errors jumping to a definition in a file open in another Vim process
  better.
  [[GH-2518]](fatih/vim-go#2518)
* Improve the UX when the gopls binary is missing.
  [[GH-2522]](fatih/vim-go#2522)
* Use gopls instead of guru for `:GoSameIds`.
  [[GH-2519]](fatih/vim-go#2519)
* Use gopls instead of guru for `:GoReferrers`.
  [[GH-2535]](fatih/vim-go#2535)
* Update documentation for `g:go_addtags_transform`.
  [[GH-2541]](fatih/vim-go#2541)
* Install most helper tools in module aware mode.
  [[GH-2545]](fatih/vim-go#2545)
* Add a new option, `g:go_referrers_mode` to allow the user to choose whether
  to use gopls or guru for finding references.
  [[GH-2566]](fatih/vim-go#2566)
* Add options to control how gopls responds to completion requests.
  [[GH-2567]](fatih/vim-go#2567)
  [[GH-2568]](fatih/vim-go#2568)
* Add syntax highlighting for binary literals.
  [[GH-2557]](fatih/vim-go#2557)
* Improve highlighting of invalid numeric literals.
  [[GH-2571]](fatih/vim-go#2571)
  [[GH-2587]](fatih/vim-go#2587)
  [[GH-2589]](fatih/vim-go#2589)
  [[GH-2584]](fatih/vim-go#2584)
  [[GH-2597]](fatih/vim-go#2597)
  [[GH-2599]](fatih/vim-go#2599)
* Add highlighting of sections reported by gopls diagnostics' errors
  and warnings.
  [[GH-2569]](fatih/vim-go#2569)
  [[GH-2643]](fatih/vim-go#2643)
* Make the highlighting of fzf decls configurable.
  [[GH-2572]](fatih/vim-go#2572)
  [[GH-2579]](fatih/vim-go#2579)
* Support renaming with gopls.
  [[GH-2577]](fatih/vim-go#2577)
  [[GH-2618]](fatih/vim-go#2618)
* Add an option, `g:go_gopls_enabled`, to allow gopls integration to be
  disabled.
  [[GH-2605]](fatih/vim-go#2605)
  [[GH-2609]](fatih/vim-go#2609)
  [[GH-2638]](fatih/vim-go#2638)
  [[GH-2640]](fatih/vim-go#2640)
* Add a buffer level option, `b:go_fmt_options`, to control formatting options
  per buffer.
  [[GH-2613]](fatih/vim-go#2613)
* Use build tags when running `:GoVet`.
  [[GH-2615]](fatih/vim-go#2615)
* Add new snippets for UltiSnips.
  [[GH-2623]](fatih/vim-go#2623)
  [[GH-2627]](fatih/vim-go#2627)
* Expand completions as snippets when `g:go_gopls_use_placeholders` is set.
  [[GH-2624]](fatih/vim-go#2624)
* Add a new function, `:GoDiagnostics` and an associated mapping for seeing
  `gopls` diagnostics. Because of the performance implications on large
  projects, `g:go_diagnostics_enabled` controls whether all diagnostics are
  processed or only the diagnostics for the current buffer.
  [[GH-2612]](fatih/vim-go#2612)
* Explain how to find and detect multiple copies of vim-go in the FAQ.
  [[GH-2632]](fatih/vim-go#2632)
* Update the issue template to ask for the gopls version and
  `:GoReportGitHubIssue` to provide it.
  [[GH-2630]](fatih/vim-go#2630)
* Use text properties when possible for some highlighting cases.
  [[GH-2652]](fatih/vim-go#2652)
  [[GH-2662]](fatih/vim-go#2662)
  [[GH-2663]](fatih/vim-go#2663)
  [[GH-2672]](fatih/vim-go#2672)
  [[GH-2678]](fatih/vim-go#2678)

BUG FIXES:
* Fix removal of missing directories from gopls workspaces.
  [[GH-2507]](fatih/vim-go#2507)
* Change to original window before trying to change directories when term job
  ends.
  [[GH-2508]](fatih/vim-go#2508)
* Swallow errors when the hover info cannot be determined.
  [[GH-2515]](fatih/vim-go#2515)
* Fix errors when trying to debug lsp and hover.
  [[GH-2516]](fatih/vim-go#2516)
* Reset environment variables on Vim <= 8.0.1831 .
  [[GH-2523]](fatih/vim-go#2523)
* Handle empty results from delve.
  [[GH-2526]](fatih/vim-go#2526)
* Do not overwrite `updatetime` when `g:go_auto_sameids` or
  `g:go_auto_type_info` is set.
  [[GH-2529]](fatih/vim-go#2529)
* Fix example for `g:go_debug_log_output` in docs.
  [[GH-2547]](fatih/vim-go#2547)
* Use FileChangedShellPost instead of FileChangedShell so that reload messages
  are not hidden.
  [[GH-2549]](fatih/vim-go#2549)
* Restore cwd after `:GoTest` when `g:go_term_enabled` is set.
  [[GH-2556]](fatih/vim-go#2556)
* Expand struct variable correctly in the variables debug window.
  [[GH-2574]](fatih/vim-go#2574)
* Show output from errcheck when there are failures other than problems it can
  report.
  [[GH-2667]](fatih/vim-go#2667)

Signed-off-by: Patrick McCarty <patrick.mccarty@intel.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants