Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix handling of comments in import groups #11

Merged
merged 1 commit into from
May 24, 2020

Conversation

nick-jones
Copy link
Contributor

My apologies, I only noticed #10 introduced a slight regression after it was merged. Hopefully the commit message explains this well enough:

I noticed #10 introduced a regression in that a comments sitting on
a separate line are now ignored - leading to the `importInfo` "value"
being set to an empty string. This trips things up further along as
empty string implies a separate group. I've tried to address this
by retaining "value" as the raw value (renamed to `lineValue` for
clarity), and adding a new field to `importInfo` which contains the
import path, if valid for the given line.

I noticed pavius#10 introduced a regression in that a comments sitting on
a separate line are now ignored - leading to the `importInfo` "value"
being set to an empty string. This trips things up further along as
empty string implies a separate group. I've tried to address this
by retaining "value" as the raw value (renamed to `lineValue` for
clarity), and adding a new field to `importInfo` which contains the
import path, if valid for the given line.
@pavius pavius merged commit 3cdbde3 into pavius:master May 24, 2020
nick-jones added a commit to utilitywarehouse/impi that referenced this pull request May 24, 2020
I mentioned in pavius#10 that there are still potential gotchas by extracting
import info line-by-line. This attempts to address that along switching
to an even more reliable way of reading import information.

Given we're already parsing the the file for import line numbers, I
explored the idea of leaning on this information even more. As it turns
out this was viable - both comments and import path strings can be
obtained from the parsing output. This PR switches to using information
extacted solely via the parser, instead of the existing approach of
line-by-line evaluation with some information supplemented by the
parser.

Switching to the parser opened up some different possibilities. Most
notably it is now possible to deal with comments in a slightly better
way; instead of trying to pay attention but also ignore these at the
same time, the comment is handled in the same way the parser deals
with them; it is associated with the ajoining import line. This also
addresses one of the gotchas mentioned before: multi-line comments.
Assuming the multi-line comment is immediately ajoining, this is now
treated no different to single-line comments. Whilst visually these
comments may make the imports appear to be grouped separately,
semantically there is no difference. So IMO they should be treated the
same. In switching to this one other potential gotcha came to mind -
should comments be permitted that are standalone (i.e. unattached from
any import). Historically this tool has been sort of allowing for that
(certain bugs may have existed priot to switching to go/scanner). This
change allows for it, and I've covered that in the test case targetting
comment peculiarities.

Beyond that, I addressed one gotcha that wasn't discussed before: it is
entirely legal to have multiple import declarations. This is one point
where this tool likely has to become opinionated - should mutliple be
permitted? Well, IMO, no (with the exception of the CGO `import "C"`
declaration). The existing implementation wouldn't take issue with this,
and when parsing line by line it would end up evaluating lines outside
of declarations (I think this is limited to comments, LPAREN and RPAREN
- which likely wouldn't trip things up after the switch to go/scanner).
The verifier now returns an error if there are multiple declarations
(`import "C"` excluded).

Finally, this addresses one final issue that pavius#11 didn't cover - an
improperly formatted file could have unnecessary whitespace between
groups. Trimming the value of whitespace could have addressed this,
but we get a fix for free from this; the whitespace becomes irrelevant.
nick-jones added a commit to utilitywarehouse/impi that referenced this pull request May 24, 2020
I mentioned in pavius#10 that there are still potential gotchas by extracting
import info line-by-line. This attempts to address that along switching
to an even more reliable way of reading import information.

Given we're already parsing the the file for import line numbers, I
explored the idea of leaning on this information even more. As it turns
out this was viable - both comments and import path strings can be
obtained from the parsing output. This PR switches to using information
extacted solely via the parser, instead of the existing approach of
line-by-line evaluation with some information supplemented by the
parser.

Switching to the parser opened up some different possibilities. Most
notably it is now possible to deal with comments in a slightly better
way; instead of trying to pay attention but also ignore these at the
same time, the comment is handled in the same way the parser deals
with them; it is associated with the ajoining import line. This also
addresses one of the gotchas mentioned before: multi-line comments.
Assuming the multi-line comment is immediately ajoining, this is now
treated no different to single-line comments. Whilst visually these
comments may make the imports appear to be grouped separately,
semantically there is no difference. So IMO they should be treated the
same. In switching to this one other potential gotcha came to mind -
should comments be permitted that are standalone (i.e. unattached from
any import). Historically this tool has been sort of allowing for that
(certain bugs may have existed priot to switching to go/scanner). This
change allows for it, and I've covered that in the test case targetting
comment peculiarities.

Beyond that, I addressed one gotcha that wasn't discussed before: it is
entirely legal to have multiple import declarations. This is one point
where this tool likely has to become opinionated - should mutliple be
permitted? Well, IMO, no (with the exception of the CGO `import "C"`
declaration). The existing implementation wouldn't take issue with this,
and when parsing line by line it would end up evaluating lines outside
of declarations (I think this is limited to comments, LPAREN and RPAREN
- which likely wouldn't trip things up after the switch to go/scanner).
The verifier now returns an error if there are multiple declarations
(`import "C"` excluded).

Finally, this addresses one final issue that pavius#11 didn't cover - an
improperly formatted file could have unnecessary whitespace between
groups. Trimming the value of whitespace could have addressed this,
but we get a fix for free from this; the whitespace becomes irrelevant.
nick-jones added a commit to utilitywarehouse/impi that referenced this pull request May 24, 2020
I mentioned in pavius#10 that there are still potential gotchas by extracting
import info line-by-line. This attempts to address that along switching
to an even more reliable way of reading import information.

Given we're already parsing the the file for import line numbers, I
explored the idea of leaning on this information even more. As it turns
out this was viable - both comments and import path strings can be
obtained from the parsing output. This PR switches to using information
extracted solely via the parser, instead of the existing approach of
line-by-line evaluation with some information supplemented by the
parser.

Switching to the parser opened up some different possibilities. Most
notably it is now possible to deal with comments in a slightly better
way; instead of trying to pay attention but also ignore these at the
same time, the comment is handled in the same way the parser deals
with them; it is associated with the ajoining import line. This also
addresses one of the gotchas mentioned before: multi-line comments.
Assuming the multi-line comment is immediately ajoining, this is now
treated no different to single-line comments. Whilst visually these
comments may make the imports appear to be grouped separately,
semantically there is no difference. So IMO they should be treated the
same. In switching to this one other potential gotcha came to mind -
should comments be permitted that are standalone (i.e. unattached from
any import). Historically this tool has been sort of allowing for that
(certain bugs may have existed prior to switching to go/scanner). This
change allows for it, and I've covered that in the test case targeting
comment peculiarities.

Beyond that, I addressed one gotcha that wasn't discussed before: it is
entirely legal to have multiple import declarations. This is one point
where this tool likely has to become opinionated - should multiple be
permitted? Well, IMO, no (with the exception of the CGO `import "C"`
declaration). The existing implementation wouldn't take issue with this,
and when parsing line by line it would end up evaluating lines outside
of declarations (I think this is limited to comments, LPAREN and RPAREN
- which likely wouldn't trip things up after the switch to go/scanner).
The verifier now returns an error if there are multiple declarations
(`import "C"` excluded).

Finally, this addresses one final issue that pavius#11 didn't cover - an
improperly formatted file could have unnecessary whitespace between
groups. Trimming the value of whitespace could have addressed this,
but we get a fix for free from this; the whitespace becomes irrelevant.
nick-jones added a commit to utilitywarehouse/impi that referenced this pull request May 24, 2020
I mentioned in pavius#10 that there are still potential gotchas by extracting
import info line-by-line. This attempts to address that along switching
to an even more reliable way of reading import information.

Given we're already parsing the the file for import line numbers, I
explored the idea of leaning on this information even more. As it turns
out this was viable - both comments and import path strings can be
obtained from the parsing output. This PR switches to using information
extracted solely via the parser, instead of the existing approach of
line-by-line evaluation with some information supplemented by the
parser.

Switching to the parser opened up some different possibilities. Most
notably it is now possible to deal with comments in a slightly better
way; instead of trying to pay attention but also ignore these at the
same time, the comment is handled in the same way the parser deals
with them; it is associated with the adjoining import line. This also
addresses one of the gotchas mentioned before: multi-line comments.
Assuming the multi-line comment is immediately adjoining, this is now
treated no different to single-line comments. Whilst visually these
comments may make the imports appear to be grouped separately,
semantically there is no difference. So IMO they should be treated the
same. In switching to this one other potential gotcha came to mind -
should comments be permitted that are standalone (i.e. unattached from
any import). Historically this tool has been sort of allowing for that
(certain bugs may have existed prior to switching to go/scanner). This
change allows for it, and I've covered that in the test case targeting
comment peculiarities.

Beyond that, I addressed one gotcha that wasn't discussed before: it is
entirely legal to have multiple import declarations. This is one point
where this tool likely has to become opinionated - should multiple be
permitted? Well, IMO, no (with the exception of the CGO `import "C"`
declaration). The existing implementation wouldn't take issue with this,
and when parsing line by line it would end up evaluating lines outside
of declarations (I think this is limited to comments, LPAREN and RPAREN
- which likely wouldn't trip things up after the switch to go/scanner).
The verifier now returns an error if there are multiple declarations
(`import "C"` excluded).

Finally, this addresses one final issue that pavius#11 didn't cover - an
improperly formatted file could have unnecessary whitespace between
groups. Trimming the value of whitespace could have addressed this,
but we get a fix for free from this; the whitespace becomes irrelevant.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants