Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parser: weird formatting in a trailing comment causes all comments for next element to be ignored #10648

Closed
jhump opened this issue Sep 23, 2022 · 3 comments · Fixed by #10660
Labels

Comments

@jhump
Copy link
Contributor

jhump commented Sep 23, 2022

Here's an example source:

message Foo {
  optional string s = 1; /* is this a trailer
                            comment? */ // is this detached?

  // this is clearly a detached comment
  
  // and this is clearly a leading comment
  optional int32 i = 2;
}

Because the trailer comment above does not have a newline between it and the subsequent "is this detached?" comment, the parser gets confused and gives up on all of the rest of the comments from there to the next element.

So the above source code results in no comments in source code info in the resulting descriptor.

The issue is that it gets confused here, thinking that the next token is on the same line (even though the next "token" is actually another comment).

If it were actually a non-comment token, the behavior makes sense: it is not clear whether the comment should be attributed as a trailing comment for s or a leading comment for the next element, so it gets dropped.

But since the next thing is actually another comment, I'd suggest that the correct interpretation is:

  • s trailing comment "is this a trailer\ncomment?"
  • i detached comments
    • "is this detached?"
    • "this is clearly a detached comment"
  • i leading comment "and this is clearly a leading comment"

I could see an argument that the first two comments are weird and might be dropped, instead resulting in the following:

  • s has no trailing comment
  • i detached comment "this is clearly a detached comment"
  • i leading comment "and this is clearly a leading comment"

But I think the current behavior, where it fails to attribute any detached or leading comments to i is clearly wrong.

FWIW, I think the first interpretation above is the easier one to implement (and seems useful to preserve more comments).

@jhump jhump added the untriaged auto added to all issues by default when created. label Sep 23, 2022
@fowles fowles added protoc release notes: no and removed untriaged auto added to all issues by default when created. labels Sep 23, 2022
@fowles
Copy link
Contributor

fowles commented Sep 23, 2022

I agree with your assessment and we would be happy to accept a PR that implements your first suggestion.

@jhump
Copy link
Contributor Author

jhump commented Sep 27, 2022

@fowles, in making this change, I think it also makes sense -- for the sake of consistency -- to handle the following as annotated:

message Foo
  string s = 1; /* detached (because ambiguous) */ uint64 i = 2;
}

message Foo
  string s = 1; /* this multi-line
                  comment is actually
                 ambiguous and thus
                will be detached */ uint64 i = 2;
}

message Bar {
  string s = 1; /* trailing */ /* leading */ uint64 i = 2;
}

message Baz {
  string s = 1; /* trailing */ /* detached */ /* leading */ uint64 i = 2;
}

message Baz {
  string s = 1; /* trailing */ /* detached #1 */ /* detached #2 */ /* leading */ uint64 i = 2;
}

WDYT? Preserving the current behavior (specifically the current intended behavior, not buggy behavior) would mean that all of these are discarded. So does it make more sense to classify the comments as above or to drop them?

@jhump
Copy link
Contributor Author

jhump commented Sep 27, 2022

FWIW, the PR I just pushed proceeds with the handling in the previous comment. But it's trivial to adapt to drop the less clear attributions. Just let me know what you think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants