Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix compiler error for lambda parameter with non-letter #5087

Merged
merged 4 commits into from
Sep 9, 2020

Conversation

findepi
Copy link
Member

@findepi findepi commented Sep 7, 2020

Fix a generated code error when lambda parameter contains a character
that is not part of a valid Java identifier.

Fixes #5085

@findepi findepi requested a review from martint September 7, 2020 08:28
@cla-bot cla-bot bot added the cla-signed label Sep 7, 2020
*/
private static String parameterName(int argumentPosition, String lambdaArgumentName)
{
return "lambda_" + argumentPosition + "_" + lambdaArgumentName.replaceAll("[^a-zA-Z_]", "");
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I realize "lambda_" + argumentPosition is enough from generated code perspective.
I retained argument name in case there is any need to debug the generated code.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Praveen2112 was there a similar escaping somewhere already? I think you did one, but I can't remember where was it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes !! It was in scalar function .. #2081

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, updated

@findepi findepi force-pushed the lamb branch 2 times, most recently from b5484f3 to 973f08c Compare September 7, 2020 12:47
@findepi findepi requested a review from sopel39 September 7, 2020 12:49
*/
private static String parameterName(int argumentPosition, String lambdaArgumentName)
{
return "lambda_" + argumentPosition + "_" + lambdaArgumentName.replaceAll("[^a-zA-Z_]", "");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Praveen2112 was there a similar escaping somewhere already? I think you did one, but I can't remember where was it.

@findepi findepi force-pushed the lamb branch 2 times, most recently from 4feb1eb to d9508af Compare September 8, 2020 12:13
/**
* Parameter name for the generated method.
*/
private static String parameterName(int argumentPosition, String lambdaArgumentName)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current approach in the PR makes sense, considering that the construction of the class definition shouldn't make any assumptions about the symbols, and do this sanitization before creating parameter names.

Another place to do this sanitization could be SymbolAllocator#newSymbol. As a side-effect, it may improve plan readability (a minor benefit though). For example, if we truncate the string while "sanitizing" the symbols for the issue mentioned in #5088, the explain plan would display the truncated symbol. Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another place to do this sanitization could be SymbolAllocator#newSymbol.

I thought about that. IMO the change here does not preclude change in SymbolAllocator.
And the change in SymbolAllocator does not make the change here obsolete:
compiler code should make as few assumptions about planner as possible.
It would feel wrong if planner had to consider Java class format constraints.

For #5088, as you pointed out, there are other benefits from keeping symbols short.
Plan readability, but also guard against any potential perf penalties
(at least in theory, a potential DOS attack vector).

Although the unqualified name in JVM apparently _can_ contain braces `(`
or `)`, the intention was to remove them, retaining only letters,
digits, underscore and dollar sign.
Fix a generated code error when lambda parameter contains a character
that is not part of a valid Java identifier.
@findepi
Copy link
Member Author

findepi commented Sep 9, 2020

CI failed -- #4568

@findepi
Copy link
Member Author

findepi commented Sep 9, 2020

CI failed -- #4973

@findepi findepi merged commit 8732f67 into trinodb:master Sep 9, 2020
@findepi findepi deleted the lamb branch September 9, 2020 09:06
@findepi findepi added this to the 342 milestone Sep 9, 2020
@findepi findepi mentioned this pull request Sep 9, 2020
9 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Query fails when lambda expression references column containing a dot: ClassFormatError: Illegal field name
4 participants