Replace HiveQlTranslation with a non-regex-based parser #4258

Merged (1 commit) on Jul 7, 2020

jirassimok
Member

The new parser does what the regular expression was supposed to do: it replaces the quotation marks and makes sure that any quotation marks inside a string are properly escaped. No other escape sequences are handled, so a Hive view that contains one will behave unexpectedly.
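To make the described behavior concrete, here is a minimal sketch of quote translation without a regular expression. It is not the code merged in this PR: the class name `QuoteTranslationSketch` and the method `translate` are hypothetical, and it assumes the target dialect wants single-quoted literals with embedded single quotes doubled. Like the description above, it does not interpret backslash escapes.

```java
// Hypothetical sketch, not the implementation from this PR.
final class QuoteTranslationSketch
{
    private QuoteTranslationSketch() {}

    // Rewrites every single- or double-quoted literal as a single-quoted one,
    // doubling any single quote found inside the literal.
    // Backslashes are copied through unchanged (no escape handling).
    static String translate(String hiveSql)
    {
        StringBuilder output = new StringBuilder();
        int i = 0;
        while (i < hiveSql.length()) {
            char c = hiveSql.charAt(i++);
            if (c != '"' && c != '\'') {
                output.append(c); // outside a literal: copy as-is
                continue;
            }
            char quote = c;
            boolean closed = false;
            output.append('\'');
            while (i < hiveSql.length()) {
                char d = hiveSql.charAt(i++);
                if (d == quote) {
                    closed = true;
                    break;
                }
                if (d == '\'') {
                    output.append("''"); // escape a contained single quote
                }
                else {
                    output.append(d);
                }
            }
            if (!closed) {
                throw new IllegalArgumentException("unterminated string literal");
            }
            output.append('\'');
        }
        return output.toString();
    }
}
```

For example, `QuoteTranslationSketch.translate("SELECT \"it's\"")` yields `SELECT 'it''s'`.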

Also, the error thrown when parsing fails needs to be replaced, but I'm not sure what fits, besides adding a new error to HiveErrorCode. Maybe HIVE_PARSE_ERROR?

This addresses #3266.

@cla-bot cla-bot bot added the cla-signed label Jun 29, 2020
@findepi findepi requested a review from alexjo2144 June 29, 2020 11:07
@jirassimok jirassimok force-pushed the fix-hive-view-translation branch from cc02495 to 9edd087 Compare June 29, 2020 12:35
@jirassimok jirassimok force-pushed the fix-hive-view-translation branch 4 times, most recently from 8b094e4 to 54faac1 Compare June 29, 2020 15:38
@jirassimok jirassimok force-pushed the fix-hive-view-translation branch from 54faac1 to 71762ad Compare June 29, 2020 16:59
if (!input.hasNext()) {
    break; // skip to end-of-input error
}
// Don't handle escape sequences, just drop the backslash.
Member

This seems wrong. We should fail if there are escape sequences if we're not going to handle them. Otherwise, we are returning a known invalid result.
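
A fail-fast variant of the loop could look roughly like the sketch below. The names `RejectEscapesSketch` and `readLiteral` are hypothetical, and the iterator type is an assumption chosen only because the reviewed snippet calls `input.hasNext()`; the PR may use a different type and a different exception.

```java
import java.util.PrimitiveIterator;

// Hypothetical sketch of rejecting escape sequences instead of dropping the backslash.
final class RejectEscapesSketch
{
    private RejectEscapesSketch() {}

    // Reads a literal whose opening quote has already been consumed and
    // returns it as a single-quoted literal. Any backslash is an error.
    static String readLiteral(PrimitiveIterator.OfInt input, char quote)
    {
        StringBuilder literal = new StringBuilder("'");
        while (input.hasNext()) {
            int c = input.nextInt();
            if (c == '\\') {
                // Fail rather than return a result known to be wrong
                throw new IllegalArgumentException("escape sequences in string literals are not supported");
            }
            if (c == quote) {
                return literal.append('\'').toString();
            }
            if (c == '\'') {
                literal.append("''");
            }
            else {
                literal.append((char) c);
            }
        }
        throw new IllegalArgumentException("unexpected end of input in string literal");
    }
}
```

The trade-off is that views containing any backslash in a string are rejected outright, which is stricter than before but never produces a silently altered literal.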

Member Author

@jirassimok jirassimok Jun 29, 2020

That's a good idea. I originally tried to figure out what escape sequences Hive supports, but eventually decided against it and ended up leaving the behavior the same as it was before.

I could pretty easily cover the common cases if we want. Alternatively, I suspect Hive uses ...hive.ql.parse.BaseSemanticAnalyzer.unescapeSQLString or something similar, so I could try using that: I'd search for the end of the string, pass the whole literal to that function, then escape only the single quotes in its output.
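
The three steps in that proposal (find the end of the string, unescape it, re-escape single quotes) can be sketched as follows. Everything here is illustrative: `UnescapeThenQuoteSketch`, `findLiteralEnd`, and `translateLiteral` are hypothetical names, and `toyUnescape` is a small stand-in that handles only a few sequences so the example runs without a Hive dependency. It assumes the unescaping function receives the literal including its enclosing quotes; whether Hive's function behaves that way should be checked against Hive itself.

```java
import java.util.function.UnaryOperator;

// Hypothetical sketch: locate the literal, delegate unescaping, then re-quote.
final class UnescapeThenQuoteSketch
{
    private UnescapeThenQuoteSketch() {}

    // Returns the index just past the closing quote of the literal starting at `start`.
    // A backslash skips the next character, so an escaped quote does not end the literal.
    static int findLiteralEnd(String sql, int start)
    {
        char quote = sql.charAt(start);
        for (int i = start + 1; i < sql.length(); i++) {
            char c = sql.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
            }
            else if (c == quote) {
                return i + 1;
            }
        }
        throw new IllegalArgumentException("unterminated string literal");
    }

    // `unescaper` stands in for whatever unescaping function is used (assumption).
    static String translateLiteral(String sql, int start, UnaryOperator<String> unescaper)
    {
        int end = findLiteralEnd(sql, start);
        String unescaped = unescaper.apply(sql.substring(start, end));
        // Only single quotes need escaping in the single-quoted output
        return "'" + unescaped.replace("'", "''") + "'";
    }

    // Toy stand-in: strips the enclosing quotes, maps \n to a newline,
    // and maps any other escaped character to itself.
    static String toyUnescape(String quoted)
    {
        String body = quoted.substring(1, quoted.length() - 1);
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            if (c == '\\' && i + 1 < body.length()) {
                char next = body.charAt(++i);
                out.append(next == 'n' ? '\n' : next);
            }
            else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Delegating to an existing unescaping function keeps the translator from having to enumerate the supported escape sequences itself, which is the appeal of the approach described above.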

Member Author

The first new commit rejects all escape sequences, and the second uses the function I mentioned above to translate all of them. I can squash them or drop the last one depending on what we want to do.

@jirassimok jirassimok force-pushed the fix-hive-view-translation branch from 71762ad to 74f6399 Compare June 30, 2020 13:19
@jirassimok jirassimok requested a review from electrum June 30, 2020 13:57
@jirassimok jirassimok force-pushed the fix-hive-view-translation branch from 4d1aa49 to 33841f2 Compare July 1, 2020 14:56
Member

@electrum electrum left a comment

Please squash the commits

@jirassimok jirassimok force-pushed the fix-hive-view-translation branch 2 times, most recently from 8d6dd51 to 6f99d54 Compare July 1, 2020 19:01
Squashed:
- Use Hive's string unescaping to translate escape sequences in strings
@jirassimok jirassimok force-pushed the fix-hive-view-translation branch from 6f99d54 to 4cb7387 Compare July 1, 2020 23:27
Member

findepi commented Jul 3, 2020

CI failed -- #3161

@electrum electrum merged commit 7a243d0 into trinodb:master Jul 7, 2020
Member

electrum commented Jul 7, 2020

Thanks!

@jirassimok jirassimok deleted the fix-hive-view-translation branch October 5, 2020 17:28