Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle overflow in IntegerNumberToVarcharCoercer #14179

Conversation

findepi
Copy link
Member

@findepi findepi commented Sep 18, 2022

Before the change, IntegerNumberToVarcharCoercer could produce a Slice value that's not valid value for destination VarcharType.

Fixes #5015

Before the change, `IntegerNumberToVarcharCoercer` could produce a
`Slice` value that's not valid value for destination `VarcharType`.
Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #5015

I think we decided that the behaviour should match Hive (since this is coercion within connector not the engine) which ignores the bounds?

(However I also think that the Hive behaviour makes no sense and I prefer this implementation)

@findepi
Copy link
Member Author

findepi commented Sep 19, 2022

I think we decided that the behaviour should match Hive (since this is coercion within connector not the engine) which ignores the bounds?

It cannot ignore bounds. The code before change is obviously wrong: it can produce a value of length n+1 for a varchar(n) data type, which is obviously illegal.

Thus, @hashhar I am not removing any compatibility with Hive. I am replacing silent correctness problem with explicit check.

(However I also think that the Hive behaviour makes no sense and I prefer this implementation)

I actually didn't check what's the Hive behavior. And also prefer this version to alternatives (truncate, replace with NULL, etc).
Anyway, now the problem is explicit in the code and we can improve/change/align with Hive later as needed.

@findepi findepi requested a review from hashhar September 19, 2022 09:03
Copy link
Member

@hashhar hashhar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I misunderstood your goal here.

LGTM.

@hashhar
Copy link
Member

hashhar commented Sep 19, 2022

Hive behaviour is documented in #5015

In Hive connector a coercion like INTEGER 12345 to VARCHAR(1) produces 12345 instead of the expected 1 according to Hive 3.1 semantics.

Another point open to discussion is whether a coercion like INTEGER -12345 to VARCHAR(1) should produce - (matching Hive 3.1) or is it worth it for Trino to aim for more sane coercion behaviour.

Copy link
Member

@losipiuk losipiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testable?

@findepi
Copy link
Member Author

findepi commented Sep 19, 2022

testable?

yes, but not worth it IMO

if we want to implement actual hive logic from #5015 (comment), this should go with a test.

@findepi findepi merged commit c59633d into trinodb:master Sep 19, 2022
@findepi findepi deleted the findepi/handle-overflow-in-integernumbertovarcharcoercer-b2b07d branch September 19, 2022 09:15
@github-actions github-actions bot added this to the 397 milestone Sep 19, 2022
long value = fromType.getLong(block, position);
Slice converted = utf8Slice(String.valueOf(value));
if (!toType.isUnbounded() && countCodePoints(converted) > toType.getBoundedLength()) {
throw new TrinoException(INVALID_ARGUMENTS, format("Varchar representation of %s exceed %s bounds", value, toType));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exceed --> exceeds

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@colebow
Copy link
Member

colebow commented Sep 19, 2022

Do we want a release note for this?

@findepi findepi added the no-release-notes This pull request does not require release notes entry label Sep 19, 2022
@findepi
Copy link
Member Author

findepi commented Sep 19, 2022

Nope

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla-signed no-release-notes This pull request does not require release notes entry
Development

Successfully merging this pull request may close these issues.

IntegerNumberToVarchar coercer doesn't respect Varchar bounds
6 participants