-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
evalEngine: Implement string INSERT
#15201
Conversation
Signed-off-by: Noble Mittal <noblemittal@outlook.com>
Signed-off-by: Noble Mittal <noblemittal@outlook.com>
Signed-off-by: Noble Mittal <noblemittal@outlook.com>
Signed-off-by: Noble Mittal <noblemittal@outlook.com>
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
"5", | ||
"0", | ||
"2", | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this list should also include a bunch of non ideal cases. Like NULL
(see the earlier comment), decimals, floats etc. I think for example inputBitwise
is a good existing list with mostly numerical values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way to think about this is "how would I break this" by throwing explicitly bad / unexpected data at the function to see how it behaves then to make sure it's consistent with MySQL (and doesn't panic or error unexpectedly etc).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks, I forgot to add overflow checks.
go/vt/vtgate/evalengine/fn_string.go
Outdated
switch { | ||
case str.isTextual(): | ||
default: | ||
c.asm.Convert_xce(4, str.Type, call.collate) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this convert to the collation given by call.collate
or by the other argument? Or should we merge collations with the appropriate logic? See also #15195 for how collations are handled there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we can merge the collations here, as MySQL returns the collations of the str
, not newstr
if it is textual.
This can be observed by running these 2 queries: SELECT COLLATION(INSERT(_latin1 0xFF, 1, 3, "asd"));
and SELECT COLLATION(INSERT("asd", 2, 3, _latin1 0xFF));
.
But, I think there was a mistake here, as it should be c.asm.Convert_xce(4, sqltypes.VarChar, c.collation)
We have handled a similar case in LPAD
/RPAD
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the pad functions, we convert to the collation of str
, and looking at how this works, we should do the same here?
So should it be c.asm.Convert_xce(4, sqltypes.VarChar, col.collation)
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah wait, that's already the case, but that's missing checks to not emit that conversion if it's not needed.
Signed-off-by: Noble Mittal <noblemittal@outlook.com>
|
||
front := charset.Slice(cs, str.bytes, 0, pos) | ||
var back []byte | ||
if pos <= math.MaxInt-l && pos+l < strLen { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pos <= math.MaxInt-l
can be used here to check overflow, as both pos and l are positive (checks added above).
go/vt/vtgate/evalengine/fn_string.go
Outdated
col = typedCoercionCollation(sqltypes.VarChar, c.collation) | ||
} | ||
|
||
c.asm.Convert_xce(1, sqltypes.VarChar, col.Collation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this conversion should be conditional? This is only needed if it's not already a string?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
c.asm.Fn_INSERT(col) | ||
c.asm.jumpDestination(skip) | ||
|
||
return ctype{Type: sqltypes.VarChar, Col: col, Flag: flagNullable}, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there cases where this function can return NULL
if all inputs are not NULL
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if the resultant string size is greater than max_allowed_packet, I think I forgot to add the validation for that. Pushed now. 😅
Signed-off-by: Noble Mittal <noblemittal@outlook.com>
@beingnoble03 So there's a failure in the Otherwise I can also push up that change if you want. |
Signed-off-by: Noble Mittal <noblemittal@outlook.com>
@dbussink done. |
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## main #15201 +/- ##
==========================================
+ Coverage 67.28% 67.33% +0.05%
==========================================
Files 1560 1560
Lines 192264 192395 +131
==========================================
+ Hits 129366 129556 +190
+ Misses 62898 62839 -59 ☔ View full report in Codecov by Sentry. |
"0123", | ||
"0xAACC", | ||
"3.1415926", | ||
// MySQL has broken behavior for these inputs, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can reference mysql/mysql-server#517 here so we know what this is about in the future too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done.
Signed-off-by: Noble Mittal <noblemittal@outlook.com>
|
||
copy(res[:len(front)], front) | ||
copy(res[len(front):], newstr.bytes) | ||
copy(res[len(front)+len(newstr.bytes):], back) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vmg Do you think it's worth extending charset.Slice
to allow passing in an existing output []byte
or is that a bit premature just for this function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would make for a more awkward API, don't you think? Right now Slice
doesn't allocate (it returns a sub-slice of the input), so if we add an out
parameter, we'd have to make it always allocate or have two code paths in the function for when out
is nil
.
Furthermore, if the Slice
API had an out
parameter, it would allocate more than once when used for INSERT
, since we wouldn't know the required capacity of the out
buffer, so it'd have to grow it at least twice.
With the current non-allocating Slice
, we just call the function twice and then we get to compose the final result outside of the function with a single allocation, since we know the final size. I quite dig the version of the code as it is written.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yeah, good point. It's already non-allocating so it doesn't really help in any way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lookin' great as usual!
|
||
copy(res[:len(front)], front) | ||
copy(res[len(front):], newstr.bytes) | ||
copy(res[len(front)+len(newstr.bytes):], back) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would make for a more awkward API, don't you think? Right now Slice
doesn't allocate (it returns a sub-slice of the input), so if we add an out
parameter, we'd have to make it always allocate or have two code paths in the function for when out
is nil
.
Furthermore, if the Slice
API had an out
parameter, it would allocate more than once when used for INSERT
, since we wouldn't know the required capacity of the out
buffer, so it'd have to grow it at least twice.
With the current non-allocating Slice
, we just call the function twice and then we get to compose the final result outside of the function with a single allocation, since we know the final size. I quite dig the version of the code as it is written.
Description
This PR adds implementation of
INSERT
func in evalengine.Related Issue(s)
Fixes part of #9647
Checklist
Deployment Notes