expression: check max_allowed_packet constraint for function insert #7502
Kenan Yao seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
expression/builtin_string.go
Outdated
		if uint64(pos-1+int64(len(newstr)))*uint64(mysql.MaxBytesOfCharacter) > b.maxAllowedPacket {
			b.ctx.GetSessionVars().StmtCtx.AppendWarning(errWarnAllowedPacketOverflowed.GenByArgs("insert", b.maxAllowedPacket))
			return "", true, nil
		}
		return str[0:pos-1] + newstr, false, nil
	}
	return str[0:pos-1] + newstr + str[pos+length-1:], false, nil
Should we check whether len(str[0:pos-1] + newstr + str[pos+length-1:]) > b.maxAllowedPacket?
expression/builtin_string.go
Outdated
@@ -3216,18 +3225,24 @@ func (b *builtinInsertBinarySig) evalString(row chunk.Row) (string, bool, error)
	}

	if length > strLength-pos+1 || length < 0 {
		if uint64(pos-1+int64(len(newstr)))*uint64(mysql.MaxBytesOfCharacter) > b.maxAllowedPacket {
The function len() returns the number of bytes the string contains, so there is no need to multiply by mysql.MaxBytesOfCharacter.
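To illustrate the point about len(), here is a minimal standalone sketch (not code from this PR; the helper name byteAndRuneLen is made up for the example). It contrasts the byte length reported by len() with the character count reported by utf8.RuneCountInString:

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneLen is a hypothetical helper for illustration only.
// It returns the byte length and the character (rune) count of s.
func byteAndRuneLen(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	for _, s := range []string{"abc", "数据库"} {
		b, r := byteAndRuneLen(s)
		fmt.Printf("%q: %d bytes, %d runes\n", s, b, r)
	}
}
```

For a binary string such as the one handled by builtinInsertBinarySig, the byte length is already the quantity that max_allowed_packet limits, so no per-character multiplier is needed.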
to #7153
- compute the resulting bytes correctly;
- cover more cases of insert parameters;
LGTM
/run-all-tests
/run-integration-common-test
@winoros @lamxTyler PTAL
expression/builtin_string.go
Outdated
		length = runeLength - pos + 1
	}

	if uint64(runeLength-length)*uint64(mysql.MaxBytesOfCharacter)+uint64(len(newstr)) > b.maxAllowedPacket {
What if the number of bytes of some character is less than MaxBytesOfCharacter? Will this raise a warning when it should not?
Yes, this false-positive case exists. But to compute the exact bytes of these characters, we may need the charset info of the string and then compute the length of specific characters for this charset, which incurs too much overhead. I guess that is why we use MaxBytesOfCharacter in other functions such as builtinLpadSig::evalString. @zz-jason should we compute the exact bytes for the string?
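A small sketch of the case being discussed, under stated assumptions: maxBytesOfCharacter stands in for mysql.MaxBytesOfCharacter and is assumed to be 4, and both helper functions are hypothetical, written only to compare the two estimates.

```go
package main

import "fmt"

// maxBytesOfCharacter is an assumed stand-in for mysql.MaxBytesOfCharacter.
const maxBytesOfCharacter = 4

// upperBoundBytes mirrors the estimate in the diff under review:
// kept characters times the per-character maximum, plus the bytes of newstr.
func upperBoundBytes(keptRunes int, newstr string) uint64 {
	return uint64(keptRunes)*maxBytesOfCharacter + uint64(len(newstr))
}

// exactBytes sums the real byte lengths of the kept prefix, newstr and suffix.
func exactBytes(prefix, newstr, suffix string) uint64 {
	return uint64(len(prefix) + len(newstr) + len(suffix))
}

func main() {
	// INSERT('abcdef', 2, 2, 'xy') keeps "a" and "def" (4 ASCII characters).
	const maxAllowedPacket = 10
	ub := upperBoundBytes(4, "xy")
	ex := exactBytes("a", "xy", "def")
	fmt.Println("upper bound:", ub, "rejected:", ub > maxAllowedPacket)
	fmt.Println("exact:", ex, "rejected:", ex > maxAllowedPacket)
}
```

With single-byte characters the upper bound overshoots by a factor of up to four, so a result that fits in the packet can still be rejected.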
Maybe we can build the string first and use len() to compute the consumed bytes, but is it possible that a panic is raised while building the string because it is too large? I think the purpose of the max_allowed_packet check here is to prevent this kind of panic to some extent.
In MySQL, the byte count for lpad is calculated by:
byte_count = count * collation.collation->mbmaxlen;
As for the insert function, it seems MySQL calculates the exact bytes for the string:
/*
There is one exception not handled (intentionaly) by the character set
aggregation code. If one string is strong side and is binary, and
another one is weak side and is a multi-byte character string,
then we need to operate on the second string in terms on bytes when
calling ::numchars() and ::charpos(), rather than in terms of characters.
Lets substitute its character set to binary.
*/
if (collation.collation == &my_charset_bin) {
res->set_charset(&my_charset_bin);
res2->set_charset(&my_charset_bin);
}
/* start and length are now sufficiently valid to pass to charpos function */
start = res->charpos((int)start);
length = res->charpos((int)length, (uint32)start);
Why build it first? It consists of just three parts, so summing their lengths would be enough.
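A minimal sketch of that suggestion for the byte-oriented case, assuming the caller has already validated pos and clamped length (the function name insertResultLen is hypothetical and not part of the PR):

```go
package main

import "fmt"

// insertResultLen returns the byte length of
// str[:pos-1] + newstr + str[pos+length-1:] without building the string.
// Assumptions: pos is 1-based with 1 <= pos <= len(str), and length has
// already been clamped to the range [0, len(str)-pos+1].
func insertResultLen(str string, pos, length int64, newstr string) uint64 {
	prefix := pos - 1                              // bytes kept before pos
	suffix := int64(len(str)) - (pos + length - 1) // bytes kept after the replaced span
	return uint64(prefix) + uint64(len(newstr)) + uint64(suffix)
}

func main() {
	// "a" + "XY" + "ef" has 5 bytes.
	fmt.Println(insertResultLen("abcdef", 2, 3, "XY"))
}
```

Because only lengths are added, no large intermediate string is allocated before the max_allowed_packet comparison.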
It seems there isn't a convenient and efficient way to calculate the byte count of a []rune. @tiancaiamao any idea?
Another method is to write a UTF-8 code point iterator like this: https://gist.github.com/zz-jason/078110974bb931b7f8e3432775ecfd05. We can iterate over the original []byte, find the code points located at pos and pos+length, and then count the bytes for each []rune prefix.
oh you are right... updated
/run-all-tests
/rebuild
/run-all-tests
/run-all-tests
tests green, @zz-jason, @lamxTyler PTAL, thx
LGTM
LGTM
What problem does this PR solve?
Return NULL and a warning when the result of function insert exceeds max_allowed_packet.
Before this PR, SELECT INSERT('abc', 1, -1, 'abcd'); would return the result 'abcd' even if @@global.max_allowed_packet is set to 3.
After this PR:
What is changed and how it works?
Check the result of the evalString method of builtinInsertSig and builtinInsertBinarySig against max_allowed_packet.
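The intended behavior can be modeled in a few lines. This is a simplified, byte-oriented sketch and not the PR's implementation: insertBytes and its signature are hypothetical, and the real code also appends a warning to the statement context when it returns NULL.

```go
package main

import "fmt"

// insertBytes is a hypothetical byte-oriented model of
// INSERT(str, pos, length, newstr). isNull is true when the result
// would exceed maxAllowedPacket.
func insertBytes(str string, pos, length int64, newstr string, maxAllowedPacket uint64) (res string, isNull bool) {
	strLen := int64(len(str))
	if pos < 1 || pos > strLen {
		// MySQL returns the original string when pos is out of range.
		return str, false
	}
	if length < 0 || length > strLen-pos+1 {
		// Replace everything from pos to the end of the string.
		length = strLen - pos + 1
	}
	// Size check happens before any concatenation.
	size := uint64(strLen-length) + uint64(len(newstr))
	if size > maxAllowedPacket {
		return "", true
	}
	return str[:pos-1] + newstr + str[pos+length-1:], false
}

func main() {
	// SELECT INSERT('abc', 1, -1, 'abcd') with max_allowed_packet = 3.
	res, isNull := insertBytes("abc", 1, -1, "abcd", 3)
	fmt.Printf("result=%q isNull=%v\n", res, isNull)
}
```

With the limit set to 3, the 4-byte result is rejected and the call reports NULL; with a larger limit the same call yields 'abcd'.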
Check List
Tests
Related changes