Fix offset check for insert position overflow #517
Closed
When multibyte characters are used, the behavior for the insert function is incorrect. It inserts a character even if the given position is beyond the input string. See the following example:
According to the documentation, and consistent with the behavior for single-byte characters, this should not have inserted the lowercase `a`.

The reason is that the code first tests the position against the byte length with the following check:
In the case of a multibyte character, the start position does not exceed the length, because the byte length of the input string here is 3 bytes. This check is only a shortcut, though; afterwards the correct offset is computed with `charpos`.
One thing overlooked here though is that the following changes the meaning of the start value:
It is no longer a value starting at 1, but at 0. This means that we can't apply the same `start > orig_len` check, because `start` is now one less.

Hence the check after the multibyte-aware position is calculated should use `start >= orig_len` instead of `start > orig_len`.

A test for this bug is also added.
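
To illustrate the off-by-one, here is a toy Python model of the two checks. It is a sketch under assumptions and not the project's code: `charpos` and `insert_model` are hypothetical stand-ins, UTF-8 is assumed as the encoding, and `orig_len` is assumed to be the byte length of the input.

```python
def charpos(s: str, index: int) -> int:
    """Byte offset of the 0-based character `index` in UTF-8 (toy stand-in)."""
    chars = len(s)
    if index > chars:
        # Past the end: report an offset beyond the byte length (assumption).
        return len(s.encode("utf-8")) + (index - chars)
    return len(s[:index].encode("utf-8"))


def insert_model(s: str, pos: int, length: int, new: str, fixed: bool) -> str:
    """Toy 1-based insert: replace `length` characters at `pos` with `new`."""
    raw = s.encode("utf-8")
    orig_len = len(raw)  # byte length of the input

    # Shortcut check on the byte length while pos is still 1-based.
    if pos < 1 or pos > orig_len:
        return s

    # From here on, start is a 0-based byte offset.
    start = charpos(s, pos - 1)

    # The old check used '>', the corrected check uses '>='.
    out_of_range = start >= orig_len if fixed else start > orig_len
    if out_of_range:
        return s

    # Negative lengths are clamped to zero in this sketch.
    end = min(charpos(s, pos - 1 + max(length, 0)), orig_len)
    return (raw[:start] + new.encode("utf-8") + raw[end:]).decode("utf-8")
```

With a single 3-byte character as input and position 2, the old check lets the insert through, because the 0-based offset 3 is not greater than the byte length 3. The `>=` check rejects it, which matches what the single-byte case already does through the shortcut.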