Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix offset check for insert position overflow #517

Closed

Conversation

dbussink
Copy link

@dbussink dbussink commented Feb 9, 2024

When multibyte characters are used, the behavior for the insert function is incorrect. It inserts a character even if the given position is beyond the input string. See the following example:

mysql> select insert('Å', 2, 1, 'a');
+--------------------------+
| insert('Å', 2, 1, 'a')   |
+--------------------------+
| Åa                       |
+--------------------------+
1 row in set (0.00 sec)

According to the documentation and if compared to the behavior for single byte characters, this should not have inserted the lowercase a.

The reason for this is that the code first steps for byte lengths with the following check:

  if ((start < 1) || (start > orig_len))
    return res;  // Wrong param; skip insert

In the case of a multibyte character, the start position won't be beyond the length since the byte length of the input string here is 3 bytes. This checks is a shortcut though and after this the correct offset is computed with charpos.

One thing overlooked here though is that the following changes the meaning of the start value:

  --start;  // Internal start from '0'

It's no longer a value starting with 1, but now with 0. This means that we can't do the same start > orig_len because now start will be one less.

Hence the check here after the multibyte aware position is calculated, should use start >= orig_len instead of start > orig_len.

A test for this bug is also added.

When multibyte characters are used, the behavior for the insert function
is incorrect. It inserts a character even if the given position is
beyond the input string. See the following example:

mysql> select insert('Å', 2, 1, 'a');
+--------------------------+
| insert('Å', 2, 1, 'a')   |
+--------------------------+
| Åa                       |
+--------------------------+
1 row in set (0.00 sec)

According to the documentation and if compared to the behavior for
single byte characters, this should not have inserted the lowercase 'a'.

The reason for this is that the code first steps for byte lengths with
the following check:

  if ((start < 1) || (start > orig_len))
    return res;  // Wrong param; skip insert

In the case of a multibyte character, the start position won't be beyond
the length since the byte length of the input string here is 3 bytes.
This checks is a shortcut though and after this the correct offset is
computed with charpos.

One thing overlooked here though is that the following changes the
meaning of the start value:

  --start;  // Internal start from '0'

It's no longer a value starting with 1, but now with 0. This means that
we can't do the same start > orig_len because now start will be one
less.

Hence the check here after the multibyte aware position is calculated,
should use start >= orig_len instead of start > orig_len.

A test for this bug is also added.
@mysql-oca-bot
Copy link

Hi, thank you for your contribution. Please confirm this code is submitted under the terms of the OCA (Oracle's Contribution Agreement) you have previously signed by cutting and pasting the following text as a comment:
"I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it."
Thanks

@dbussink
Copy link
Author

I confirm the code being submitted is offered under the terms of the OCA, and that I am authorized to contribute it.

@mysql-oca-bot
Copy link

Hi, thank you for your contribution. Your code has been assigned to an internal queue. Please follow
bug http://bugs.mysql.com/bug.php?id=113962 for updates.
Thanks

@dbussink dbussink deleted the dbussink/fix-insert-func-bug branch May 21, 2024 11:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants