Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Full Width Chars Casing #45079

Merged
merged 6 commits into from
Nov 24, 2020
Merged

Conversation

tarekgh
Copy link
Member

@tarekgh tarekgh commented Nov 22, 2020

#44789

When using ICU and doing string compare operations on the Full Width Chars, we are customizing the behavior when using CompareOptions.IgnoreWidth to equalize the half width with the full width character. When not having CompareOptions.IgnoreWidth option but having CompareOptions.IgnoreCase we customize the range to make the normal width characters less than the full width characters. This customization broke the casing of the full width characters e.g. (\uFF41) == (\uFF21).

The change here is:

  • Fix the full width casing
  • Fix the ICU issue with the last entered customization collation rule which used to be ignored. This fixed the last character in the full width range we are customizing.
  • Did some refactoring/optimization. We used to allocate a native memory in 3 places and copy the customized rules contents from one of these buffer to another. Now, we allocate once and avoid any copying. This refactoring also is important as I expect we going to add more customization rule to address some other issues.

@ghost
Copy link

ghost commented Nov 22, 2020

Tagging subscribers to this area: @tarekgh, @safern, @krwq
See info in area-owners.md if you want to be subscribed.

Issue Details

#44789

When using ICU and doing string compare operations on the Full Width Chars, we are customizing the behavior when using CompareOptions.IgnoreWidth to equalize the half width with the full width character. When not having CompareOptions.IgnoreWidth option but having CompareOptions.IgnoreCase we customize the range to make the normal width characters less than the full width characters. This customization broke the casing of the full width characters e.g. (\uFF41) == (\uFF21).

The change here is:

  • Fix the full width casing
  • Fix the ICU issue with the last entered customization collation rule which used to be ignored. This fixed the last character in the full width range we are customizing.
  • Did some refactoring/optimization. We used to allocate a native memory in 3 places and copy the customized rules contents from one of these buffer to another. Now, we allocate once and avoid any copying. This refactoring also is important as I expect we going to add more customization rule to address some other issues.
Author: tarekgh
Assignees: -
Labels:

area-System.Globalization

Milestone: -

@tarekgh
Copy link
Member Author

tarekgh commented Nov 22, 2020

@safern @eerhardt could you please have a look?

CC @ericstj @danmosemsft

@tarekgh
Copy link
Member Author

tarekgh commented Nov 23, 2020

@eerhardt I have addressed your feedback. please let me know if you have any more comments.

@tarekgh tarekgh merged commit 62f55e1 into dotnet:master Nov 24, 2020
@tarekgh tarekgh deleted the FixFullWidthCharCasing branch November 24, 2020 21:51
@ghost ghost locked as resolved and limited conversation to collaborators Dec 24, 2020
@danmoseley
Copy link
Member

Does this need backport to 5.0?

@tarekgh
Copy link
Member Author

tarekgh commented Jan 4, 2021

@danmosemsft this is limited to small range of characters. So, the plan is to watch if we get more complaints about that and then consider it for servicing at that time especially the fix is not trivial and will be good to have it tested for awhile in 6.0.

@danmoseley
Copy link
Member

@tarekgh that makes good sense.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants