-
-
Notifications
You must be signed in to change notification settings - Fork 759
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ICU-20575 fix broken default locale mapping for C.UTF-8 #634
Conversation
Test case: <param name="locale.default">en_US_POSIX</param>
<param name="locale.default.bcp47">en-US-u-va-posix</param> Before this bug, they return |
I really don't like these mappings at this level. There has to be a way to handle "C.UTF-8" in the getDefaultLocale() code. |
FWIW, I believe this exacerbated the issue with the DateTimePatternGenerator (ICU-20558), as "C.UTF-8" would be treated as a default locale of root, rather than "en_US_POSIX". |
Markus, makes sense. I'll see what i can do. |
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
@markusicu PTAL at the updated approach. There was another bug, where |
Thanks for fixing it all via putil.cpp! Is the PR description still accurate? |
@markusicu welcome. the PR is accurate now (and the commit message already was) |
icu4c/source/common/putil.cpp
Outdated
// Over-allocate in case we replace "@" with "__". | ||
char *correctedPOSIXLocale = static_cast<char *>(uprv_malloc(uprv_strlen(posixID) + 1 + 1)); | ||
// Over-allocate in case we replace "@" with "__" or "C" with "en_US_POSIX" | ||
char *correctedPOSIXLocale = static_cast<char *>(uprv_malloc(uprv_strlen(posixID) + 11 + 1)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI +10 would suffice because en_US_POSIX has length 11 which is 10 longer than C.
Unless I am mis-counting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you are right.
Regression was in 1afef30 PR unicode-org#418 [ICU-20187] - We dropped the mapping from "C" in uloc_canonicalize, but then putil did not handle cases where a codepage was set (such as C.UTF-8). - Add an additional check in uprv_getDefaultLocaleID() for locales that end up as "C" or "POSIX" after removing codepage suffix. - Also fix regression where aa@bb would become aa__BB__BB (incorrectly doubled __BB)
Notice: the branch changed across the force-push!
~ Your Friendly Jira-GitHub PR Checker Bot |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
- Floating patch for ICU 64.x - includes test case ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575 Backport of: unicode-org/icu#634 Fixes: nodejs#27418
- Floating patch for ICU 64.x - includes test case ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575 Backport of: unicode-org/icu#634 Fixes: #27418 PR-URL: #27435 Reviewed-By: Refael Ackermann <refack@gmail.com> Reviewed-By: Richard Lau <riclau@uk.ibm.com> Reviewed-By: Michaël Zasso <targos@protonmail.com>
- Floating patch for ICU 64.x - includes test case ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575 Backport of: unicode-org/icu#634 Fixes: #27418 PR-URL: #27435 Reviewed-By: Refael Ackermann <refack@gmail.com> Reviewed-By: Richard Lau <riclau@uk.ibm.com> Reviewed-By: Michaël Zasso <targos@protonmail.com>
- Floating patch for ICU 64.x - includes test case ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575 Backport of: unicode-org/icu#634 Fixes: #27418 PR-URL: #27435 Reviewed-By: Refael Ackermann <refack@gmail.com> Reviewed-By: Richard Lau <riclau@uk.ibm.com> Reviewed-By: Michaël Zasso <targos@protonmail.com>
- Floating patch for ICU 64.x - includes test case ICU Bug: https://unicode-org.atlassian.net/browse/ICU-20575 Backport of: unicode-org/icu#634 Fixes: #27418 PR-URL: #27435 Reviewed-By: Refael Ackermann <refack@gmail.com> Reviewed-By: Richard Lau <riclau@uk.ibm.com> Reviewed-By: Michaël Zasso <targos@protonmail.com>
LANG=C.UTF-8
would not map toen_US_POSIX
but to invalid languagec
LANG=aa@BB
(and similar) would map toaa__BB__BB
Checklist