do we need CollatorType.cldrWithoutFFFx? #794
Comments
@macchiati do we need the cldrWithoutFFFx option? |
Hmmm. As I recall, the FFFE and FFFF are to allow users to have minimum and maximum collation elements. As long as we continue to keep those in the CLDR data, I think we are ok. |
Of course we are going to keep them in CLDR. --> https://www.unicode.org/reports/tr35/tr35-collation.html#tailored_noncharacter_weights
It (still) makes sense that we have two choices for collators, but why three? class UCA --> public enum CollatorType { ducet, cldr, cldrWithoutFFFx }
|
I don't recall any reason to keep the without ..
…On Wed, Aug 21, 2024, 08:51 Markus Scherer ***@***.***> wrote:
Hmmm. As I recall, the FFFE and FFFF are to allow users to have minimum
and maximum collation elements. As long as we continue to keep those in the
CLDR data, I think we are ok.
Of course we are going to keep them in CLDR. -->
https://www.unicode.org/reports/tr35/tr35-collation.html#tailored_noncharacter_weights
It (still) makes sense that we have *two* choices for collators, but why
*three*? class UCA -->
public enum CollatorType {
    ducet,
    cldr,
    cldrWithoutFFFx
}
|
Thanks. Setting priority=high because the question is resolved, and it looks like the code change will be easy. |
WriteCollationData.getCollator(type) (issue #793 would move this function to class UCA) works with three types. One of them, cldrWithoutFFFx, builds a CLDR collator except that it leaves U+FFFE and U+FFFF with their DUCET mappings rather than their CLDR tailorings.
Strangely, FractionalUCA.java works with such a collator, even though it writes "SPECIAL MAX/MIN COLLATION ELEMENTS" for these noncharacters, corresponding to the CLDR tailorings.
This type is also used for the UCA.Main option testCompatibilityCharacters.
Why? It seems confusing to have this third type, especially to get something different from what we actually output.
Try to remove it and only use either a DUCET collator or a CLDR collator.
If we need and keep this option, then at least consider changing buildCldrCollator(boolean) to buildCldrCollator(enum type) for readability.
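To illustrate the readability point, here is a minimal sketch of the two signatures side by side. Only the CollatorType values and the name buildCldrCollator come from this issue. The class CollatorTypeSketch, the BuiltCollator stand-in, the tailorFFFx parameter name, and the assumed meaning of the boolean are all hypothetical and are not taken from the actual WriteCollationData code.

```java
public final class CollatorTypeSketch {
    /** The three values quoted in this thread. */
    public enum CollatorType { ducet, cldr, cldrWithoutFFFx }

    /** Hypothetical stand-in for a built collator; it only records one choice. */
    public static final class BuiltCollator {
        /** True if U+FFFE and U+FFFF get CLDR tailorings, false if they keep DUCET mappings. */
        public final boolean tailoredFFFx;

        BuiltCollator(boolean tailoredFFFx) {
            this.tailoredFFFx = tailoredFFFx;
        }
    }

    /** Boolean flavor (assumed meaning): a call like buildCldrCollator(false) does not explain itself. */
    public static BuiltCollator buildCldrCollator(boolean tailorFFFx) {
        return new BuiltCollator(tailorFFFx);
    }

    /** Enum flavor: the call site names the variant it asks for. */
    public static BuiltCollator buildCldrCollator(CollatorType type) {
        switch (type) {
            case cldr:
                return new BuiltCollator(true);
            case cldrWithoutFFFx:
                return new BuiltCollator(false);
            default:
                // ducet is not a CLDR collator, so this builder rejects it.
                throw new IllegalArgumentException("not a CLDR collator type: " + type);
        }
    }

    public static void main(String[] args) {
        // Both calls request the same variant; only the second says so at the call site.
        System.out.println(buildCldrCollator(false).tailoredFFFx);
        System.out.println(buildCldrCollator(CollatorType.cldrWithoutFFFx).tailoredFFFx);
    }
}
```

If the third type is removed, as proposed above, the enum parameter would have only one valid CLDR value and the parameter could probably be dropped altogether.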
@macchiati FYI