Implement Greek uppercasing behavior of ignoring accents #3552
Currently ICU4C's TestGreekUpper, ported to ICU4X below, does not pass:
/// ICU4C's TestGreekUpper
#[test]
fn test_greek_upper() {
    let cm = CaseMapping::new_with_locale(&locale!("el"));
    // https://unicode-org.atlassian.net/browse/ICU-5456
    assert_eq!(cm.to_full_uppercase_string("άδικος, κείμενο, ίριδα"), "ΑΔΙΚΟΣ, ΚΕΙΜΕΝΟ, ΙΡΙΔΑ");
    // https://bugzilla.mozilla.org/show_bug.cgi?id=307039
    // https://bug307039.bmoattachments.org/attachment.cgi?id=194893
    assert_eq!(cm.to_full_uppercase_string("Πατάτα"), "ΠΑΤΑΤΑ");
    assert_eq!(cm.to_full_uppercase_string("Αέρας, Μυστήριο, Ωραίο"), "ΑΕΡΑΣ, ΜΥΣΤΗΡΙΟ, ΩΡΑΙΟ");
    assert_eq!(cm.to_full_uppercase_string("Μαΐου, Πόρος, Ρύθμιση"), "ΜΑΪΟΥ, ΠΟΡΟΣ, ΡΥΘΜΙΣΗ");
    assert_eq!(cm.to_full_uppercase_string("ΰ, Τηρώ, Μάιος"), "Ϋ, ΤΗΡΩ, ΜΑΪΟΣ");
    assert_eq!(cm.to_full_uppercase_string("άυλος"), "ΑΫΛΟΣ");
    assert_eq!(cm.to_full_uppercase_string("ΑΫΛΟΣ"), "ΑΫΛΟΣ");
    assert_eq!(cm.to_full_uppercase_string("Άκλιτα ρήματα ή άκλιτες μετοχές"), "ΑΚΛΙΤΑ ΡΗΜΑΤΑ Ή ΑΚΛΙΤΕΣ ΜΕΤΟΧΕΣ");
    // http://www.unicode.org/udhr/d/udhr_ell_monotonic.html
    assert_eq!(cm.to_full_uppercase_string("Επειδή η αναγνώριση της αξιοπρέπειας"), "ΕΠΕΙΔΗ Η ΑΝΑΓΝΩΡΙΣΗ ΤΗΣ ΑΞΙΟΠΡΕΠΕΙΑΣ");
    assert_eq!(cm.to_full_uppercase_string("νομικού ή διεθνούς"), "ΝΟΜΙΚΟΥ Ή ΔΙΕΘΝΟΥΣ");
    // http://unicode.org/udhr/d/udhr_ell_polytonic.html
    assert_eq!(cm.to_full_uppercase_string("Ἐπειδὴ ἡ ἀναγνώριση"), "ΕΠΕΙΔΗ Η ΑΝΑΓΝΩΡΙΣΗ");
    assert_eq!(cm.to_full_uppercase_string("νομικοῦ ἢ διεθνοῦς"), "ΝΟΜΙΚΟΥ Ή ΔΙΕΘΝΟΥΣ");
    // From Google bug report
    assert_eq!(cm.to_full_uppercase_string("Νέο, Δημιουργία"), "ΝΕΟ, ΔΗΜΙΟΥΡΓΙΑ");
    // http://crbug.com/234797
    assert_eq!(cm.to_full_uppercase_string("Ελάτε να φάτε τα καλύτερα παϊδάκια!"), "ΕΛΑΤΕ ΝΑ ΦΑΤΕ ΤΑ ΚΑΛΥΤΕΡΑ ΠΑΪΔΑΚΙΑ!");
    assert_eq!(cm.to_full_uppercase_string("Μαΐου, τρόλεϊ"), "ΜΑΪΟΥ, ΤΡΟΛΕΪ");
    assert_eq!(cm.to_full_uppercase_string("Το ένα ή το άλλο."), "ΤΟ ΕΝΑ Ή ΤΟ ΑΛΛΟ.");
    // http://multilingualtypesetting.co.uk/blog/greek-typesetting-tips/
    assert_eq!(cm.to_full_uppercase_string("ρωμέικα"), "ΡΩΜΕΪΚΑ");
    assert_eq!(cm.to_full_uppercase_string("ή."), "Ή.");
}
This seems like an i18n quality bug, and we don't want to advertise a component as stabilized with known i18n quality bugs. That's one of the checkboxes for stabilizing any component (along with FFI and docs). We can slip on feature coverage but not correctness.
Would someone be interested in trying to implement this? There's prototype code in https://icu.unicode.org/design/case/greek-upper, and ICU4C/ICU4J both have working implementations. I am unlikely to have time to get to it this week.
Part of #3234
See https://unicode-org.atlassian.net/browse/ICU-5456; ICU4X does not implement this behavior.
The special case is implemented in GreekUpper in ICU4C and ICU4J. It seems somewhat involved.
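To give a rough sense of what the special case involves, here is a minimal std-only sketch. It is not ICU4X's or ICU4C's implementation: `greek_upper_sketch` is a hypothetical helper, it only handles precomposed monotonic input (polytonic text and combining marks are ignored), and it approximates the "standalone ή" rule with `char::is_alphabetic` rather than real cased-letter data.

```rust
/// Hypothetical helper (not an ICU4X API): uppercases precomposed monotonic
/// Greek, dropping the tonos from vowels. A sketch only; polytonic letters
/// and combining accents are passed to the default uppercasing untouched.
fn greek_upper_sketch(input: &str) -> String {
    let mut out = String::with_capacity(input.len());
    let mut chars = input.chars().peekable();
    // True when the previous character was a vowel that just lost its tonos.
    let mut after_accented_vowel = false;
    // True when the previous character was a letter (approximation of "cased").
    let mut after_letter = false;

    while let Some(c) = chars.next() {
        let next_is_letter = chars.peek().map_or(false, |n| n.is_alphabetic());
        let mut accented = false;
        match c {
            // Standalone "ή" (the word "or") keeps its tonos.
            '\u{03AE}' | '\u{0389}' if !after_letter && !next_is_letter => out.push('\u{0389}'),
            // Vowels with tonos map to the bare capital vowel.
            '\u{03AC}' | '\u{0386}' => { out.push('\u{0391}'); accented = true; } // ά Ά -> Α
            '\u{03AD}' | '\u{0388}' => { out.push('\u{0395}'); accented = true; } // έ Έ -> Ε
            '\u{03AE}' | '\u{0389}' => { out.push('\u{0397}'); accented = true; } // ή Ή -> Η
            '\u{03AF}' | '\u{038A}' => { out.push('\u{0399}'); accented = true; } // ί Ί -> Ι
            '\u{03CC}' | '\u{038C}' => { out.push('\u{039F}'); accented = true; } // ό Ό -> Ο
            '\u{03CD}' | '\u{038E}' => { out.push('\u{03A5}'); accented = true; } // ύ Ύ -> Υ
            '\u{03CE}' | '\u{038F}' => { out.push('\u{03A9}'); accented = true; } // ώ Ώ -> Ω
            // Dialytika plus tonos: drop the tonos, keep the dialytika.
            '\u{0390}' => out.push('\u{03AA}'), // ΐ -> Ϊ
            '\u{03B0}' => out.push('\u{03AB}'), // ΰ -> Ϋ
            // A plain iota or upsilon right after a vowel that lost its tonos
            // gains a dialytika, so the hiatus stays visible (Μάιος -> ΜΑΪΟΣ).
            '\u{03B9}' | '\u{0399}' if after_accented_vowel => out.push('\u{03AA}'),
            '\u{03C5}' | '\u{03A5}' if after_accented_vowel => out.push('\u{03AB}'),
            // Everything else uses the default Unicode uppercasing.
            _ => out.extend(c.to_uppercase()),
        }
        after_accented_vowel = accented;
        after_letter = c.is_alphabetic();
    }
    out
}
```

Under these assumptions, calling `greek_upper_sketch` on the monotonic inputs from the test above (for example "Μάιος" or "Το ένα ή το άλλο.") should drop the tonos while keeping or adding the dialytika where needed. The polytonic cases would still need the decomposition-aware handling that GreekUpper in ICU4C and ICU4J provides.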