-
Notifications
You must be signed in to change notification settings - Fork 136
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update to Unicode 15.1, 16.0 #666
Conversation
test262_errors.txt
Outdated
test262/test/built-ins/RegExp/property-escapes/generated/XID_Start.js:16: Test262Error: `\P{XID_Start}` should match U+001C89 (``) | ||
test262/test/built-ins/RegExp/property-escapes/generated/XID_Start.js:16: strict mode: Test262Error: `\P{XID_Start}` should match U+001C89 (``) | ||
test262/test/built-ins/RegExp/unicode_full_case_folding.js:20: Test262Error: \u1fd3 does not match \u0390 | ||
test262/test/built-ins/RegExp/unicode_full_case_folding.js:20: strict mode: Test262Error: \u1fd3 does not match \u0390 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The upgrade to 15.1 fixed more tests than it broke (16.0 is a different story) but special-casing codepoints U+FB05, U+1FD3 and U+1FE3 has the downside that regex properties like \p{Changes_When_Casefolded}
don't work.
I'm not sure how to best fix that. I spent quite some time on it but... they're just different enough that they don't play nice with the existing algorithms and data structures.
edit: #536 doesn't have this particular issue but it breaks test262/test/language/literals/regexp/u-case-mapping.js ("Test262Error: Case mapping is applied in the presence of the i
and u
flags") and that's even more suspect, IMO. At least here it's clear why it fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I managed to fix this by retrofitting those special-cased code points into the existing RUN_TYPE encodings.
The thing that threw me off is that it works differently between 15.1 and 16.0, making me think it didn't work at all when in fact it only didn't work with 15.1 (and even then it still works most of the time, see test262_errors.txt for that commit.)
@@ -288,4 +299,7 @@ DEF(XID_Start, "XIDS") | |||
/* internal tables with index */ | |||
DEF(Cased1, "") | |||
|
|||
/* unused by us */ | |||
DEF(InCB, "") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is Indic_Conjunct_Break and I believe it's not supposed to be JS-visible because it isn't in V8 and test262 doesn't test it but I did see a reference to it in Servo's code base so... dunno 🤷
It's not visible in qjs at any rate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than have them in the error file, perhaps we could skip them? I think there are already some Unicode tests skipped aren't they?
Seems I forgot to update the unicode_download.sh script when I upgraded from Unicode 14.0 to 15.0 in November 2023.
7f7e9ee
to
0eab017
Compare
Yes, for reasons of partial support and/or general slowness. I'll keep these in test262_errors.txt so they're not forgotten. E.g. the \p{Alphabetic} test failure is probably a relatively straightforward fix. |
For posterity: the failing tests are in large part because we're currently still at a test262 commit that tests Unicode v15. Upgrading to tc39/test262@77e98fb or newer gets us Unicode v16 coverage and that likely fixes a fair number of tests, including that |
Great! Shall we update then? |
Yep, working on that! |
Functional but imperfect. I'll review and point out issues.