Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to Unicode 15.1, 16.0 #666

Merged
merged 3 commits into from
Nov 9, 2024

Conversation

bnoordhuis
Copy link
Contributor

Functional but imperfect. I'll review and point out issues.

test262/test/built-ins/RegExp/property-escapes/generated/XID_Start.js:16: Test262Error: `\P{XID_Start}` should match U+001C89 (`Ᲊ`)
test262/test/built-ins/RegExp/property-escapes/generated/XID_Start.js:16: strict mode: Test262Error: `\P{XID_Start}` should match U+001C89 (`Ᲊ`)
test262/test/built-ins/RegExp/unicode_full_case_folding.js:20: Test262Error: \u1fd3 does not match \u0390
test262/test/built-ins/RegExp/unicode_full_case_folding.js:20: strict mode: Test262Error: \u1fd3 does not match \u0390
Copy link
Contributor Author

@bnoordhuis bnoordhuis Nov 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The upgrade to 15.1 fixed more tests than it broke (16.0 is a different story) but special-casing codepoints U+FB05, U+1FD3 and U+1FE3 has the downside that regex properties like \p{Changes_When_Casefolded} don't work.

I'm not sure how to best fix that. I spent quite some time on it but... they're just different enough that they don't play nice with the existing algorithms and data structures.

edit: #536 doesn't have this particular issue but it breaks test262/test/language/literals/regexp/u-case-mapping.js ("Test262Error: Case mapping is applied in the presence of the i and u flags") and that's even more suspect, IMO. At least here it's clear why it fails.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I managed to fix this by retrofitting those special-cased code points into the existing RUN_TYPE encodings.

The thing that threw me off is that it works differently between 15.1 and 16.0, making me think it didn't work at all when in fact it only didn't work with 15.1 (and even then it still works most of the time, see test262_errors.txt for that commit.)

@@ -288,4 +299,7 @@ DEF(XID_Start, "XIDS")
/* internal tables with index */
DEF(Cased1, "")

/* unused by us */
DEF(InCB, "")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is Indic_Conjunct_Break and I believe it's not supposed to be JS-visible because it isn't in V8 and test262 doesn't test it but I did see a reference to it in Servo's code base so... dunno 🤷

It's not visible in qjs at any rate.

Copy link
Contributor

@saghul saghul left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than have them in the error file, perhaps we could skip them? I think there are already some Unicode tests skipped aren't they?

Seems I forgot to update the unicode_download.sh script when I upgraded
from Unicode 14.0 to 15.0 in November 2023.
@bnoordhuis
Copy link
Contributor Author

I think there are already some Unicode tests skipped aren't they?

Yes, for reasons of partial support and/or general slowness. I'll keep these in test262_errors.txt so they're not forgotten. E.g. the \p{Alphabetic} test failure is probably a relatively straightforward fix.

@bnoordhuis bnoordhuis merged commit 243b968 into quickjs-ng:master Nov 9, 2024
47 checks passed
@bnoordhuis bnoordhuis deleted the upgrade-unicode branch November 9, 2024 22:14
This was referenced Nov 9, 2024
@bnoordhuis
Copy link
Contributor Author

For posterity: the failing tests are in large part because we're currently still at a test262 commit that tests Unicode v15.

Upgrading to tc39/test262@77e98fb or newer gets us Unicode v16 coverage and that likely fixes a fair number of tests, including that \p{Alphabetic} test.

@saghul
Copy link
Contributor

saghul commented Nov 13, 2024

Great! Shall we update then?

@bnoordhuis
Copy link
Contributor Author

Yep, working on that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants