formatter should translate HTML entities to code points where possible #477

Closed
bakkot opened this issue Aug 17, 2022 · 2 comments · Fixed by #481

Comments

@bakkot
Contributor

bakkot commented Aug 17, 2022

cf tc39/ecma262#404 (comment)

See also #476.

Note to self: this should not include blank code points like &nbsp; (U+00A0) or &lrm; (U+200E). "Blank" probably means "Control, White_Space, or Default_Ignorable" in Unicode terms. (The linked page does not include "Control", but it seems an obvious thing to add: Control is not a subset of DI, per DI's definition, and indeed \u0000 is not DI.)

Also, there are 93 entities which expand to multiple code points, often including a variation selector or "U+0338 COMBINING LONG SOLIDUS OVERLAY". We should exclude any multi-code-point expansions where at least one code point is blank in the above sense, and probably any which include combining characters (i.e. gc=M).

I.e. exclude anything which expands to a sequence of code points which matches /\p{White_Space}|\p{DI}|\p{gc=M}|\p{gc=C}/u.
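
As a rough illustration of the rule above, here is a minimal sketch in JavaScript. The regular expression is the one from this comment. The helper name `shouldTranslate` and the small `SAMPLE` table are hypothetical and are not taken from the formatter's source. A real implementation would presumably consult the full HTML named character reference table.

```javascript
// Regex proposed in the issue: matches White_Space, Default_Ignorable_Code_Point (DI),
// any Mark (gc=M), or any code point in the "Other" general category (gc=C).
const EXCLUDED = /\p{White_Space}|\p{DI}|\p{gc=M}|\p{gc=C}/u;

// Hypothetical helper: true if an entity's expansion contains none of the
// excluded code points and so could be written as literal characters.
function shouldTranslate(expansion) {
  return !EXCLUDED.test(expansion);
}

// Small hand-picked sample (an assumption for illustration only).
const SAMPLE = new Map([
  ['&pi;', '\u03C0'],                  // GREEK SMALL LETTER PI
  ['&rarr;', '\u2192'],                // RIGHTWARDS ARROW
  ['&nbsp;', '\u00A0'],                // NO-BREAK SPACE (White_Space)
  ['&lrm;', '\u200E'],                 // LEFT-TO-RIGHT MARK (gc=Cf)
  ['&NotEqualTilde;', '\u2242\u0338'], // includes U+0338, a combining mark
]);

for (const [entity, expansion] of SAMPLE) {
  console.log(entity, shouldTranslate(expansion) ? 'translate' : 'keep as entity');
}
```

Under this sketch, `&pi;` and `&rarr;` would be translated, while `&nbsp;`, `&lrm;`, and `&NotEqualTilde;` would stay as entities. It covers only the exclusion rule described here, not any other considerations the formatter may apply.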

@ljharb
Member

ljharb commented Aug 17, 2022

Additionally, can the linter require the formatter's output in this case?

@bakkot
Contributor Author

bakkot commented Aug 17, 2022

The formatter has a --check flag, which asserts that the input is already in the form the formatter would output; it is intended to serve that purpose. (We use it in CI on 262.)
