Use `v` flag instead of `u` for `pattern` RegExps #7908
Chrome 112 is set to include support for the RegExp Given that, I’d like to ask for initial feedback on this proposal. I’d be happy to write WPT tests + file implementation bugs, but wanted to double-check first if this is something worth pursuing. The functionality seems useful but perhaps the potential breakage is not worth it or warrants a different approach (like a separate attribute). Please let me know your thoughts! WDYT @domenic @annevk @zcorpan @domfarolino?
Have you looked at httparchive for the breaking change cases, or implemented a use counter?
This generally seems like a good idea, provided it's web compatible.
I’ve fixed some of the existing
@pthier is implementing a V8-level use counter for the cases listed above occurring in
Proposed WPT tests: web-platform-tests/wpt#38325 The compatibility risk is smaller than I previously thought, since throwing patterns result in
Some regular expression patterns that were valid with /u are invalid with /v flag. This CL adds a UseCounter for such usages in /u to get an idea how often they are used in the wild. This is important information w.r.t the proposal to use /v instead of /u for the pattern attribute (http://go/gh/whatwg/html/pull/7908).
V8 CL: http://go/crrv/c/4212393
Bug: v8:11935
Change-Id: I3f049a2196c2f360eeeb8eb54438d3ea0d534345
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4221395
Commit-Queue: Patrick Thier <pthier@chromium.org>
Reviewed-by: Michael Lippautz <mlippautz@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1101528}
I'm also supportive of this proposal, provided it proves to be web compatible.
This patch removes the ScriptRegexp CharacterMode enum ({BMP, UTF16}) and instead introduces a UnicodeMode enum ({kBmpOnly, kUnicode, kUnicodeSets}) distinguishing non-u, u, and v RegExps respectively. The new kUnicodeSets value is required to implement a proposed change to the HTML pattern attribute [1], to be enabled in a separate CL.
[1]: whatwg/html#7908
Bug: chromium:1412729
Change-Id: I40d3982476b62e517b85b799238f9c093f74e518
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4239709
Reviewed-by: Mason Freed <masonf@chromium.org>
Commit-Queue: Mathias Bynens <mathias@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1104723}
A proposed change to the HTML pattern attribute [1] swaps the `u` RegExp flag for the new `v` flag [2], resulting in some potential incompatibility. Some previously valid patterns are now errors, specifically those with a character class including either an unescaped special character or a double punctuator:
pattern="[(]"
pattern="[)]"
pattern="[[]"
pattern="[{]"
pattern="[}]"
pattern="[/]"
pattern="[-]"
pattern="[|]"
pattern="[&&]"
pattern="[!!]"
pattern="[##]"
pattern="[$$]"
pattern="[%%]"
pattern="[**]"
pattern="[++]"
pattern="[,,]"
pattern="[..]"
pattern="[::]"
pattern="[;;]"
pattern="[<<]"
pattern="[==]"
pattern="[>>]"
pattern="[??]"
pattern="[@@]"
pattern="[``]"
pattern="[~~]"
pattern="[_^^]"
We don’t expect such patterns to be very common. This UseCounter aims to validate that assumption.
Note that throwing patterns result in `inputElement.validity.valid === true` for any input value, so the only compatibility risk is that some value/pattern combinations that would previously result in `inputElement.validity.valid === false` now result in `inputElement.validity.valid === true`.
[1]: whatwg/html#7908
[2]: https://v8.dev/features/regexp-v-flag
Bug: chromium:1412729
Change-Id: Ifa8bcc27dbf6e8a2a7098643dbb27a7633bb97de
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4249120
Reviewed-by: Mason Freed <masonf@chromium.org>
Commit-Queue: Mathias Bynens <mathias@chromium.org>
Reviewed-by: Alexei Svitkine <asvitkine@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1105129}
If specified, the attribute's value must match the JavaScript Pattern[+UnicodeMode, +N] production.
It seems this is being modified by the "v" proposal so this probably needs updating as well?
Also, if we want to land this before it's integrated into JavaScript proper we'll need to add references to the proposal as we have done for other JavaScript proposals.
A proposed change to the HTML pattern attribute [1] swaps the `u` RegExp flag for the new `v` flag [2], enabling the use of three new features:
- set notation
- string literal syntax
- Unicode properties of strings
This CL is an alternative to [3], which simply enables the feature by default.
Intent to Ship: https://groups.google.com/a/chromium.org/g/blink-dev/c/gIyvMw0n2qw
[1]: whatwg/html#7908
[2]: https://v8.dev/features/regexp-v-flag
[3]: https://chromium-review.googlesource.com/c/chromium/src/+/4253834
Bug: chromium:1412729
Change-Id: Ifa825789ad7ad8bb9347f8e652483d668f69b116
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4414859
Reviewed-by: Mason Freed <masonf@chromium.org>
Code-Coverage: Findit <findit-for-me@appspot.gserviceaccount.com>
Commit-Queue: Mathias Bynens <mathias@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1131075}
Update: The Intent just got approved. How do we feel about landing the WPT tests + this spec patch now? W.r.t. the
More details are linked for each entry. The TL;DR is that although there are many distinct sources of UseCounter hits, I haven’t found a single case that actually constituted a compat problem.
A proposed change to the HTML pattern attribute [1] swaps the `u` RegExp flag for the new `v` flag [2], enabling the use of three new features:
- set notation
- string literal syntax
- Unicode properties of strings
[1]: whatwg/html#7908
[2]: https://v8.dev/features/regexp-v-flag
Intent to Ship with LGTMs: https://groups.google.com/a/chromium.org/g/blink-dev/c/gIyvMw0n2qw/m/S8ZVYl89CgAJ
Bug: chromium:1412729
Change-Id: I7165cfc3f862c1feb5417723681f82459d1be6d5
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4455084
Reviewed-by: Mason Freed <masonf@chromium.org>
Commit-Queue: Mason Freed <masonf@chromium.org>
Auto-Submit: Mathias Bynens <mathias@chromium.org>
Reviewed-by: Yoav Weiss <yoavweiss@chromium.org>
Cr-Commit-Position: refs/heads/main@{#1133838}
SGTM. I'll let @annevk do the honors.
Thanks everyone! I’ve prepared the follow-up PR for if/when the TC39 proposal (in particular this PR) gets merged into the ECMAScript spec: #9213
…he RegExp `v` flag, a=testonly Automatic update from web-platform-tests HTML: tests for <input pattern> using the RegExp v flag HTML PR: whatwg/html#7908. -- wpt-commits: 000b1c60057e614908f7450353d72f9dda117d33 wpt-pr: 38547
Hi @mathiasbynens, As you previously noted tc39/ecma262#2418 has not been merged yet and https://github.com/tc39/proposal-regexp-v-flag is not part of the official ECMAScript spec yet, if I'm following the conversation correctly. In which case, why are whatwg members like Chrome enabling the Per https://chromestatus.com/feature/5149507107422208, Chrome 114 has Also, in cases where whatwg is moving forward with potentially breaking changes... shouldn't more effort be done to announce/communicate that to the wider developer community? Most existing documentation on the web includes no mention of the
@poebrand that is unfortunate. We did follow our own process here:
Unfortunately we only added notifying MDN of changes eight months ago and PRs older than that haven't been updated to incorporate it. I'll try to pay better attention to that going forward, although for potentially breaking changes that might still not be sufficient I suppose. Not entirely sure what would be though.
Filed:
As far as breakage, I believe the reasoning here was that websites should also do server-side validation (since users can remove the
This makes the `pattern` attribute more powerful, enabling the use of RegExp set notation syntax and properties of strings in its values.
Differences with the previous `u` flag-based behavior:
- [FEATURE] Previously invalid patterns now become valid, e.g.
- [BREAKING CHANGE] Some previously valid patterns are now errors, specifically those with a character class including either an unescaped special character `(` `)` `[` `]` `{` `}` `/` `-` `\` `|` or a double punctuator:
Throwing patterns result in `inputElement.validity.valid === true` for any input value, so the only compatibility risk is that some value/pattern combinations that would previously result in `inputElement.validity.valid === false` now result in `inputElement.validity.valid === true`.
Other previously valid patterns still behave the same. (Other than the abovementioned features, the `v` flag only differs in behavior from the `u` flag w.r.t. case-insensitive matching, but the `pattern` attribute uses case-sensitive matching.)
Note that the breaking changes apply to somewhat esoteric edge cases that can easily be avoided. In the worst case, this could cause previously invalid input to now be considered valid (since throwing patterns result in `inputElement.validity.valid === true` for any input value). IMHO making the change is worth it given the powerful new functionality it brings, and the relatively small compatibility risk. This is reminiscent of the discussion in #439 (but in a different direction).
For context, here’s a few pointers w.r.t. when we decided to implicitly enable the `u` flag for the `pattern` attribute in the first place:
`v` flag web-platform-tests/wpt#38547
(See WHATWG Working Mode: Changes for more details.)
/infrastructure.html ( diff )
/input.html ( diff )
/references.html ( diff )