Attribute escaping difference between Gutenberg and WP #9915
Comments
Adding to this, if you use the classic editor to insert an image with an
|
So reading through the spec, it seems to me like https://www.w3.org/TR/html5/syntax.html#unquoted :
Then we have "Single-quoted attribute value syntax"
Because those rules are in addition to the above rules, we can't have a literal |
Confusing spec! The base attribute value is:
The spec for double-quoted attribute values is:
The |
This explains IE6. |
to confirm what @johngodley wrote, the restriction on |
Just to note this was reported again on Trac, see https://core.trac.wordpress.org/ticket/46114. See also the previous Trac ticket https://core.trac.wordpress.org/ticket/45387 |
I mentioned this at #9963 (comment), but I want to clarify it here as well: the spec allows an unescaped < in a double-quoted attribute value.
<span data-foo="1 < 2">OK</span>
<span data-foo="1 &lt; 2">OK</span>
<script>
console.log(
    Array.from( document.querySelectorAll( 'span' ) )
        .map( span => span.getAttribute( 'data-foo' ) )
); // ["1 < 2", "1 < 2"]
</script>
Both are valid HTML. Both have exactly the same meaning. |
Describe the bug
There appears to be a difference between how Gutenberg (specifically React) encodes HTML attributes and how WordPress does, and this can lead to some mangling.
React encodes attributes according to the HTML spec, where > is allowed. This means test > thing is stored as-is in an attribute.
WordPress has other ideas, though. Here's an image created by Gutenberg. Note the alt attribute has a value of test > thing (I've added returns in examples to make them look better):
WordPress renders this as:
Note how the terminating quote in the alt attribute has been smart-quoted, and then everything else falls to pieces.
The culprit is wptexturize, and specifically the _get_wptexturize_split_regex regex, which 'parses' the HTML using regex. It sees the > in the attribute, thinks it's an HTML tag, and the fun begins.
To test:
Going further you can see the > causes the regex to split (array position 2 and 3):
Changing wptexturize seems a huge headache. Maybe the best action is for Gutenberg to encode all attributes in a way compatible with _get_wptexturize_split_regex, which probably means converting the > to &gt; (or just disabling wptexturize on every blog...):
Note the above problem affects custom class names, and probably other blocks where data is directly output to an attribute.
This matches the problem described in #8779
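The failure and the proposed workaround can be sketched in a few lines of JavaScript. This is only an illustration under stated assumptions: NAIVE_TAG_SPLIT is a deliberately simplified stand-in, not the pattern that _get_wptexturize_split_regex builds, and naiveSplit, escapeAttributeValue and the a.png source are hypothetical names, not Gutenberg or WordPress APIs.

```javascript
// Simplified stand-in for a regex-based tag splitter. This is NOT the
// WordPress pattern; it only shares the assumption that a tag ends at
// the first ">" it meets.
const NAIVE_TAG_SPLIT = /(<[^>]*>)/;

// Hypothetical helper: split markup into "tags" and text.
// split() with a capture group keeps the matched tags in the result.
function naiveSplit(html) {
  return html.split(NAIVE_TAG_SPLIT).filter((part) => part !== '');
}

// Hypothetical helper: escape "<" and ">" as well as the characters
// that are strictly required inside a double-quoted attribute value.
function escapeAttributeValue(value) {
  return value
    .replace(/&/g, '&amp;')
    .replace(/"/g, '&quot;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;');
}

const alt = 'test > thing';
const raw = `<img src="a.png" alt="${alt}" />`;
const escaped = `<img src="a.png" alt="${escapeAttributeValue(alt)}" />`;

// The literal ">" ends the "tag" early, so the markup is cut in two.
const rawParts = naiveSplit(raw);
// With "&gt;" the whole tag is matched as a single piece.
const escapedParts = naiveSplit(escaped);

console.log(rawParts);
console.log(escapedParts);
```

With the raw value, the splitter cuts the tag at the > inside alt, which is the same kind of split described above. With the escaped value the tag stays in one piece, and a browser decodes &gt; back to >, so the attribute value the reader sees is unchanged.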