Case-sensitive as="" for <link> is weird #1665
Comments
It was also a mistake, e.g., for HTTP methods. Everyone still thinks those are case-insensitive. I'd prefer we just did case-sensitive from now on so there will eventually be less confusion between markup and APIs that use enums, unless you want to make those case-insensitive too? |
A functional requirement for case sensitivity is likely to cause confusion. I'd be fine with such a conformance requirement, but ignoring matching values because of case seems overly restrictive to me. Are there any precedents for that in HTML? |
I'd prefer we didn't fork the largely consistent behavior across attribute values here. Having rel@ be case-insensitive and as@ be case-sensitive is very strange and will lead to author confusion. As an implementor this is also unfortunate since we'd need a whitelist for the case-sensitivity of this one attribute in our generated code. |
The problem is that by making it case-insensitive here you would subset Fetch. We cannot keep doing that to standards HTML integrates with. |
I'm not sure what you mean by "subset fetch". If I was an author making a custom element I would just call toLowerCase() before passing the value to fetch, the platform can do the same. |
Because HTML would constrain the value space. E.g., as I said before, HTTP |
At the spec level I would object to any spec that wants to have an enumerated value that varies only by case. Fetch is not going to have values for as@ that are not lowercase. I would formally object to any spec being implemented in Blink that tried to do that. |
Also, with custom elements you run into the problem that the most convenient lowercasing operation in JavaScript is not compatible with what HTML uses. |
The assertion that no other attribute value does case-sensitive comparison is false.
A form-associated element's
There are a lot more examples of content attributes and features that use case-sensitive comparisons in HTML. Consistency is good but there is hardly any consistency about case-sensitivity in HTML/Web. There are even multiple definitions of case-insensitiveness, and some APIs are only case-insensitive inside an HTML document or with regards to HTML elements. Given the increased use of inline SVG elements and other XML features being incorporated into modern Web sites and web apps, the simplest thing we can do today is to make new APIs case-sensitive and treat case-insensitivity as a compatibility feature for legacy APIs. |
Wow, thanks for that search @rniwa! I don't think the case-sensitive ID/name comparison is the same type of thing; mostly we're talking about cases where there's a finite set of enumerated values. And your selectionDirection example is an IDL attribute, not a content attribute; IDL attributes are indeed always case-sensitive. But the I still think it would be better to go with the vast majority and always be case-insensitive, but I no longer feel as strongly. I do think @esprehn is right that we shouldn't be concerned about value constraints here; Fetch should never add a destination that differs from other destinations only in case. We shouldn't let one bad case (HTTP methods) guide us toward a bad precedent. However, I agree with @annevk that using JS lowercasing in e.g. a custom element is very bad; you really should be using some kind of case-folding comparison, which isn't even something exposed to JavaScript. (At least, not in any way I know about; I'm not aware of all the Intl library stuff, however.) We can get away with toLowerCase() as long as nobody uses Unicode, but that's an unfortunate assumption to make. |
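The gap between JavaScript's `toLowerCase()` and the ASCII-only lowercasing that HTML uses for enumerated attributes can be sketched as follows. This is a minimal illustration, not text from any spec: `asciiLowercase` is a hypothetical helper name, and `"track"` is used only as an example keyword. The sketch relies on the fact that U+212A KELVIN SIGN lowercases to an ordinary `k` under Unicode rules.

```typescript
// Hypothetical helper: lowercases only A-Z (U+0041..U+005A), leaving every
// other code unit untouched. This mirrors ASCII-only lowercasing.
function asciiLowercase(input: string): string {
  return input.replace(/[A-Z]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) + 0x20)
  );
}

// U+212A KELVIN SIGN is not an ASCII character, yet Unicode lowercasing
// maps it to "k" (U+006B).
const lookalike = "trac\u212A";

const jsLowered = lookalike.toLowerCase();      // Unicode-aware lowercasing
const asciiLowered = asciiLowercase(lookalike); // U+212A stays as it is

console.log(jsLowered === "track");    // true: the non-ASCII input now matches
console.log(asciiLowered === "track"); // false: no match under ASCII rules
```

For input that is already pure ASCII, the two functions give the same result, which is the situation the "works as long as nobody uses Unicode" remark describes.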
I also think that everywhere where we use case-insensitive comparison it is now generally seen as a mistake. Avoiding it going forward should make things more predictable as everyone will converge on canonical casing. |
I generally disagree with that. I think enumerated values being case-insensitive is more author-friendly and consistent with HTML in general (e.g. |
I'm not sure if "we rejected" is a good characterization of what has happened. It had more to do with backwards compatibility requiring case-insensitivity. And the whole situation is a bit of a mess due to case-insensitivity behavior being inconsistent across the platform. |
We rejected not having error handling and forward compat for syntax. It's much more subtle than "embracing tagsoup" as some like to portray it. |
And indeed, rejected not being backwards compatible. None of those are at issue here. |
Yes, I also disagree. Case-insensitivity for enumerated values constrained to be within the ASCII range is simple and easy for authors, and common throughout the web platform - CSS, in particular, uses it everywhere. If you're constrained to the ASCII range, the obvious JS method for lowercasing works 100% correctly. As @esprehn says, any spec defining enumerated values that has two distinct values that differ only by case is doing something terribly wrong; such a value would draw strong objections from implementors, for good reason. So, there's no practical concern about value clashing here, just aesthetic/design concerns. This is very distinct from arbitrary / open-ended names, particularly if you're allowing full Unicode (which is generally good practice). At that point there's no single "correct" way to lowercase, and so we should just be doing codepoint comparison. Again, CSS does precisely this.
As well, input names are "open-ended", so it's more natural that they're compared codepoint-wise. Thus the charset thing is ok. |
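The two comparison strategies described in this comment could look roughly like the sketch below. Everything here is an assumption for illustration: the keyword list is a made-up subset, and `matchKeyword`, `sameName`, and `asciiLowercase` are hypothetical names rather than spec or browser APIs.

```typescript
// Hypothetical, ASCII-only keyword set used purely for illustration.
const KEYWORDS: readonly string[] = ["script", "style", "image"];

// Lowercases A-Z only; all other code units pass through unchanged.
function asciiLowercase(input: string): string {
  return input.replace(/[A-Z]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) + 0x20)
  );
}

// Enumerated values: match ASCII case-insensitively and return the
// canonical (lowercase) keyword, or null when nothing matches.
function matchKeyword(value: string): string | null {
  const lowered = asciiLowercase(value);
  for (let i = 0; i < KEYWORDS.length; i++) {
    if (KEYWORDS[i] === lowered) {
      return KEYWORDS[i];
    }
  }
  return null;
}

// Open-ended names: plain string equality with no case mapping at all,
// so full Unicode names never need a "correct" lowercasing.
function sameName(a: string, b: string): boolean {
  return a === b;
}

console.log(matchKeyword("SCRIPT"));       // "script"
console.log(matchKeyword("font"));         // null (not in the sample set)
console.log(sameName("Straße", "STRASSE")); // false
```

Because the keyword set is ASCII-only and never contains two entries that differ only by case, the case-insensitive match cannot be ambiguous; the open-ended comparison sidesteps case folding entirely.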
|
@annevk, please read the first part of @tabatkins's sentence that you quoted. |
@domenic how are attribute values constrained in that way? They're not. Or do you mean you first have to check that all the code points are in the ASCII range? That's a lot more complicated than writing |
They are in all the cases so far. We don't know yet whether authors plan to create enumerated attributes containing Turkish Is in their custom elements, but I think @tabatkins was assuming they would not, in his sentence which you misquoted. |
Again, if someone writes |
I'm saying that all existing enumerated attributes in HTML (and all we plan to add) are ascii-only. If you write an attribute value also in ASCII, JS's lowercasing works great. "Someone might use an API and input an attribute value that includes some random unicode characters that happen to JS-lowercase into ASCII values" is a super-bizarre case for us to care about. Why is this something we need to worry about and optimize our API design for? And if people do end up, in their custom elements, including non-ASCII values, they should follow the web's common practices and match those codepoint-wise. No lowercasing involved there at all. |
You said it handles it 100% correctly, but that is simply not true. And we should care about error handling since the web tends to start depending on the errors. Furthermore, it would be much easier if we required canonical case as then you can just copy it around without hassle. |
No, it's true in the context I gave. If the value-space is ASCII and people are working in ASCII, then it's fine. Your concern is just that polyfills will accidentally support people typing non-ASCII. I'm confused how that has anything to do with HTML itself; how would that freeze us into accepting non-ASCII? It's also just an incredibly weird thing to worry about imo.
What hassle is caused by ASCII-CI values? I don't see how copy-pasting is affected. |
They do not work in APIs that take enums. We are going in circles now. |
I think that's a problem that can trip up beginners but in practice is not much of a problem at all. We have the same situation with attribute names and reflecting IDL attributes (e.g. Maybe the Web platform should have convenience functions for ASCII-lowercasing etc. |
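The friction between case-insensitive markup values and enum-taking APIs can be sketched like this. The function `requestWithDestination` is a hypothetical stand-in for any API whose argument is an IDL enum (which rejects non-exact values with a `TypeError`), the destination list is a made-up subset, and `asciiLowercase` stands in for the kind of convenience function suggested above.

```typescript
// Hypothetical subset of enum values, for illustration only.
const DESTINATIONS: readonly string[] = ["script", "style", "image"];

// Stand-in for an API taking an IDL enum: values must match exactly,
// including case, otherwise a TypeError is thrown.
function requestWithDestination(destination: string): string {
  if (DESTINATIONS.indexOf(destination) === -1) {
    throw new TypeError("'" + destination + "' is not a valid enum value");
  }
  return destination;
}

// Hypothetical convenience function: lowercases A-Z only.
function asciiLowercase(input: string): string {
  return input.replace(/[A-Z]/g, (ch) =>
    String.fromCharCode(ch.charCodeAt(0) + 0x20)
  );
}

// A value as an author might write it in case-insensitive markup.
const fromMarkup = "Script";

let rejected = false;
try {
  requestWithDestination(fromMarkup); // copied over verbatim
} catch (e) {
  rejected = e instanceof TypeError;
}
console.log(rejected); // true: the enum-style API refuses "Script"

// Canonicalizing first makes the same value acceptable.
console.log(requestWithDestination(asciiLowercase(fromMarkup))); // "script"
```

Under a canonical-case requirement the first call would never see `"Script"` in the first place; under case-insensitive matching, every hand-off from markup to such an API needs the extra canonicalization step.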
It seems like there is not agreement on changing this to be case-insensitive. Someone should add web platform tests for the reflection, though, since this is so unusual compared to existing reflections. My understanding is that Chrome would fail such tests. I will open an issue on web platform tests and on Chrome. |
Filed https://bugs.chromium.org/p/chromium/issues/detail?id=646387 and web-platform-tests/wpt#3703. Closing this issue. |
Case sensitive or not now came up for Feature Policy: w3c/webappsec-permissions-policy#54 (comment) |
Thanks @zcorpan, that would suggest that unless someone polices new specs very well, we'd keep getting some case-insensitive things even if we "decide" that we don't like it. Out of curiosity, for content attribute values, what are the places where it's definitely case-sensitive if we're adding new APIs? Keywords and tokens are all that I can think of that's actually compared with other strings and not just passed along or parsed somehow. |
What do you mean by places? IDs, classes, URLs, all have various degrees of being case-sensitive (and sometimes not (and where not it's been a source of problems historically)). |
You got my meaning, and keywords and tokens are the "places" where I know it's usually case-insensitive. IDs and classes I suppose are a mess due to history, if doing them today I think we'd want them to be case-sensitive since the value space is open, to avoid having to pick a certain kind of case-insensitivity. URLs aren't compared, but we clearly shouldn't have any case folding in any case. Namespaces are an odd case where we do compare against a set of known values, but do so case-sensitively, presumably because they look like URLs even though they're never resolved or otherwise used as URLs. I don't think we could formulate a principle that explains the current mess, but my preference going forward would be case-insensitive for tokens and keywords like sandbox flags or track kinds, and case-sensitive for essentially everything else that I can think of. That is, if we're comparing attribute values against some (non-namespace) internal string, do so case-insensitively. |
So your preference would be to do it differently from how the surrounding programming environment and API would handle it? Anyway, if everyone wants that inconsistency, fine, but don't count on me cleaning up the resulting mess. |
Consistency isn't on the menu here, but yes, that is the particular kind of inconsistency that I'd prefer, perhaps because the taste is already familiar. FWIW, if I felt strongly the other way, I'd probably try to add use counters in the parsing of all existing attributes that could be turned into enums in the IDL if made case-sensitive. Then, if the usage was super low I'd argue that we should change them for better ergonomics even if it's already interoperable. Short of that, only inconsistency is on offer. |
Changing features that are already fully interoperable is rarely (if ever) worth it. Avoiding repeating past mistakes often is. |
I strongly agree! Which is why I am upset that this change was made during the speccing process, when we already have interoperable implementations of as="" that are case-insensitive. |
as="" is far from fully interoperable. Let's not play games. |
??? It's interoperably implemented as case-insensitive in both UAs that implement it at all. |
Okay, what I mean by fully interoperable is implemented in the same way by all browsers. See also |
Implemented in two UAs one way versus implemented in 0 UAs the way you specced it seems like a clear case of the editor overstepping. |
Let's not make it personal? I don't think I even wrote the text here. |
My apologies. My point was more that it goes against the process we are trying to embody. Such speculative changes without implementation interest should be left as PRs, not as part of the spec. |
So, hmm, given that these changes predate our recent adventures in editorial policy, let's treat it like any old decision that we'd like to revisit. None of us have a formula for making trade-offs like this or the data to plug in to it, so we're probably not going to convince one another about what's actually best here. If we just broadcast asking for implementer interest we will probably get none, so how about we try to find some relevant person for each engine to muster an opinion, which could be "don't care." From the list of usual suspects: Chromium: @dominiccooney or @tkent-google? Then, a week or two from now, our editor-in-chief @domenic can make the call if it's still not obvious. Yes? |
It would be good to hear what folks want for as="" and what they want going forward for new attributes. |
Good point, the answer may not be the same. |
Via email from @travisleithead:
|
I want a simpler platform in the future, when possible, and case-insensitive handling makes it more complicated. And there doesn't seem to be a strong reason not to have case-sensitivity here. Given that 'as' is relatively new, using case-sensitive matching there sounds reasonable. My rule of thumb for cases when there isn't a very strong reason for behavior X vs Y is to think about what kind of behavior I'd like to see in the platform in general. And case-sensitivity certainly is such. |
The Blink bug has been wont-fixed, and WebKit continues to maintain case-insensitivity in their implementation. I'd like to close this as 3/4 browsers prefer case-insensitivity. |
Or, I guess, not close this issue, since we still need to fix the spec to be case-insensitive for as="" and match the existing implementations. But close the discussion, and not let it block further work such as #2515. |
Closes #1665 by aligning with other enumerated attributes.
In #1449 @annevk talked us into making as="" for <link> case-sensitive. Part of Anne's argument was that HTML has been inconsistent on this so far.
However, upon prompting from @esprehn, I checked all the other attributes in the HTML spec (using the attributes index). Everything that is comparing against a predefined set of values (including things like MIME types) is treated case-insensitively.
I think introducing this new, inconsistent way of matching attribute values was a mistake. I think we should make as="" a normal enumerated attribute that matches case-insensitively.
/cc @igrigorik @yoavweiss