-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use empty host for file URL's #260
Conversation
base URL's host also must be checked it is empty/not empty string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall, but many nits.
url.bs
Outdated
<dd><p>a <a>path-absolute-non-Windows-file-URL string</a> if <a>base URL</a>'s <a for=url>host</a> | ||
is non-null | ||
is non-empty string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not always a string though. It could be an IPv4 address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should probably just say not an empty host.
url.bs
Outdated
@@ -254,7 +254,7 @@ point <a for=/>URLs</a> from <var>A</var> can come from untrusted sources. | |||
<h3 id=host-representation>Host representation</h3> | |||
|
|||
<p>A <dfn export id=concept-host>host</dfn> is a <a>domain</a>, an | |||
<a>IPv4 address</a>, an <a>IPv6 address</a>, or an <a>opaque host</a>. Typically a <a for=/>host</a> | |||
<a>IPv4 address</a>, an <a>IPv6 address</a>, an <a>opaque host</a>, or an <a>empty host</a>. Typically a <a for=/>host</a> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't meet the 100 column wrapping requirements.
url.bs
Outdated
@@ -280,12 +280,14 @@ eight <dfn id=concept-ipv6-piece lt='IPv6 piece'>16-bit pieces</dfn>. | |||
<p class="note">Support for <code><zone_id></code> is | |||
<a href="https://www.w3.org/Bugs/Public/show_bug.cgi?id=27234#c2">intentionally omitted</a>. | |||
|
|||
<p>An <dfn export>opaque host</dfn> is an <a>ASCII string</a> holding data that can be used for | |||
<p>An <dfn export>opaque host</dfn> is a non-empty <a>ASCII string</a> holding data that can be used for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, I think.
url.bs
Outdated
@@ -768,7 +770,7 @@ no purpose other than being a location the algorithm can jump to. | |||
<a>IPv6 serializer</a> on <var>host</var>, | |||
followed by "<code>]</code>". | |||
|
|||
<li><p>Otherwise, <var>host</var> is a <a>domain</a> or <a>opaque host</a>, return <var>host</var>. | |||
<li><p>Otherwise, <var>host</var> is a <a>domain</a>, an <a>opaque host</a>, or an <a>empty host</a>, return <var>host</var>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And here.
url.bs
Outdated
@@ -1172,9 +1174,9 @@ switching on <a>base URL</a>'s <a for=url>scheme</a>: | |||
<dd><p>a <a>path-relative-scheme-less-URL string</a> | |||
<dt>"<code>file</code>" | |||
<dd><p>a <a>scheme-relative-file-URL string</a> | |||
<dd><p>a <a>path-absolute-URL string</a> if <a>base URL</a>'s <a for=url>host</a> is null | |||
<dd><p>a <a>path-absolute-URL string</a> if <a>base URL</a>'s <a for=url>host</a> is either null or the empty string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should say "an empty host" I think. A base URL is a URL record after all. But also, I thought the idea was that for file URLs the host could never be null so then we should leave that out too.
url.bs
Outdated
@@ -1199,7 +1201,7 @@ optionally followed by a <a>path-absolute-URL string</a>. | |||
<a>path-absolute-URL string</a>. | |||
|
|||
<p>An <dfn export>opaque-host-and-port string</dfn> must be either an empty | |||
<a>valid opaque-host string</a> or: a non-empty <a>valid opaque-host string</a>, optionally followed | |||
string or: a <a>valid opaque-host string</a>, optionally followed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's "the empty string" (and an empty host).
url.bs
Outdated
@@ -2066,10 +2068,10 @@ string <var>input</var>, optionally with a <a>base URL</a> <var>base</var>, opti | |||
<a>Windows drive letter</a>, then: | |||
|
|||
<ol> | |||
<li><p>If <var>url</var>'s <a for=url>host</a> is non-null, | |||
<li><p>If <var>url</var>'s <a for=url>host</a> is not the empty string, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here again we're talking about a URL record so we should say "not an empty host". Same below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both an empty host and an opaque host are strings by definition. So maybe it's fine to set host to the empty string in the parser? For example the opaque host parser returns string, but not an opaque host record.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair, I guess some level of implicit casting is probably okay.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The other thing I'm wondering about is if we should add some kind of explanatory text. That file URLs can have empty or non-opaque hosts, non-file-special URLs can have non-empty-non-opaque hosts, and non-special URLs can have empty, opaque, or null hosts. Such a note should probably go where we define the host field of URLs.
url.bs
Outdated
@@ -383,7 +385,7 @@ up to three <a>ASCII digits</a> per sequence, each representing a decimal number | |||
|
|||
XXX should we define the format inline instead just like STD 66? --> | |||
|
|||
<p>An <dfn export>valid opaque-host string</dfn> must be zero or more <a>URL units</a> or: | |||
<p>An <dfn export>valid opaque-host string</dfn> must be one or more <a>URL units</a> or: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you change "An" to "A" here while we're changing this? I can also do it before merging.
url.bs
Outdated
@@ -768,7 +770,8 @@ no purpose other than being a location the algorithm can jump to. | |||
<a>IPv6 serializer</a> on <var>host</var>, | |||
followed by "<code>]</code>". | |||
|
|||
<li><p>Otherwise, <var>host</var> is a <a>domain</a> or <a>opaque host</a>, return <var>host</var>. | |||
<li><p>Otherwise, <var>host</var> is a <a>domain</a>, an <a>opaque host</a>, or an <a>empty | |||
host</a>, return <var>host</var>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it reads more naturally if you leave out the "an". So "A domain, opaque host, or empty host".
url.bs
Outdated
@@ -1172,9 +1175,10 @@ switching on <a>base URL</a>'s <a for=url>scheme</a>: | |||
<dd><p>a <a>path-relative-scheme-less-URL string</a> | |||
<dt>"<code>file</code>" | |||
<dd><p>a <a>scheme-relative-file-URL string</a> | |||
<dd><p>a <a>path-absolute-URL string</a> if <a>base URL</a>'s <a for=url>host</a> is null | |||
<dd><p>a <a>path-absolute-URL string</a> if <a>base URL</a>'s <a for=url>host</a> is an <a>empty | |||
host</a> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No newlines inside phrasing/inline elements.
url.bs
Outdated
<a>validation error</a>. | ||
|
||
<li><p>Set <var>url</var>'s <a for=url>host</a> to null and replace the second | ||
<li><p>Set <var>url</var>'s <a for=url>host</a> to the empty string and replace the second |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is >100 columns. "second" needs to go to the next line.
Maybe explanatory text is good idea, I don't know. But I would like to note, what file URL still can have null host (two invalid URL's cases: Also I think about such note below the empty host definition: "An empty host is used by non-special and file URLs." |
I guess there's no real point trying to make those invalid cases consistent. And you're right about where the note should go. Although maybe we should have it in both places eventually, once we're certain this design can be widely implemented (i.e., once two browsers ship it). |
I think the such table is clearer than a long explanatory text #260 (review), #260 (comment):
Then maybe no need for additional notes. What do you think? |
And have that below the definition of URL's host? I think that's reasonable. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple nits left. I'll also push a commit to adjust the formatting of the table a bit.
url.bs
Outdated
<p>An <dfn export>opaque host</dfn> is an <a>ASCII string</a> holding data that can be used for | ||
further processing. | ||
<p>An <dfn export>opaque host</dfn> is a non-empty <a>ASCII string</a> holding data that can be used | ||
for further processing. | ||
|
||
<p class="note no-backref">An <a>opaque host</a> is only used by <a lt="is special">non-special</a> | ||
<a for=/>URLs</a>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think either we should have a note like this for empty hosts too or we should drop this one, no?
url.bs
Outdated
<th><a>opaque host</a> | ||
<th><a>empty host</a> | ||
<th>null | ||
<tr><td>non-file <a lt="special scheme">special</a><td>✅<td>✅<td>✅<td><td><td> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
non-"<code>file</code>"
I ended up fixing the nits, let me know what you think. |
if host is null, then no validation error and don't set host to ""
I think your changes are OK. Yet I have changed the "Windows drive letter quirk" in the path state, to be more logical (no validation error if host null) and consistent (null host is left the null): For example, now: Now LGTM. |
Looks okay, blocked on speced/bikeshed#941 now. |
To confirm, this only impacts validation, not parsing/serializing? |
Yeah, this should not have changed either of those (other than logging validation errors). |
Yes, fixes validation bug. Parsing: |
Follows whatwg/url#260. This does not impact parsing, only validation and the URL record representation.
Note that there is no test coverage for the case introduced here: in if (this.url.host !== "" && this.url.host !== null) {
this.parseError = true;
this.url.host = "";
} the condition never evaluates to true so those two lines are never reached. Maybe a coverage gap, maybe a spec bug. |
@domenic test case, which covers this code is in previous comment: #260 (comment): It looks like WPT lacks this. I can prepare PR. |
Tests windows drive letter quirk in the path state: https://url.spec.whatwg.org/#path-state Fixes whatwg/url#272 (whatwg/url#260 (comment))
- Update to spec - Add opaque hosts - File state did not correctly deal with lack of base URL - Cleanup API for file and non-special URLs - Allow % and IPv6 addresses in non-special URL hosts - Use specific names for percent-encode sets - Add empty host concept for file and non-special URLs - Clarify IPv6 serializer - Fix existing mistakes - Add missing ':' to forbidden host code point list. - Correct IPv4 parser empty label behavior - Maintain type equivalence in URLContext with spec - scheme, username, and password should always be strings - host, port, query, and fragment may be strings or null - Align scheme state more closely with the spec - Make sure the `special` variable is always synced with URL_FLAG_SPECIAL. PR-URL: nodejs#12523 Fixes: nodejs#10608 Fixes: nodejs#10634 Refs: whatwg/url#185 Refs: whatwg/url#225 Refs: whatwg/url#224 Refs: whatwg/url#218 Refs: whatwg/url#243 Refs: whatwg/url#260 Refs: whatwg/url#268 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Daijiro Wachi <daijiro.wachi@gmail.com> Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
- Update to spec - Add opaque hosts - File state did not correctly deal with lack of base URL - Cleanup API for file and non-special URLs - Allow % and IPv6 addresses in non-special URL hosts - Use specific names for percent-encode sets - Add empty host concept for file and non-special URLs - Clarify IPv6 serializer - Fix existing mistakes - Add missing ':' to forbidden host code point list. - Correct IPv4 parser empty label behavior - Maintain type equivalence in URLContext with spec - scheme, username, and password should always be strings - host, port, query, and fragment may be strings or null - Align scheme state more closely with the spec - Make sure the `special` variable is always synced with URL_FLAG_SPECIAL. PR-URL: #12523 Fixes: #10608 Fixes: #10634 Refs: whatwg/url#185 Refs: whatwg/url#225 Refs: whatwg/url#224 Refs: whatwg/url#218 Refs: whatwg/url#243 Refs: whatwg/url#260 Refs: whatwg/url#268 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Daijiro Wachi <daijiro.wachi@gmail.com> Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
- Update to spec - Add opaque hosts - File state did not correctly deal with lack of base URL - Cleanup API for file and non-special URLs - Allow % and IPv6 addresses in non-special URL hosts - Use specific names for percent-encode sets - Add empty host concept for file and non-special URLs - Clarify IPv6 serializer - Fix existing mistakes - Add missing ':' to forbidden host code point list. - Correct IPv4 parser empty label behavior - Maintain type equivalence in URLContext with spec - scheme, username, and password should always be strings - host, port, query, and fragment may be strings or null - Align scheme state more closely with the spec - Make sure the `special` variable is always synced with URL_FLAG_SPECIAL. PR-URL: #12523 Fixes: #10608 Fixes: #10634 Refs: whatwg/url#185 Refs: whatwg/url#225 Refs: whatwg/url#224 Refs: whatwg/url#218 Refs: whatwg/url#243 Refs: whatwg/url#260 Refs: whatwg/url#268 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Daijiro Wachi <daijiro.wachi@gmail.com> Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
- Update to spec - Add opaque hosts - File state did not correctly deal with lack of base URL - Cleanup API for file and non-special URLs - Allow % and IPv6 addresses in non-special URL hosts - Use specific names for percent-encode sets - Add empty host concept for file and non-special URLs - Clarify IPv6 serializer - Fix existing mistakes - Add missing ':' to forbidden host code point list. - Correct IPv4 parser empty label behavior - Maintain type equivalence in URLContext with spec - scheme, username, and password should always be strings - host, port, query, and fragment may be strings or null - Align scheme state more closely with the spec - Make sure the `special` variable is always synced with URL_FLAG_SPECIAL. PR-URL: #12523 Fixes: #10608 Fixes: #10634 Refs: whatwg/url#185 Refs: whatwg/url#225 Refs: whatwg/url#224 Refs: whatwg/url#218 Refs: whatwg/url#243 Refs: whatwg/url#260 Refs: whatwg/url#268 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Daijiro Wachi <daijiro.wachi@gmail.com> Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
- Update to spec - Add opaque hosts - File state did not correctly deal with lack of base URL - Cleanup API for file and non-special URLs - Allow % and IPv6 addresses in non-special URL hosts - Use specific names for percent-encode sets - Add empty host concept for file and non-special URLs - Clarify IPv6 serializer - Fix existing mistakes - Add missing ':' to forbidden host code point list. - Correct IPv4 parser empty label behavior - Maintain type equivalence in URLContext with spec - scheme, username, and password should always be strings - host, port, query, and fragment may be strings or null - Align scheme state more closely with the spec - Make sure the `special` variable is always synced with URL_FLAG_SPECIAL. PR-URL: #12523 Fixes: #10608 Fixes: #10634 Refs: whatwg/url#185 Refs: whatwg/url#225 Refs: whatwg/url#224 Refs: whatwg/url#218 Refs: whatwg/url#243 Refs: whatwg/url#260 Refs: whatwg/url#268 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Daijiro Wachi <daijiro.wachi@gmail.com> Reviewed-By: Joyee Cheung <joyeec9h3@gmail.com>
Summary of changes:
Preview | Diff