From ff5ab1831bc826b36ff2cdb5a8da0260fd3e5c60 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Rimas=20Misevi=C4=8Dius?=
Date: Fri, 24 Feb 2017 10:57:13 +0200
Subject: [PATCH 1/9] uses empty host instead of null in path state

Fixes #258.
---
 url.bs | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/url.bs b/url.bs
index 0d23edfd..7a6d29e0 100644
--- a/url.bs
+++ b/url.bs
@@ -2066,10 +2066,10 @@ string input, optionally with a base URL base, opti
 Windows drive letter, then:
-If url's host is non-null,
+If url's host is not the empty string,
 validation error.
-Set url's host to null and replace the second
+Set url's host to the empty string and replace the second
 code point in buffer with ":".

From ac38a3af5d11d6d951dfc0c090986108e75f594e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Rimas=20Misevi=C4=8Dius?=
Date: Fri, 24 Feb 2017 12:23:15 +0200
Subject: [PATCH 2/9] fixes "relative-URL string" definition for "file"

base URL's host also must be checked it is empty/not empty string
---
 url.bs | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/url.bs b/url.bs
index 7a6d29e0..dca7ec58 100644
--- a/url.bs
+++ b/url.bs
@@ -1172,9 +1172,9 @@ switching on base URL's scheme:
 a path-relative-scheme-less-URL string
 "file"
 a scheme-relative-file-URL string
-a path-absolute-URL string if base URL's host is null
+a path-absolute-URL string if base URL's host is either null or the empty string
 a path-absolute-non-Windows-file-URL string if base URL's host
-is non-null
+is non-empty string
 a path-relative-scheme-less-URL string
 Otherwise
 a scheme-relative-URL string

From 98ec6375f3029fea121d8656ad356068a135956b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Rimas=20Misevi=C4=8Dius?=
Date: Fri, 24 Feb 2017 12:40:21 +0200
Subject: [PATCH 3/9] introduces "empty host" term

---
 url.bs | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/url.bs b/url.bs
index dca7ec58..0b347b2b 100644
--- a/url.bs
+++ b/url.bs
@@ -254,7 +254,7 @@ point URLs from A can come from untrusted sources.

 Host representation
 A host is a domain, an
-IPv4 address, an IPv6 address, or an opaque host. Typically a host
+IPv4 address, an IPv6 address, an opaque host, or an empty host. Typically a host
 serves as a network address, but it is sometimes used as opaque identifier in URLs
 where a network address is not necessary.
@@ -286,6 +286,8 @@ further processing.
 An opaque host is only used by non-special
 URLs.
+An empty host is the empty string.
+
 Host miscellaneous
@@ -768,7 +770,7 @@ no purpose other than being a location the algorithm can jump to.
 IPv6 serializer on host, followed by "]".
-Otherwise, host is a domain or opaque host, return host.
+Otherwise, host is a domain, an opaque host, or an empty host, return host.
 The IPv4 serializer takes an

From f91dc1b070afce0136bb21b980548d70342727de Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Rimas=20Misevi=C4=8Dius?=
Date: Fri, 24 Feb 2017 13:30:58 +0200
Subject: [PATCH 4/9] opaque-host is a non-empty string

---
 url.bs | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/url.bs b/url.bs
index 0b347b2b..134410cd 100644
--- a/url.bs
+++ b/url.bs
@@ -280,7 +280,7 @@ eight 16-bit pieces.

 Support for <zone_id> is intentionally omitted.
-An opaque host is an ASCII string holding data that can be used for
+An opaque host is a non-empty ASCII string holding data that can be used for
 further processing.
 An opaque host is only used by non-special
@@ -385,7 +385,7 @@ up to three ASCII digits per sequence, each representing a decimal number
 XXX should we define the format inline instead just like STD 66? -->
-An valid opaque-host string must be zero or more URL units or:
+An valid opaque-host string must be one or more URL units or:
 "[", followed by a valid IPv6-address string, followed by "]".
 This is not part of the definition of valid host string as it
@@ -1201,7 +1201,7 @@ optionally followed by a path-absolute-URL string.
 path-absolute-URL string.
 An opaque-host-and-port string must be either an empty
-valid opaque-host string or: a non-empty valid opaque-host string, optionally followed
+string or: a valid opaque-host string, optionally followed
 by ":" and a URL-port string.
 A scheme-relative-file-URL string must be

From 03d242471ac888dfabddb7e46cab762c4eb45050 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Rimas=20Misevi=C4=8Dius?=
Date: Fri, 24 Feb 2017 16:49:19 +0200
Subject: [PATCH 5/9] formatting nits and more "empty host" use

---
 url.bs | 20 +++++++++++---------
 1 file changed, 11 insertions(+), 9 deletions(-)

diff --git a/url.bs b/url.bs
index 134410cd..8de65cc1 100644
--- a/url.bs
+++ b/url.bs
@@ -254,9 +254,9 @@ point URLs from A can come from untrusted sources.

 Host representation
 A host is a domain, an
-IPv4 address, an IPv6 address, an opaque host, or an empty host. Typically a host
-serves as a network address, but it is sometimes used as opaque identifier in URLs
-where a network address is not necessary.
+IPv4 address, an IPv6 address, an opaque host, or an empty host.
+Typically a host serves as a network address, but it is sometimes used as opaque
+identifier in URLs where a network address is not necessary.
 The RFCs referenced in the paragraphs below are for informative purposes only. They have no
 influence on host writing, parsing, and serialization. Unless stated otherwise
@@ -280,8 +280,8 @@ eight 16-bit pieces.
 Support for <zone_id> is intentionally omitted.
-An opaque host is a non-empty ASCII string holding data that can be used for
-further processing.
+An opaque host is a non-empty ASCII string holding data that can be used
+for further processing.
 An opaque host is only used by non-special
 URLs.
@@ -770,7 +770,8 @@ no purpose other than being a location the algorithm can jump to.
 IPv6 serializer on host, followed by "]".
-Otherwise, host is a domain, an opaque host, or an empty host, return host.
+Otherwise, host is a domain, an opaque host, or an empty
+host, return host.
 The IPv4 serializer takes an
@@ -1174,9 +1175,10 @@ switching on base URL's scheme:
 a path-relative-scheme-less-URL string
 "file"
 a scheme-relative-file-URL string
-a path-absolute-URL string if base URL's host is either null or the empty string
+a path-absolute-URL string if base URL's host is an empty
+host
 a path-absolute-non-Windows-file-URL string if base URL's host
-is non-empty string
+is not an empty host
 a path-relative-scheme-less-URL string
 Otherwise
 a scheme-relative-URL string
@@ -1200,7 +1202,7 @@ optionally followed by a path-absolute-URL string.
 "//", followed by an opaque-host-and-port string, optionally followed by a
 path-absolute-URL string.
-An opaque-host-and-port string must be either an empty
+An opaque-host-and-port string must be either the empty
 string or: a valid opaque-host string, optionally followed
 by ":" and a URL-port string.

From 4335e76bde352ad886967c8b7a99269b4009e6ad Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Rimas=20Misevi=C4=8Dius?=
Date: Mon, 27 Feb 2017 15:44:01 +0200
Subject: [PATCH 6/9] fixes some typos and wrapping

---
 url.bs | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/url.bs b/url.bs
index 8de65cc1..051613b7 100644
--- a/url.bs
+++ b/url.bs
@@ -385,7 +385,7 @@ up to three ASCII digits per sequence, each representing a decimal number
 XXX should we define the format inline instead just like STD 66? -->
-An valid opaque-host string must be one or more URL units or:
+A valid opaque-host string must be one or more URL units or:
 "[", followed by a valid IPv6-address string, followed by "]".
 This is not part of the definition of valid host string as it
@@ -770,8 +770,8 @@ no purpose other than being a location the algorithm can jump to.
 IPv6 serializer on host, followed by "]".
-Otherwise, host is a domain, an opaque host, or an empty
-host, return host.
+Otherwise, host is a domain, opaque host, or empty host,
+return host.
 The IPv4 serializer takes an
@@ -1175,8 +1175,8 @@ switching on base URL's scheme:
 a path-relative-scheme-less-URL string
 "file"
 a scheme-relative-file-URL string
-a path-absolute-URL string if base URL's host is an empty
-host
+a path-absolute-URL string if base URL's host is an
+empty host
 a path-absolute-non-Windows-file-URL string if base URL's host
 is not an empty host
 a path-relative-scheme-less-URL string
@@ -2073,8 +2073,8 @@ string input, optionally with a base URL base, opti
 If url's host is not the empty string,
 validation error.
-Set url's host to the empty string and replace the second
-code point in buffer with ":".
+Set url's host to the empty string and replace the
+second code point in buffer with ":".
 This is a (platform-independent) Windows drive letter quirk.

From eb71fd1a0aede8e982167eeaf351b35ddab445b9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Rimas=20Misevi=C4=8Dius?=
Date: Fri, 3 Mar 2017 17:33:00 +0200
Subject: [PATCH 7/9] =?UTF-8?q?Add=20a=20table=20with=20allowed=20URL?=
 =?UTF-8?q?=E2=80=99s=20scheme=20/=20host=20combinations?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

---
 url.bs | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/url.bs b/url.bs
index 051613b7..34027f36 100644
--- a/url.bs
+++ b/url.bs
@@ -1005,6 +1005,25 @@ It is initially the empty string.

 A URL's host is null or a host. It is initially null.
+
+The following table lists allowed URL's scheme /
+host combinations.
+
+URL's scheme | URL's host
+domain | IPv4 address | IPv6 address | opaque host | empty host | null
+non-file special
+"file" ✅
+non-special ✅
+
 A URL's port is either null or a 16-bit unsigned integer that identifies a networking port. It is initially null.

From 19f5a624b069b7dc3c0abccaf134f416a7e9266c Mon Sep 17 00:00:00 2001
From: Anne van Kesteren
Date: Mon, 6 Mar 2017 11:29:08 +0100
Subject: [PATCH 8/9] remove note, adjust table

---
 url.bs | 48 ++++++++++++++++++++++++++++++++++--------------
 1 file changed, 34 insertions(+), 14 deletions(-)

diff --git a/url.bs b/url.bs
index 34027f36..418172f0 100644
--- a/url.bs
+++ b/url.bs
@@ -283,9 +283,6 @@ eight 16-bit pieces.

 An opaque host is a non-empty ASCII string holding data that can be used
 for further processing.
-An opaque host is only used by non-special
-URLs.
-
 An empty host is the empty string.
@@ -1010,17 +1007,40 @@ It is initially the empty string.
 host combinations.
-URL's scheme | URL's host
-domain | IPv4 address | IPv6 address | opaque host | empty host | null
-non-file special
-"file" ✅
-non-special ✅
+scheme | host
+domain | IPv4 address | IPv6 address | opaque host | empty host | null
+non-"file" special | ✅ | ✅ | ✅ | ❌ | ❌ | ❌
+"file" | ✅ | ✅ | ✅ | ❌ | ✅ | ✅
+non-special | ❌ | ❌ | ✅ | ✅ | ✅

From 38932229149fdd2f43ab7ca7ae914a785ada9a9b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Rimas=20Misevi=C4=8Dius?=
Date: Mon, 6 Mar 2017 16:34:11 +0200
Subject: [PATCH 9/9] Fix "Windows drive letter quirk" in the path state

if host is null, then no validation error and don't set host to ""
---
 url.bs | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/url.bs b/url.bs
index 418172f0..b1f5fbc8 100644
--- a/url.bs
+++ b/url.bs
@@ -2109,11 +2109,10 @@ string input, optionally with a base URL base, opti
 Windows drive letter, then:

-If url's host is not the empty string,
-validation error.
+If url's host is neither the empty string nor null,
+validation error, set url's host to the empty string.
-Set url's host to the empty string and replace the
-second code point in buffer with ":".
+Replace the second code point in buffer with ":".
 This is a (platform-independent) Windows drive letter quirk.
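
Reviewer note: the path-state step as it reads after patch 9 can be sketched roughly as below. This is only an illustration under stated assumptions, not the spec's algorithm text. The `Url` class, `is_windows_drive_letter`, `apply_drive_letter_quirk`, and the `errors` list are hypothetical names, and the drive-letter test (an ASCII alpha followed by ":" or "|") is assumed from the surrounding spec rather than shown in this series. The enclosing conditions on the scheme and path are omitted.

```python
import string
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Url:
    # Hypothetical stand-in for the spec's URL record; only the fields used here.
    scheme: str = "file"
    host: Optional[str] = None  # None models the spec's null host
    path: List[str] = field(default_factory=list)


def is_windows_drive_letter(buffer: str) -> bool:
    # Assumed definition: two code points, an ASCII alpha followed by ":" or "|".
    return (
        len(buffer) == 2
        and buffer[0] in string.ascii_letters
        and buffer[1] in ":|"
    )


def apply_drive_letter_quirk(url: Url, buffer: str, errors: List[str]) -> str:
    """Return buffer after the quirk step, appending any validation error to errors."""
    if not is_windows_drive_letter(buffer):
        return buffer
    # Patch 9: a null host is left alone and raises no validation error;
    # only a host that is neither null nor the empty string is reset.
    if url.host is not None and url.host != "":
        errors.append("validation error: non-empty host before Windows drive letter")
        url.host = ""
    # Replace the second code point in buffer with ":" (so "C|" becomes "C:").
    return buffer[0] + ":"
```

Compared with patches 1 and 6, where any host other than the empty string (including null) raised a validation error and was overwritten, this version leaves a null host untouched.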