-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Percent encoding does not round-trip #523
Comments
See #513 and #518 for some upcoming changes in this area. They don't really change the status quo, but do add percent-encode sets that include U+0025, including one that is an exact match for the one in JavaScript. And I guess we could add a note that it really depends on the caller which of the various percent-encode sets is safe to use. |
I can confirm that the spec and tests are correct, i.e. they match browsers. See https://jsdom.github.io/whatwg-url/#url=aHR0cDovLyUyNURPTUFJTjpmb29iYXJAZm9vZG9tYWluLmNvbS8=&base=YWJvdXQ6Ymxhbms= |
Also add some <div algorithm> wrappers to this section. Closes #523.
For clarity, if your username is |
Also add some <div algorithm> wrappers to this section. Closes #523.
Hi!
I'm trying to implement the current version of the spec in Swift (for non-browser applications. The idea is that it's useful to have a URL type that behaves as your browser behaves and accepts/rejects the same things).
One issue that I've noticed is that the current definition of the "UTF8 percent encode" algorithm doesn't round-trip for strings which contain the percent character. For example, following the algorithm, the string "%100" (UTF8: [37, 49, 48, 48]) is not changed at all when encoding (regardless or the percent-encoding set; none of them contain the "%" character itself). However, decoding that same string using the decoding algorithm in the spec results in the UTF8 sequence [16, 48], or "0" (ASCII 0x10 is the unprintable "data link escape" character).
RFC-3986 warns about this:
Indeed, the
encodeURIComponent
JS function encodes the percent character:I was ready to submit a PR to have the spec's algorithm also do this, but it appears there is an explicit test that percent characters are not escaped (in this case, in the URL's username component): https://github.com/web-platform-tests/wpt/blob/master/url/resources/urltestdata.json#L2372
I'm not sure if this is correct. Usernames are required to be escaped (meaning they must be unescaped to recover their original value), as in the following example:
However, unescaping the string "%25DOMAIN" as the test expects would not recover the original and result in a different username - "%DOMAIN".
Can anybody confirm the correct behaviour? I think we should be escaping the % character, and that the test is incorrect. If that's not the case (and the current behaviour is correct), this could use a note or warning in the spec.
The text was updated successfully, but these errors were encountered: