Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistency when using URLSearchParams to mutate a URL #478

Closed
jasnell opened this issue Apr 24, 2020 · 17 comments
Closed

Inconsistency when using URLSearchParams to mutate a URL #478

jasnell opened this issue Apr 24, 2020 · 17 comments
Labels
needs concrete proposal Moving the issue forward requires someone to figure out a detailed plan needs implementer interest Moving the issue forward requires implementers to express interest topic: api

Comments

@jasnell
Copy link
Collaborator

jasnell commented Apr 24, 2020

See: nodejs/node#33037
Relates to: #18

Because of the differences in URLSearchParams and .search stringification, encoding of the tilde (~) character becomes inconsistent when using URLSearchParam to mutate a URL instance... take the following examples for instance:

const url = new URL('http://httpbin.org/anything?a=~');
url.search = url.searchParams.toString();
// becomes http://httpbin.org/anything?a=%7E

And...

(demonstrated in Node.js, but appears consistent across implementations)

C:\Users\jasne\Projects\tmp>node
Welcome to Node.js v14.0.0.
Type ".help" for more information.
> const u = new URL('http://example.com/?a=~')
undefined
> u.toString()
'http://example.com/?a=~'
> u.searchParams.sort()
undefined
> u.toString()
'http://example.com/?a=%7E'
>

Per #18, there are reasons why URLSearchParam uses different semantics for stringification, and I don't necessarily want to revisit those, but we should likely give consideration to what the expected behavior should be when URLSearchParam is used to mutate a URL instance. Which of the differing encoding semantics should take precedence?

/cc @szmarczak @bnoordhuis @himself65 @Hamper

@annevk annevk added topic: api needs concrete proposal Moving the issue forward requires someone to figure out a detailed plan needs implementer interest Moving the issue forward requires implementers to express interest labels Apr 26, 2020
@jasnell
Copy link
Collaborator Author

jasnell commented Apr 28, 2020

I could work on the proposed spec text here but want to confirm the approach before doing so. What I'd propose for this is that the URL rules be applied when the URLSearchParams associated with a URL (url.searchParams) is stringified for the purpose of mutating that URL. It would be internal to the impl and would not change the user facing toString behavior of URLSearchParams.

@jasnell
Copy link
Collaborator Author

jasnell commented Apr 28, 2020

And as far as implementor interest, I would definitely update Node.js behavior to support the new approach.

@annevk
Copy link
Member

annevk commented Apr 28, 2020

@bakulf @achristensen07 to determine interest from Gecko and WebKit. @domenic do you know who to ask for Chromium?

@domenic
Copy link
Member

domenic commented Apr 28, 2020

Maybe @ricea...

@achristensen07
Copy link
Collaborator

achristensen07 commented Apr 29, 2020

Given that all implementations agree on implementation, we seem to already have achieved the goal of uniformity.
What is the goal of changing this? This isn't the first time we've had complaints about different encoding character sets. What is the history of the character set URLSearchParams encodes and why it's different than the URL query encoding character set? What is the compatibility risk of changing?

@domenic
Copy link
Member

domenic commented Apr 29, 2020

My take on the history (from #18) is that URLSearchParams is misnamed and misused.

It was designed as a kind of application/x-www-form-urlencoded representation manipulator, to represent form data in that form. But instead of being named ApplicationXWWWFormURLEncodedParams, it was misnamed URLSearchParams.

And then things got worse when one got attached to every URL object via the searchParams property, in a way where interacting with the new url.searchParams property would cause results inconsistent with interacting with the URL's search property, because search uses URL query rules instead of application/x-www-form-urlencoded rules.

So although we have cross-browser uniformity, we unfortunately have a massive web developer footgun.

There are a few ways out of this:

  1. Do nothing. If you want to actually manipulate a URL's query string according to URL query rules, you need to do so manually or with a dedicated library, and url.searchParams is not a good fit for you.
  2. (This issue's proposal, IIUC) Make URLSearchParams have an internal "mode", e.g. "application/x-www-form-urlencoded mode" vs. "URL search params mode". url.searchParams can be an instance of URLSearchParams which uses the "URL search params mode", whereas new URLSearchParams() can use the "application/x-www-form-urlencoded mode".
  3. Change URLSearchParams to always operate according to the URL query rules. Optionally, introduce a new object (e.g. ApplicationXWWWFormURLEncodedParams) which can be used for representing form data.
  4. Introduce RealURLSearchParams and url.realSearchParams which behave according to the URL query parser. Tell people to use those and never to use URLSearchParams for URL-related things--only for HTML forms---and to never use url.searchParams at all.
  5. Change browser's URL query string parsers/serializers and their application/x-www-form-urlencoded parsers/serializers to align with each other. This would likely involve changes to both <form> serialization and URL parsing, depending on which is more flexible in a given case.

I have always favored (3), but been unsure if we could get away with it; there are compat risks. The proposal of (2) is novel, but I'm not sure if it's a great idea; it seems like it'd add confusion and I don't know how much compat risk it would really mitigate. (5) would be a great unification and simplification for the platform, but even more risky than (3). Sadly, I think (1) or (4) are the most realistic.

@annevk
Copy link
Member

annevk commented Apr 29, 2020

I think one aspect that's missing is that <form> essentially uses url.searchParams for GET requests so it's not that out there to want to emulate that. And <form> is the origin of the & and = format so to me it actually still makes sense that if you want to use that format you get <form> rules. (Though I agree 5 would be ideal.)

@szmarczak
Copy link

I would just introduce a new object and rename URLSearchParams to URLFormParams. Which is no 3.

Solution no 4 would work too. IMO it's more likely to see this than point 3. I'd name it URLQueryParams.

5 is very unlikely to happen. But that's the solution everybody wants to see I guess.

@jasnell
Copy link
Collaborator Author

jasnell commented Apr 29, 2020

Of the 5, I may be biased since it's my proposal but, ignoring option 1, option 2 has the least likely chance of being breaking (or at least disruptive). Option 3 may have been the better choice if we were starting from scratch but at this point I don't believe it's worth potentially breaking anything with the change in semantics and introducing entirely new API artifacts just does not seem justified. 4 is an Ok option but we'd likely end up implementing it in much the same way we'd implement 2 -- that is, this RealURLSearchParams would essentially just be an alias for the current URLSearchParams with the internal mode set, so I doubt there's any real difference.

@ricea
Copy link

ricea commented Apr 30, 2020

I am in favour of maintaining the status-quo. Changing the standard to a different behaviour when the browsers are already aligned is risky and I think there has to be compelling benefits to justify it. I don't think this meets the bar.

@achristensen07
Copy link
Collaborator

I would be in favor of not changing the status quo here.

@jasnell
Copy link
Collaborator Author

jasnell commented May 4, 2020

Based on the responses I don't think we're going to get changes here, which is understandable. We'll add a clarification to the Node.js docs so folks can be informed on the differences and go from there. Thanks all!

@jasnell jasnell closed this as completed May 4, 2020
@annevk
Copy link
Member

annevk commented May 4, 2020

I'll resolve #18 with some kind of documentation in the URL Standard itself about this.

@domenic
Copy link
Member

domenic commented May 4, 2020

I'd like to leave #18 open to explore alternative (4), since it sounds like implementers are not willing to do any of the others.

@annevk
Copy link
Member

annevk commented May 4, 2020

I think that's best done as a new issue, if there's indeed interest in that. I'm not really convinced that's worth doing.

@szmarczak
Copy link

Can we reopen this issue please? No 4 seems to be the way to go. We can easily add a new feature now and when it gets popular enough, we can just deprecate the searchParams at some point.

@domenic
Copy link
Member

domenic commented May 4, 2020

I think it's underselling it to state that we can "easily" add a new feature in this way. But, I opened #491 to track (4), as requested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs concrete proposal Moving the issue forward requires someone to figure out a detailed plan needs implementer interest Moving the issue forward requires implementers to express interest topic: api
Development

No branches or pull requests

6 participants