Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Unable to override QueryString.unescape #4055

Closed
jhnlsn opened this issue Sep 25, 2012 · 8 comments
Closed

Unable to override QueryString.unescape #4055

jhnlsn opened this issue Sep 25, 2012 · 8 comments

Comments

@jhnlsn
Copy link

jhnlsn commented Sep 25, 2012

var querystring = require('querystring');

querystring.unescape = function(a,b) { console.log('unescape no more'); }

querystring.parse('t=asdf&u=john+nelson@gmail.com');

The main thing that im trying to do is NOT have the + replaced with a space, the only apparent way to do that is to override unescape.

let me know if there is a way to do this as its just not working for me.

thanks

@rlidwka
Copy link

rlidwka commented Sep 25, 2012

Are you sure this action won't break any other libraries? I'd really suggest to clone querystring.js and make appropriate changes there using it as your own module...

@bnoordhuis
Copy link
Member

Can someone review bnoordhuis/node@9a7b6c0? It adds a decodeSpaces option to querystring.parse().

@langpavel
Copy link

@bnoordhuis @rlidwka Nothing to review. Test case is attached.

Is there any standard to compare with?

@coltrane
Copy link

@bnoordhuis, addition of the decodeSpaces option LGTM.

But I think the real issue here is that overriding querystring.unescape appears to have no effect on the behavior of parse? Even though the docs say:

## querystring.unescape
The unescape function used by `querystring.parse`,
provided so that it could be overridden if necessary.

Looking at the code... It's apparent that unescape is only called if decodeURIComponent throws; which will only happen if the uri contains invalid UTF-8 byte sequences.

IMO we should either: eliminate the calls to decodeURIComponent all together, or move that try/catch logic out of parse, and down into unescape. Something like this would work:

QueryString.unescape = function(s, decodeSpaces) {
  if (!decodeSpaces) {
    try {
      return decodeURIComponent(s);
    } catch (e) { }
  }
  return QueryString.unescapeBuffer(s, decodeSpaces).toString()
};

...either way, the effect is that unescape will always be called by parse, so overriding unescape will work as expected for all inputs.

@bnoordhuis
Copy link
Member

Actually, I think it's pretty silly that we suggest to monkey-patch a global function, I don't know why that was ever added to the docs. I'm going to remove that unless someone speaks up.

@jhnlsn
Copy link
Author

jhnlsn commented Sep 28, 2012

I don't know about those other guys and their thread hijacking, but my initial request doesn't seem all that unreasonable.

I get that you didn't intend for the function to be overrideable in the first place, so take that out of the docs. But then why even have that decodeSpaces parameter on the unescapeBuffer function in the first place?

You have that try catch setup to use decodeURIComponent first, which doesn't touch a + symbol then you have the second catch function that does. I believe that is a bit inconsistent behavior, to ONLY decode the + if there is an exception in the first try?

I can't stand that we have to put this hack into javascript to fix problems with PHP and others that use a stupid + to encode a space! According to http://www.ietf.org/rfc/rfc3986.txt section 2.2, the "+" symbol is reserved and thus should not be touched when decoding... unless im totally reading that wrong...

@jhnlsn
Copy link
Author

jhnlsn commented Sep 28, 2012

Either way this can just be closed, ive worked around this by double encoding all my values that might contain a +

@coltrane
Copy link

@bnoordhuis +1 for the doc change. good call.

I, too, was surprised to find that we "decode" + by default; and also that our decoder is so permissive of garbage-ish input:

a+b  -->  a b      // by default (options.decodeSpaces == true) 
%zx  -->  %zx      // non-conforming percent escapes are passed-through unchanged
%f0  -->  \uFFFD   // invalid UTF-8 sequences are replaced with the Unicode Replacement Char

Might be a good idea to make a note of these points in the documentation. In lieu of explicit docs, I think most folks would assume that decoding follows the same rules as decodeURIComponent.

mscdex pushed a commit to mscdex/node that referenced this issue Dec 25, 2014
Documentation states that `querystring.unescape` may be overridden to
replace unescaper during parsing. However, the function was only
being used as a fallback for when the native decoder throws (on a
malformed URL). This patch moves the call to the native function and
the try/catch around it into querystring.unescape then has the parser
always invoke it, so that an override will always be used.

Fixes nodejs#4055

Reviewed-By: Fedor Indutny <fedor@indutny.com>
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants