This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Improve URL parsing speed by 300% #8638


CGavrila

At the moment, the url.parse() function escapes every character in its list of escapable characters without first checking whether that character appears in the URL. This is unnecessary work and it degrades performance.

The reason I started looking into this was commit 17a379, which led to a slight performance drop because it doubled the list of escapable characters. That change is a legitimate security fix. However, after digging into it a bit, it became clear that the code could be made faster without affecting security.

My patch simply checks whether the character that needs escaping occurs in the URL before encoding it and replacing it with the safe form. The only change is this if-statement: if (rest.indexOf(ae) !== -1).
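For readers who do not have lib/url.js open, here is a minimal sketch of the idea, not the actual patch. The names rest and ae come from the description above; the helper escapeUnsafe, the shortened autoEscape list, and the manual percent-encoding fallback are assumptions made for illustration.

```javascript
'use strict';

// Assumption: an abbreviated list of characters to escape. The real list
// in lib/url.js is longer.
var autoEscape = ['{', '}', '|', '^', '`', '<', '>', '"', ' ', "'"];

// Hypothetical helper that isolates the escaping loop.
function escapeUnsafe(rest) {
  for (var i = 0; i < autoEscape.length; i++) {
    var ae = autoEscape[i];
    // The guard described above: only do the expensive split/join when
    // the character actually occurs in the remaining URL text.
    if (rest.indexOf(ae) !== -1) {
      var esc = encodeURIComponent(ae);
      if (esc === ae) {
        // encodeURIComponent leaves some characters (such as ') untouched,
        // so percent-encode those by hand in this sketch.
        esc = '%' + ae.charCodeAt(0).toString(16).toUpperCase();
      }
      rest = rest.split(ae).join(esc);
    }
  }
  return rest;
}

console.log(escapeUnsafe('/a b?q={x}')); // prints "/a%20b?q=%7Bx%7D"
```

For a URL that contains none of the listed characters, the loop body reduces to one indexOf call per character, which is where the saving would come from.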

On average, this should amount to a ~3x performance increase.

Results

var ITERATIONS = 100000;

var url = require('url');

var urls = [
  'http://nodejs.org/docs/latest/api/url.html#url_url_format_urlobj',
  'http://blog.nodejs.org/',
  'https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en',
  'javascript:alert("node is awesome");',
  'some.ran/dom/url.thing?oh=yes#whoo'
];

var parsedLink;

// Parse every URL in the list ITERATIONS times.
for (var i = 0; i < ITERATIONS; i++) {
  for (var j = 0; j < urls.length; j++) {
    parsedLink = url.parse(urls[j]);
  }
}

Running the code above (inspired by the node benchmarks) on the current v0.12 HEAD (or any version since approx. v0.4.x), with and without the one-line fix, gives:

Without fix

cristian@fatcat:~/work/node$ time ./node test.js

real    0m10.552s
user    0m10.525s
sys     0m0.020s

With fix

cristian@fatcat:~/work/node$ time ./node test.js

real    0m3.124s
user    0m3.124s
sys     0m0.008s
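
As a quick sanity check on the "~3x" figure, the ratio of the two wall-clock times above can be computed directly. This is only arithmetic on the reported numbers; the variable names are made up for illustration.

```javascript
'use strict';

// Wall-clock ("real") times reported above, in seconds.
var withoutFix = 10.552;
var withFix = 3.124;

// Speedup factor and relative reduction in run time.
var speedup = withoutFix / withFix;
var reduction = (1 - withFix / withoutFix) * 100;

console.log(speedup.toFixed(2) + 'x faster');            // prints "3.38x faster"
console.log(reduction.toFixed(1) + '% less wall time');  // prints "70.4% less wall time"
```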

Testing

All the unit tests that ship with Node and pass without the fix also pass with the fix. In addition, placing a console.log(parsedLink); at the end of the microbenchmark above produces identical output for both versions.
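
The same kind of equivalence check can be sketched outside of Node's test suite. The snippet below is not part of the patch: it compares a guarded and an unguarded version of a simplified escaping loop on the benchmark URLs plus one extra input. The escapeAll and escapeChar helpers, the shortened autoEscape list, and the extra sample are assumptions for illustration.

```javascript
'use strict';

// Assumption: an abbreviated list of characters to escape.
var autoEscape = ['{', '}', '|', '^', '`', '<', '>', '"', ' ', "'"];

// Hypothetical helper: percent-encode a single character.
function escapeChar(ae) {
  var esc = encodeURIComponent(ae);
  return esc === ae ? '%' + ae.charCodeAt(0).toString(16).toUpperCase() : esc;
}

// Hypothetical helper: run the escaping loop with or without the guard.
function escapeAll(rest, guarded) {
  for (var i = 0; i < autoEscape.length; i++) {
    var ae = autoEscape[i];
    if (guarded && rest.indexOf(ae) === -1) continue;
    rest = rest.split(ae).join(escapeChar(ae));
  }
  return rest;
}

var samples = [
  'http://nodejs.org/docs/latest/api/url.html#url_url_format_urlobj',
  'http://blog.nodejs.org/',
  'https://encrypted.google.com/search?q=url&q=site:npmjs.org&hl=en',
  'javascript:alert("node is awesome");',
  'some.ran/dom/url.thing?oh=yes#whoo',
  'a b{c}' // extra sample containing several escapable characters
];

// Collect every input for which the two variants disagree.
var mismatches = samples.filter(function (s) {
  return escapeAll(s, true) !== escapeAll(s, false);
});

console.log(mismatches.length === 0 ? 'outputs identical' : mismatches);
```

Splitting and joining on a character that does not occur returns the string unchanged, so the guard can only skip no-op work. That is why the two variants are expected to agree.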

The url.parse() function now checks whether an escapable character is in the URL before trying to escape it.
bnoordhuis pushed a commit to bnoordhuis/io.js that referenced this pull request Dec 6, 2014
The url.parse() function now checks whether an escapable character is
in the URL before trying to escape it.

PR-URL: nodejs/node-v0.x-archive#8638
Reviewed-By: Ben Noordhuis <info@bnoordhuis.nl>
bnoordhuis added a commit to bnoordhuis/io.js that referenced this pull request Dec 6, 2014
bnoordhuis added a commit to nodejs/node that referenced this pull request Dec 9, 2014
Based on the ad-hoc benchmark from nodejs/node-v0.x-archive#8638 plus an additional
benchmark for user:pass auth URLs.

PR-URL: #102
Reviewed-by: Chris Dickinson <christopher.s.dickinson@gmail.com>
trevnorris pushed a commit that referenced this pull request Dec 30, 2014
The url.parse() function now checks whether an escapable character is in
the URL before trying to escape it.

PR-URL: #8638
[trev.norris@gmail.com: Switch to use continue instead of if]
Signed-off-by: Trevor Norris <trev.norris@gmail.com>
@trevnorris

Thanks. Landed in 6a03fce.

@trevnorris trevnorris closed this Dec 30, 2014