
benchmark: use autocannon instead of wrk #7180

Closed

Conversation

@bzoz (Contributor) commented Jun 6, 2016

(this continues the work from #6949)

Checklist
  • documentation is changed or added
  • the commit message follows commit guidelines
Affected core subsystem(s)

benchmark

Description of change

Switches the HTTP benchmarks to use the autocannon npm module instead of wrk. This allows the benchmarks to be executed on all supported platforms, including Windows. Running the benchmarks with wrk (gist) and autocannon (gist) on a 2-core virtual machine gives very similar results.

Switches all HTTP benchmarks to use autocannon (npm module) instead of
wrk. This allows for running all HTTP benchmarks under Windows.
@nodejs-github-bot added the benchmark (Issues and PRs related to the benchmark subsystem) label on Jun 6, 2016
@mcollina (Member) commented Jun 6, 2016

Not sure if I get to vote on this, but LGTM.

One thing: we might want to specify the version range of autocannon in the npm install step. If we are going forward with this, I can release autocannon as 1.0.0 to ease maintenance (so we can say "install anything in the 1.x.x line").
Does that make sense?
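
For illustration, pinning could be as simple as installing within the 1.x range (a sketch; the exact range specifier and where the install happens are assumptions, not something settled in this thread):

$ npm install autocannon@1.x

or declaring "autocannon": "^1.0.0" in a package.json devDependencies entry.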

@mscdex added the http (Issues or PRs related to the http subsystem) label on Jun 6, 2016
@mscdex (Contributor) commented Jun 6, 2016

It seems like this could cause reliability issues, since it's node-based and we'd be benchmarking changes to node's http implementation. So if, for example, a performance regression slips in, that could affect benchmark results from version to version.

@mcollina (Member) commented Jun 6, 2016

@mscdex the way this PR is built, it does not leverage the current node build, but rather the "global" / installed version of node.

Benchmarks are always comparative anyway; they need to be taken on the same box, using the same tools. This PR does not change that, as it will use a third node build that stays the same between the two runs.

This also brings some benefits: we can add more benchmarks related to HTTP pipelining, something that is not currently supported by wrk.
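
For example, pipelined load could then be driven straight from the autocannon CLI along these lines (a sketch; the flag values are illustrative and the -p pipelining option is an assumption, not something shown in this PR):

$ autocannon -c 100 -p 10 -d 5 http://localhost:3000/hello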

@mscdex (Contributor) commented Jun 6, 2016

@mcollina Right, but that's exactly what I mean. Typically the global node version will change more often than, say, wrk, and node is more likely to change its http implementation than other tools. *shrug*

@mcollina (Member) commented Jun 6, 2016

@mscdex I do not think results are comparable if they are not taken under the very same conditions. This usually means taking a 30-60 minute break from your laptop, or using another one for that time period. The only way I can get consistent results is to a) shut down the Internet and b) close every other program I am running. From time to time, some crazy stuff starts up anyway, and then I do get different results.
And all of this is within a short time period.

I have been bitten by this enough times to know that any slight change matters. I have seen different numbers (+/- 10%) coming out just by plugging an external monitor (HDMI) into my MBP (try it, maybe it's just me).
I think that comparing two results taken at different points in time, under different conditions, suffers from the exact same problem you describe. I wouldn't compare such results anyway, because I would not remember the exact conditions under which the measurement was taken.

@rvagg (Member) commented Jun 8, 2016

@nodejs/benchmarking your call as far as I'm concerned

@mhdawson (Member) commented Jun 8, 2016

Will try to take a look sometime today

@mhdawson (Member) commented Jun 9, 2016

Overall I like using a tool which is more platform-independent, so I'm generally a +1 on the change.

I think that we probably want to call out the concern from @mscdex in the benchmark/README.md under ## Prerequisites.

What I'd suggest is that we explain that autocannon will use the node currently in the path when the benchmarks are run, and that if you want to compare results with previous runs you should make sure the node in the path does not change between those runs.
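
Wording for such a note might look roughly like this (a sketch only; the text actually added to benchmark/README.md is not quoted in this thread):

  The http benchmarks use the autocannon npm module, which runs on the node
  binary found first in your PATH rather than on the build being benchmarked.
  To compare results across runs, make sure the node in your PATH does not
  change between those runs.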

@mscdex (Contributor) commented Jun 9, 2016

I wasn't so much concerned about the version of node changing between (immediate) runs as I was about being able to reliably compare results across different (global) versions of node (e.g. running a benchmark with a global node v5.x, then upgrading to node v6.x and re-running the same benchmark and trying to compare the two results).

@mhdawson (Member) commented Jun 9, 2016

@mscdex I don't think I'm quite following. You will need to keep the environment consistent if you want to compare runs (whether separated by a short or long time), and I was suggesting that we call out that the global node version is one of the things you have to keep consistent.

If the concern is that the global version changes for other reasons, then I'd suggest that part of the benchmarking setup should be to have an install of node somewhere else and add it to the path so that it's used for the benchmark runs.

@mscdex (Contributor) commented Jun 10, 2016

@mhdawson For environments that are dedicated to running benchmarks, keeping consistent global node versions isn't a problem. On my own personal machine for example, where I run almost all of my benchmarks these days, I typically keep my global copy of node updated to the latest "stable"/"current" version.

Let's say I ran a particular benchmark back in January of this year and had whatever latest node v5.x installed globally at that point. Then when node v6.x was released, I upgraded my global copy to that and re-ran the same benchmark. I now run a greater risk of unreliability (IMHO) when I start comparing the two results, because node's http implementation could have changed dramatically (heck, this could even happen within a single release line of node). If I were to use a global copy of wrk instead of node for the benchmark, then I would be more confident in being able to compare my two benchmark results, because tools like wrk are not likely to change their http implementation as much as node does (at least at this point).

@mcollina (Member) commented Jun 10, 2016

@mscdex let me add some inline comments

Let's say I ran a particular benchmark back in January of this year and had whatever latest node v5.x installed globally at that point. Then when node v6.x was released, I upgraded my global copy to that and re-ran the same benchmark.

In that span of time, you will have installed several OS updates and updated various other software on your machine. The numbers will not be comparable anyway.

I now run a greater risk of unreliability (IMHO) when I start comparing the two results because node's http implementation could have changed dramatically (heck, this could even happen within a single release line of node).

Autocannon does not rely on node's HTTP layer, just on net and a bunch of npm modules. Some optimizations in streams or V8 could affect the numbers, but less so.

The point of a benchmarking tool is to drive the benchmarked server at 100% CPU. As long as it does that, there should be no relevant change in numbers.

These are the results taken on node v4 and node v6:

$ nvm use v6
Now using node v6.2.0 (npm v3.8.9)
$ autocannon -c 100 -d 5 http://localhost:3000/hello
Running 5s test @ http://localhost:3000/hello
100 connections

Stat         Avg     Stdev    Max
Latency (ms) 4.92    3.53     162
Req/Sec      18532.8 728.2    19167
Bytes/Sec    2.08 MB 84.44 kB 2.23 MB

93k requests in 5s, 10.29 MB read
$ nvm use v4
Now using node v4.4.2 (npm v2.15.0)
$ autocannon -c 100 -d 5 http://localhost:3000/hello
Running 5s test @ http://localhost:3000/hello
100 connections

Stat         Avg     Stdev     Max
Latency (ms) 4.92    4.51      217
Req/Sec      18529.6 952.33    19759
Bytes/Sec    2.05 MB 110.83 kB 2.23 MB

93k requests in 5s, 10.28 MB read
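
For reference, runs like the ones above only need a small target server; a minimal stand-in could look like this (illustrative only, this is not the server actually benchmarked in the thread):

// Minimal target server listening on the port used in the runs above.
const http = require('http');
http.createServer((req, res) => {
  res.end('hello world');
}).listen(3000);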

@bzoz (Contributor, Author) commented Jun 15, 2016

I've added a note about the node version to README.md, PTAL.

@mhdawson (Member) commented:

@mscdex in your scenario, instead of using the global version, you could use another copy of node, make sure it is first in your path, and then keep it around for when you want to compare to older results.

@mscdex (Contributor) commented Jun 15, 2016

@mhdawson Maintaining another, separate copy of node just for benchmarking isn't something I'd personally want to do.

@mcollina (Member) commented:

I think comparing benchmarks taken months apart is not a use case for the internal node benchmarking suite. The code of the benchmarks themselves might change between two versions of node anyway.

I think we need a "benchmarking suite" that does not reside in the node.js tree, and even that might be hard to compare between versions over time.

@mhdawson (Member) commented:

What we have at benchmarking.nodejs.org is a mix. Some of the benchmarks (e.g. acme air) are not in the node.js tree, while some run benchmarks which are in the tree (e.g. startup footprint).

I'm thinking we need broader input on this; maybe we should add it for CTC discussion?

@orangemocha (Contributor) commented:

@mscdex and others: How about adding the target Node (the one to be benchmarked) to the path so that you can use a single executable? Or we could make the benchmarks use the target Node by default, and let users override it with an environment variable when they want to use a separate version.

@mscdex (Contributor) commented Jun 16, 2016

@orangemocha As I mentioned, that would still involve me having to maintain a separate copy of node just for the purposes of benchmarking, which is not something I want to do.

@bzoz (Contributor, Author) commented Jun 17, 2016

@mscdex, how about we provide a way to use a user-defined installation of node? The user could set an environment variable with the path to the node to use for autocannon.

I could also add a script that would automatically download and set up this local node installation.
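
A rough sketch of how such an override could be wired up (the AUTOCANNON_NODE variable name and the CLI path are made up for illustration; this is not code from the PR):

const { spawn } = require('child_process');

// Hypothetical override: AUTOCANNON_NODE points at the node used to run the
// autocannon client; fall back to whatever node is first on the PATH.
const clientNode = process.env.AUTOCANNON_NODE || 'node';
spawn(clientNode,
      ['node_modules/.bin/autocannon', '-c', '100', '-d', '5',
       'http://localhost:3000/hello'],
      { stdio: 'inherit' });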

@mscdex (Contributor) commented Jun 17, 2016

@bzoz My problem isn't where node comes from, it's that node would be used as the http client.

@bzoz (Contributor, Author) commented Jun 20, 2016

Why? It would be a different node than the one being benchmarked, and its version could easily be kept constant.

@jbergstroem (Member) commented:

Although not in a position to affect this decision, I must say that I agree with @mscdex's concerns. You generally want a consistent benchmark platform, not something with moving parts or continual performance enhancements.

@sieroslawos commented:

@bzoz fair enough to me, although I get @jbergstroem's and @mscdex's point. Tough one...

@bzoz (Contributor, Author) commented Jun 21, 2016

@jbergstroem, @mscdex: I understand your concerns; it would be best if we could use wrk under Windows, but we can't.

Now, autocannon will use the node that is first on the PATH. You can easily ensure its version stays the same, either by putting a dedicated node version used for benchmarking first on the PATH before starting the http benchmarks, or by using nvm. This way you can have a consistent benchmark platform on all systems that node will ever support.
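
For example (the paths and version below are only illustrative):

$ # option 1: a dedicated client node kept just for driving benchmarks
$ export PATH=/opt/benchmark-node/bin:$PATH
$ # option 2: pin the client node with nvm before each benchmark session
$ nvm use 4.4.2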

@mscdex (Contributor) commented Jun 21, 2016

@bzoz Well, at the very least Windows can execute Linux binaries now, so that may be less of a problem?

I should also add that I am not tied to wrk specifically. If there is an equivalent, stable http benchmarking tool that is cross-platform, then I would be open to switching to it. However, I am still -1 on using anything node-based as the benchmarking http client.

@bzoz (Contributor, Author) commented Jun 21, 2016

@mscdex: It only works on Windows 10.

Besides that, I think using node as the benchmarker will keep things more consistent 😉. Microsoft can change their "Linux implementation" whenever they want, without notice.

@Fishrock123 (Contributor) commented:

Would it be more ideal to benchmark with multiple tools?

@bzoz (Contributor, Author) commented Jul 7, 2016

How about we do what @Fishrock123 suggested?

I can add support for autocannon alongside wrk. On Linux both tools could be used for benchmarking; on Windows we would use only autocannon. By default, when running benchmarks we would try to use both tools, and the benchmark would fail if neither tool could be found. Does this sound good?
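
A rough sketch of that selection logic (the detection approach here is illustrative only, not what was eventually merged in #8140):

// Illustrative only: probe which http benchmarkers are installed and fail
// if neither wrk nor autocannon can be spawned.
const { spawnSync } = require('child_process');

function isAvailable(cmd) {
  // spawnSync sets .error (e.g. ENOENT) when the binary cannot be started.
  return !spawnSync(cmd, ['--help']).error;
}

const benchmarkers = ['wrk', 'autocannon'].filter(isAvailable);
if (benchmarkers.length === 0)
  throw new Error('no http benchmarker found; install wrk or autocannon');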

@bzoz (Contributor, Author) commented Aug 17, 2016

Continued in #8140

@bzoz closed this Aug 17, 2016
Labels
benchmark (Issues and PRs related to the benchmark subsystem), http (Issues or PRs related to the http subsystem)
10 participants