diff --git a/benchmark/README.md b/benchmark/README.md index d1233470757f20..6fd9a97bdfb3bb 100644 --- a/benchmark/README.md +++ b/benchmark/README.md @@ -1,417 +1,246 @@ -# Node.js core benchmark - -This folder contains benchmarks to measure the performance of the Node.js APIs. - -## Table of Content - -* [Prerequisites](#prerequisites) -* [Running benchmarks](#running-benchmarks) - * [Running individual benchmarks](#running-individual-benchmarks) - * [Running all benchmarks](#running-all-benchmarks) - * [Comparing node versions](#comparing-node-versions) - * [Comparing parameters](#comparing-parameters) -* [Creating a benchmark](#creating-a-benchmark) - -## Prerequisites - -Most of the HTTP benchmarks require a benchmarker to be installed, this can be -either [`wrk`][wrk] or [`autocannon`][autocannon]. - -`Autocannon` is a Node script that can be installed using -`npm install -g autocannon`. It will use the Node executable that is in the -path, hence if you want to compare two HTTP benchmark runs make sure that the -Node version in the path is not altered. - -`wrk` may be available through your preferred package manager. If not, you can -easily build it [from source][wrk] via `make`. - -By default `wrk` will be used as benchmarker. If it is not available -`autocannon` will be used in it its place. When creating a HTTP benchmark you -can specify which benchmarker should be used. You can force a specific -benchmarker to be used by providing it as an argument, e. g.: - -`node benchmark/run.js --set benchmarker=autocannon http` - -`node benchmark/http/simple.js benchmarker=autocannon` - -Basic Unix tools are required for some benchmarks. -[Git for Windows][git-for-windows] includes Git Bash and the necessary tools, -which need to be included in the global Windows `PATH`. - -To analyze the results `R` should be installed. Check you package manager or -download it from https://www.r-project.org/. - -The R packages `ggplot2` and `plyr` are also used and can be installed using -the R REPL. - -```R -$ R -install.packages("ggplot2") -install.packages("plyr") -``` - -### CRAN Mirror Issues -In the event you get a message that you need to select a CRAN mirror first. - -You can specify a mirror by adding in the repo parameter. - -If we used the "http://cran.us.r-project.org" mirror, it could look something -like this: - -```R -install.packages("ggplot2", repo="http://cran.us.r-project.org") -``` - -Of course, use the mirror that suits your location. -A list of mirrors is [located here](https://cran.r-project.org/mirrors.html). - -## Running benchmarks - -### Running individual benchmarks - -This can be useful for debugging a benchmark or doing a quick performance -measure. But it does not provide the statistical information to make any -conclusions about the performance. - -Individual benchmarks can be executed by simply executing the benchmark script -with node. 
- -```console -$ node benchmark/buffers/buffer-tostring.js - -buffers/buffer-tostring.js n=10000000 len=0 arg=true: 62710590.393305704 -buffers/buffer-tostring.js n=10000000 len=1 arg=true: 9178624.591787899 -buffers/buffer-tostring.js n=10000000 len=64 arg=true: 7658962.8891432695 -buffers/buffer-tostring.js n=10000000 len=1024 arg=true: 4136904.4060201733 -buffers/buffer-tostring.js n=10000000 len=0 arg=false: 22974354.231509723 -buffers/buffer-tostring.js n=10000000 len=1 arg=false: 11485945.656765845 -buffers/buffer-tostring.js n=10000000 len=64 arg=false: 8718280.70650129 -buffers/buffer-tostring.js n=10000000 len=1024 arg=false: 4103857.0726124765 -``` - -Each line represents a single benchmark with parameters specified as -`${variable}=${value}`. Each configuration combination is executed in a separate -process. This ensures that benchmark results aren't affected by the execution -order due to v8 optimizations. **The last number is the rate of operations -measured in ops/sec (higher is better).** - -Furthermore you can specify a subset of the configurations, by setting them in -the process arguments: - -```console -$ node benchmark/buffers/buffer-tostring.js len=1024 - -buffers/buffer-tostring.js n=10000000 len=1024 arg=true: 3498295.68561504 -buffers/buffer-tostring.js n=10000000 len=1024 arg=false: 3783071.1678948295 -``` - -### Running all benchmarks - -Similar to running individual benchmarks, a group of benchmarks can be executed -by using the `run.js` tool. Again this does not provide the statistical -information to make any conclusions. - -```console -$ node benchmark/run.js arrays - -arrays/var-int.js -arrays/var-int.js n=25 type=Array: 71.90148040747789 -arrays/var-int.js n=25 type=Buffer: 92.89648382795582 -... - -arrays/zero-float.js -arrays/zero-float.js n=25 type=Array: 75.46208316171496 -arrays/zero-float.js n=25 type=Buffer: 101.62785630273159 -... - -arrays/zero-int.js -arrays/zero-int.js n=25 type=Array: 72.31023859816062 -arrays/zero-int.js n=25 type=Buffer: 90.49906662339653 -... -``` - -It is possible to execute more groups by adding extra process arguments. -```console -$ node benchmark/run.js arrays buffers -``` - -### Comparing node versions - -To compare the effect of a new node version use the `compare.js` tool. This -will run each benchmark multiple times, making it possible to calculate -statistics on the performance measures. - -As an example on how to check for a possible performance improvement, the -[#5134](https://github.com/nodejs/node/pull/5134) pull request will be used as -an example. This pull request _claims_ to improve the performance of the -`string_decoder` module. - -First build two versions of node, one from the master branch (here called -`./node-master`) and another with the pull request applied (here called -`./node-pr-5135`). - -The `compare.js` tool will then produce a csv file with the benchmark results. - -```console -$ node benchmark/compare.js --old ./node-master --new ./node-pr-5134 string_decoder > compare-pr-5134.csv -``` - -For analysing the benchmark results use the `compare.R` tool. 
- -```console -$ cat compare-pr-5134.csv | Rscript benchmark/compare.R - - improvement confidence p.value -string_decoder/string-decoder.js n=250000 chunk=1024 inlen=1024 encoding=ascii 12.46 % *** 1.165345e-04 -string_decoder/string-decoder.js n=250000 chunk=1024 inlen=1024 encoding=base64-ascii 24.70 % *** 1.820615e-15 -string_decoder/string-decoder.js n=250000 chunk=1024 inlen=1024 encoding=base64-utf8 23.60 % *** 2.105625e-12 -string_decoder/string-decoder.js n=250000 chunk=1024 inlen=1024 encoding=utf8 14.04 % *** 1.291105e-07 -string_decoder/string-decoder.js n=250000 chunk=1024 inlen=128 encoding=ascii 6.70 % * 2.928003e-02 -... -``` - -In the output, _improvement_ is the relative improvement of the new version, -hopefully this is positive. _confidence_ tells if there is enough -statistical evidence to validate the _improvement_. If there is enough evidence -then there will be at least one star (`*`), more stars is just better. **However -if there are no stars, then you shouldn't make any conclusions based on the -_improvement_.** Sometimes this is fine, for example if you are expecting there -to be no improvements, then there shouldn't be any stars. - -**A word of caution:** Statistics is not a foolproof tool. If a benchmark shows -a statistical significant difference, there is a 5% risk that this -difference doesn't actually exist. For a single benchmark this is not an -issue. But when considering 20 benchmarks it's normal that one of them -will show significance, when it shouldn't. A possible solution is to instead -consider at least two stars (`**`) as the threshold, in that case the risk -is 1%. If three stars (`***`) is considered the risk is 0.1%. However this -may require more runs to obtain (can be set with `--runs`). - -_For the statistically minded, the R script performs an [independent/unpaired -2-group t-test][t-test], with the null hypothesis that the performance is the -same for both versions. The confidence field will show a star if the p-value -is less than `0.05`._ - -The `compare.R` tool can also produce a box plot by using the `--plot filename` -option. In this case there are 48 different benchmark combinations, thus you -may want to filter the csv file. This can be done while benchmarking using the -`--set` parameter (e.g. `--set encoding=ascii`) or by filtering results -afterwards using tools such as `sed` or `grep`. In the `sed` case be sure to -keep the first line since that contains the header information. - -```console -$ cat compare-pr-5134.csv | sed '1p;/encoding=ascii/!d' | Rscript benchmark/compare.R --plot compare-plot.png - - improvement confidence p.value -string_decoder/string-decoder.js n=250000 chunk=1024 inlen=1024 encoding=ascii 12.46 % *** 1.165345e-04 -string_decoder/string-decoder.js n=250000 chunk=1024 inlen=128 encoding=ascii 6.70 % * 2.928003e-02 -string_decoder/string-decoder.js n=250000 chunk=1024 inlen=32 encoding=ascii 7.47 % *** 5.780583e-04 -string_decoder/string-decoder.js n=250000 chunk=16 inlen=1024 encoding=ascii 8.94 % *** 1.788579e-04 -string_decoder/string-decoder.js n=250000 chunk=16 inlen=128 encoding=ascii 10.54 % *** 4.016172e-05 -... -``` - -![compare tool boxplot](doc_img/compare-boxplot.png) - -### Comparing parameters - -It can be useful to compare the performance for different parameters, for -example to analyze the time complexity. - -To do this use the `scatter.js` tool, this will run a benchmark multiple times -and generate a csv with the results. 
- -```console -$ node benchmark/scatter.js benchmark/string_decoder/string-decoder.js > scatter.csv -``` - -After generating the csv, a comparison table can be created using the -`scatter.R` tool. Even more useful it creates an actual scatter plot when using -the `--plot filename` option. - -```console -$ cat scatter.csv | Rscript benchmark/scatter.R --xaxis chunk --category encoding --plot scatter-plot.png --log - -aggregating variable: inlen - -chunk encoding mean confidence.interval - 16 ascii 1111933.3 221502.48 - 16 base64-ascii 167508.4 33116.09 - 16 base64-utf8 122666.6 25037.65 - 16 utf8 783254.8 159601.79 - 64 ascii 2623462.9 399791.36 - 64 base64-ascii 462008.3 85369.45 - 64 base64-utf8 420108.4 85612.05 - 64 utf8 1358327.5 235152.03 - 256 ascii 3730343.4 371530.47 - 256 base64-ascii 663281.2 80302.73 - 256 base64-utf8 632911.7 81393.07 - 256 utf8 1554216.9 236066.53 - 1024 ascii 4399282.0 186436.46 - 1024 base64-ascii 730426.6 63806.12 - 1024 base64-utf8 680954.3 68076.33 - 1024 utf8 1554832.5 237532.07 -``` - -Because the scatter plot can only show two variables (in this case _chunk_ and -_encoding_) the rest is aggregated. Sometimes aggregating is a problem, this -can be solved by filtering. This can be done while benchmarking using the -`--set` parameter (e.g. `--set encoding=ascii`) or by filtering results -afterwards using tools such as `sed` or `grep`. In the `sed` case be -sure to keep the first line since that contains the header information. - -```console -$ cat scatter.csv | sed -E '1p;/([^,]+, ){3}128,/!d' | Rscript benchmark/scatter.R --xaxis chunk --category encoding --plot scatter-plot.png --log - -chunk encoding mean confidence.interval - 16 ascii 701285.96 21233.982 - 16 base64-ascii 107719.07 3339.439 - 16 base64-utf8 72966.95 2438.448 - 16 utf8 475340.84 17685.450 - 64 ascii 2554105.08 87067.132 - 64 base64-ascii 330120.32 8551.707 - 64 base64-utf8 249693.19 8990.493 - 64 utf8 1128671.90 48433.862 - 256 ascii 4841070.04 181620.768 - 256 base64-ascii 849545.53 29931.656 - 256 base64-utf8 809629.89 33773.496 - 256 utf8 1489525.15 49616.334 - 1024 ascii 4931512.12 165402.805 - 1024 base64-ascii 863933.22 27766.982 - 1024 base64-utf8 827093.97 24376.522 - 1024 utf8 1487176.43 50128.721 -``` - -![compare tool boxplot](doc_img/scatter-plot.png) - -## Creating a benchmark - -All benchmarks use the `require('../common.js')` module. This contains the -`createBenchmark(main, configs[, options])` method which will setup your -benchmark. - -The arguments of `createBenchmark` are: - -* `main` {Function} The benchmark function, - where the code running operations and controlling timers should go -* `configs` {Object} The benchmark parameters. `createBenchmark` will run all - possible combinations of these parameters, unless specified otherwise. - Each configuration is a property with an array of possible values. - Note that the configuration values can only be strings or numbers. -* `options` {Object} The benchmark options. At the moment only the `flags` - option for specifying command line flags is supported. - -`createBenchmark` returns a `bench` object, which is used for timing -the runtime of the benchmark. Run `bench.start()` after the initialization -and `bench.end(n)` when the benchmark is done. `n` is the number of operations -you performed in the benchmark. - -The benchmark script will be run twice: - -The first pass will configure the benchmark with the combination of -parameters specified in `configs`, and WILL NOT run the `main` function. 
-In this pass, no flags except the ones directly passed via commands -that you run the benchmarks with will be used. - -In the second pass, the `main` function will be run, and the process -will be launched with: - -* The flags you've passed into `createBenchmark` (the third argument) -* The flags in the command that you run this benchmark with - -Beware that any code outside the `main` function will be run twice -in different processes. This could be troublesome if the code -outside the `main` function has side effects. In general, prefer putting -the code inside the `main` function if it's more than just declaration. - -```js -'use strict'; -const common = require('../common.js'); -const SlowBuffer = require('buffer').SlowBuffer; - -const configs = { - // Number of operations, specified here so they show up in the report. - // Most benchmarks just use one value for all runs. - n: [1024], - type: ['fast', 'slow'], // Custom configurations - size: [16, 128, 1024] // Custom configurations -}; - -const options = { - // Add --expose-internals if you want to require internal modules in main - flags: ['--zero-fill-buffers'] -}; - -// main and configs are required, options is optional. -const bench = common.createBenchmark(main, configs, options); - -// Note that any code outside main will be run twice, -// in different processes, with different command line arguments. - -function main(conf) { - // You will only get the flags that you have passed to createBenchmark - // earlier when main is run. If you want to benchmark the internal modules, - // require them here. For example: - // const URL = require('internal/url').URL - - // Start the timer - bench.start(); - - // Do operations here - const BufferConstructor = conf.type === 'fast' ? Buffer : SlowBuffer; - - for (let i = 0; i < conf.n; i++) { - new BufferConstructor(conf.size); - } - - // End the timer, pass in the number of operations - bench.end(conf.n); -} -``` - -## Creating HTTP benchmark - -The `bench` object returned by `createBenchmark` implements -`http(options, callback)` method. It can be used to run external tool to -benchmark HTTP servers. - -```js -'use strict'; - -const common = require('../common.js'); - -const bench = common.createBenchmark(main, { - kb: [64, 128, 256, 1024], - connections: [100, 500] -}); - -function main(conf) { - const http = require('http'); - const len = conf.kb * 1024; - const chunk = Buffer.alloc(len, 'x'); - const server = http.createServer(function(req, res) { - res.end(chunk); - }); - - server.listen(common.PORT, function() { - bench.http({ - connections: conf.connections, - }, function() { - server.close(); - }); - }); -} -``` - -Supported options keys are: -* `port` - defaults to `common.PORT` -* `path` - defaults to `/` -* `connections` - number of concurrent connections to use, defaults to 100 -* `duration` - duration of the benchmark in seconds, defaults to 10 -* `benchmarker` - benchmarker to use, defaults to -`common.default_http_benchmarker` - -[autocannon]: https://github.com/mcollina/autocannon -[wrk]: https://github.com/wg/wrk -[t-test]: https://en.wikipedia.org/wiki/Student%27s_t-test#Equal_or_unequal_sample_sizes.2C_unequal_variances -[git-for-windows]: http://git-scm.com/download/win +# Node.js Core Benchmarks + +This folder contains code and data used to measure performance +of different Node.js implementations and different ways of +writing JavaScript run by the built-in JavaScript engine. 
+ +For a detailed guide on how to write and run benchmarks in this +directory, see [the guide on benchmarks](../doc/guides/writing-and-running-benchmarks.md). + +## Table of Contents + +* [Benchmark directories](#benchmark-directories) +* [Common API](#common-api) + +## Benchmark Directories + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
DirectoryPurpose
arrays + Benchmarks for various operations on array-like objects, + including Array, Buffer, and typed arrays. +
assert + Benchmarks for the assert subsystem. +
buffers + Benchmarks for the buffer subsystem. +
child_process + Benchmarks for the child_process subsystem. +
crypto + Benchmarks for the crypto subsystem. +
dgram + Benchmarks for the dgram subsystem. +
domain + Benchmarks for the domain subsystem. +
es + Benchmarks for various new ECMAScript features and their + pre-ES2015 counterparts. +
events + Benchmarks for the events subsystem. +
fixtures + Benchmark fixtures used in various benchmarks throughout + the benchmark suite. +
fs + Benchmarks for the fs subsystem. +
http + Benchmarks for the http subsystem. +
misc + Miscellaneous benchmarks and benchmarks for shared + internal modules. +
module + Benchmarks for the module subsystem. +
net + Benchmarks for the net subsystem. +
path + Benchmarks for the path subsystem. +
process + Benchmarks for the process subsystem. +
querystring + Benchmarks for the querystring subsystem. +
streams + Benchmarks for the streams subsystem. +
string_decoder + Benchmarks for the string_decoder subsystem. +
timers + Benchmarks for the timers subsystem, including + setTimeout, setInterval, etc. +
tls + Benchmarks for the tls subsystem. +
url + Benchmarks for the url subsystem, including the legacy + url implementation and the WHATWG URL implementation. +
util + Benchmarks for the util subsystem. +
vm + Benchmarks for the vm subsystem. +
+ +### Other Top-level files + +The top-level files include common dependencies of the benchmarks +and the tools for launching benchmarks and visualizing their output. +The actual benchmark scripts should be placed in their corresponding +directories. + +* `_benchmark_progress.js`: implements the progress bar displayed + when running `compare.js` +* `_cli.js`: parses the command line arguments passed to `compare.js`, + `run.js` and `scatter.js` +* `_cli.R`: parses the command line arguments passed to `compare.R` +* `_http-benchmarkers.js`: selects and runs external tools for benchmarking + the `http` subsystem. +* `common.js`: see [Common API](#common-api). +* `compare.js`: command line tool for comparing performance between different + Node.js binaries. +* `compare.R`: R script for statistically analyzing the output of + `compare.js` +* `run.js`: command line tool for running individual benchmark suite(s). +* `scatter.js`: command line tool for comparing the performance + between different parameters in benchmark configurations, + for example to analyze the time complexity. +* `scatter.R`: R script for visualizing the output of `scatter.js` with + scatter plots. + +## Common API + +The common.js module is used by benchmarks for consistency across repeated +tasks. It has a number of helpful functions and properties to help with +writing benchmarks. + +### createBenchmark(fn, configs[, options]) + +See [the guide on writing benchmarks](../doc/guides/writing-and-running-benchmarks.md#basics-of-a-benchmark). + +### default\_http\_benchmarker + +The default benchmarker used to run HTTP benchmarks. +See [the guide on writing HTTP benchmarks](../doc/guides/writing-and-running-benchmarks.md#creating-an-http-benchmark). + + +### PORT + +The default port used to run HTTP benchmarks. +See [the guide on writing HTTP benchmarks](../doc/guides/writing-and-running-benchmarks.md#creating-an-http-benchmark). + +### sendResult(data) + +Used in special benchmarks that can't use `createBenchmark` and the object +it returns to accomplish what they need. This function reports timing +data to the parent process (usually created by running `compare.js`, `run.js` or +`scatter.js`). + +### v8ForceOptimization(method[, ...args]) + +Force V8 to mark the `method` for optimization with the native function +`%OptimizeFunctionOnNextCall()` and return the optimization status +after that. + +It can be used to prevent the benchmark from getting disrupted by the optimizer +kicking in halfway through. However, this could result in a less effective +optimization. In general, only use it if you know what it actually does. 
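+
+For illustration, here is a hedged sketch of how a benchmark might use it;
+`parseEntry` and its sample argument are made up for this example:
+
+```js
+'use strict';
+const common = require('../common.js');
+
+const bench = common.createBenchmark(main, { n: [1e6] });
+
+// Made-up function under test, used only for this sketch.
+function parseEntry(s) {
+  return s.split('=');
+}
+
+function main(conf) {
+  // Warm up and force optimization of parseEntry before timing starts,
+  // so the optimizer does not kick in halfway through the measurement.
+  common.v8ForceOptimization(parseEntry, 'key=value');
+
+  bench.start();
+  for (let i = 0; i < conf.n; i++)
+    parseEntry('key=value');
+  bench.end(conf.n);
+}
+```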
diff --git a/benchmark/doc_img/compare-boxplot.png b/doc/guides/doc_img/compare-boxplot.png
similarity index 100%
rename from benchmark/doc_img/compare-boxplot.png
rename to doc/guides/doc_img/compare-boxplot.png
diff --git a/benchmark/doc_img/scatter-plot.png b/doc/guides/doc_img/scatter-plot.png
similarity index 100%
rename from benchmark/doc_img/scatter-plot.png
rename to doc/guides/doc_img/scatter-plot.png
diff --git a/doc/guides/writing-and-running-benchmarks.md b/doc/guides/writing-and-running-benchmarks.md
new file mode 100644
index 00000000000000..a20f321b7c2408
--- /dev/null
+++ b/doc/guides/writing-and-running-benchmarks.md
@@ -0,0 +1,427 @@
+# How to Write and Run Benchmarks in Node.js Core
+
+## Table of Contents
+
+* [Prerequisites](#prerequisites)
+  * [HTTP Benchmark Requirements](#http-benchmark-requirements)
+  * [Benchmark Analysis Requirements](#benchmark-analysis-requirements)
+* [Running benchmarks](#running-benchmarks)
+  * [Running individual benchmarks](#running-individual-benchmarks)
+  * [Running all benchmarks](#running-all-benchmarks)
+  * [Comparing Node.js versions](#comparing-nodejs-versions)
+  * [Comparing parameters](#comparing-parameters)
+* [Creating a benchmark](#creating-a-benchmark)
+  * [Basics of a benchmark](#basics-of-a-benchmark)
+  * [Creating an HTTP benchmark](#creating-an-http-benchmark)
+
+## Prerequisites
+
+Basic Unix tools are required for some benchmarks.
+[Git for Windows][git-for-windows] includes Git Bash and the necessary tools,
+which need to be included in the global Windows `PATH`.
+
+### HTTP Benchmark Requirements
+
+Most of the HTTP benchmarks require a benchmarker to be installed; this can be
+either [`wrk`][wrk] or [`autocannon`][autocannon].
+
+`Autocannon` is a Node.js script that can be installed using
+`npm install -g autocannon`. It will use the Node.js executable that is in the
+path, so if you want to compare two HTTP benchmark runs, make sure that the
+Node.js version in the path is not altered.
+
+`wrk` may be available through your preferred package manager. If not, you can
+easily build it [from source][wrk] via `make`.
+
+By default, `wrk` will be used as the benchmarker. If it is not available,
+`autocannon` will be used in its place. When creating an HTTP benchmark you
+can force a specific benchmarker to be used by providing it as an argument,
+e.g.:
+
+`node benchmark/run.js --set benchmarker=autocannon http`
+
+`node benchmark/http/simple.js benchmarker=autocannon`
+
+### Benchmark Analysis Requirements
+
+To analyze the results, `R` should be installed. Check your package manager or
+download it from https://www.r-project.org/.
+
+The R packages `ggplot2` and `plyr` are also used and can be installed using
+the R REPL.
+
+```R
+$ R
+install.packages("ggplot2")
+install.packages("plyr")
+```
+
+In the event you get a message that you need to select a CRAN mirror first,
+you can specify a mirror by adding the `repo` parameter.
+
+If we used the "http://cran.us.r-project.org" mirror, it could look something
+like this:
+
+```R
+install.packages("ggplot2", repo="http://cran.us.r-project.org")
+```
+
+Of course, use the mirror that suits your location.
+A list of mirrors is [located here](https://cran.r-project.org/mirrors.html).
+
+## Running benchmarks
+
+### Running individual benchmarks
+
+This can be useful for debugging a benchmark or doing a quick performance
+measure.
But it does not provide the statistical information to make any +conclusions about the performance. + +Individual benchmarks can be executed by simply executing the benchmark script +with node. + +```console +$ node benchmark/buffers/buffer-tostring.js + +buffers/buffer-tostring.js n=10000000 len=0 arg=true: 62710590.393305704 +buffers/buffer-tostring.js n=10000000 len=1 arg=true: 9178624.591787899 +buffers/buffer-tostring.js n=10000000 len=64 arg=true: 7658962.8891432695 +buffers/buffer-tostring.js n=10000000 len=1024 arg=true: 4136904.4060201733 +buffers/buffer-tostring.js n=10000000 len=0 arg=false: 22974354.231509723 +buffers/buffer-tostring.js n=10000000 len=1 arg=false: 11485945.656765845 +buffers/buffer-tostring.js n=10000000 len=64 arg=false: 8718280.70650129 +buffers/buffer-tostring.js n=10000000 len=1024 arg=false: 4103857.0726124765 +``` + +Each line represents a single benchmark with parameters specified as +`${variable}=${value}`. Each configuration combination is executed in a separate +process. This ensures that benchmark results aren't affected by the execution +order due to v8 optimizations. **The last number is the rate of operations +measured in ops/sec (higher is better).** + +Furthermore you can specify a subset of the configurations, by setting them in +the process arguments: + +```console +$ node benchmark/buffers/buffer-tostring.js len=1024 + +buffers/buffer-tostring.js n=10000000 len=1024 arg=true: 3498295.68561504 +buffers/buffer-tostring.js n=10000000 len=1024 arg=false: 3783071.1678948295 +``` + +### Running all benchmarks + +Similar to running individual benchmarks, a group of benchmarks can be executed +by using the `run.js` tool. To see how to use this script, +run `node benchmark/run.js`. Again this does not provide the statistical +information to make any conclusions. + +```console +$ node benchmark/run.js arrays + +arrays/var-int.js +arrays/var-int.js n=25 type=Array: 71.90148040747789 +arrays/var-int.js n=25 type=Buffer: 92.89648382795582 +... + +arrays/zero-float.js +arrays/zero-float.js n=25 type=Array: 75.46208316171496 +arrays/zero-float.js n=25 type=Buffer: 101.62785630273159 +... + +arrays/zero-int.js +arrays/zero-int.js n=25 type=Array: 72.31023859816062 +arrays/zero-int.js n=25 type=Buffer: 90.49906662339653 +... +``` + +It is possible to execute more groups by adding extra process arguments. +```console +$ node benchmark/run.js arrays buffers +``` + +### Comparing Node.js versions + +To compare the effect of a new Node.js version use the `compare.js` tool. This +will run each benchmark multiple times, making it possible to calculate +statistics on the performance measures. To see how to use this script, +run `node benchmark/compare.js`. + +As an example on how to check for a possible performance improvement, the +[#5134](https://github.com/nodejs/node/pull/5134) pull request will be used as +an example. This pull request _claims_ to improve the performance of the +`string_decoder` module. + +First build two versions of Node.js, one from the master branch (here called +`./node-master`) and another with the pull request applied (here called +`./node-pr-5135`). + +The `compare.js` tool will then produce a csv file with the benchmark results. + +```console +$ node benchmark/compare.js --old ./node-master --new ./node-pr-5134 string_decoder > compare-pr-5134.csv +``` + +For analysing the benchmark results use the `compare.R` tool. 
+ +```console +$ cat compare-pr-5134.csv | Rscript benchmark/compare.R + + improvement confidence p.value +string_decoder/string-decoder.js n=250000 chunk=1024 inlen=1024 encoding=ascii 12.46 % *** 1.165345e-04 +string_decoder/string-decoder.js n=250000 chunk=1024 inlen=1024 encoding=base64-ascii 24.70 % *** 1.820615e-15 +string_decoder/string-decoder.js n=250000 chunk=1024 inlen=1024 encoding=base64-utf8 23.60 % *** 2.105625e-12 +string_decoder/string-decoder.js n=250000 chunk=1024 inlen=1024 encoding=utf8 14.04 % *** 1.291105e-07 +string_decoder/string-decoder.js n=250000 chunk=1024 inlen=128 encoding=ascii 6.70 % * 2.928003e-02 +... +``` + +In the output, _improvement_ is the relative improvement of the new version, +hopefully this is positive. _confidence_ tells if there is enough +statistical evidence to validate the _improvement_. If there is enough evidence +then there will be at least one star (`*`), more stars is just better. **However +if there are no stars, then you shouldn't make any conclusions based on the +_improvement_.** Sometimes this is fine, for example if you are expecting there +to be no improvements, then there shouldn't be any stars. + +**A word of caution:** Statistics is not a foolproof tool. If a benchmark shows +a statistical significant difference, there is a 5% risk that this +difference doesn't actually exist. For a single benchmark this is not an +issue. But when considering 20 benchmarks it's normal that one of them +will show significance, when it shouldn't. A possible solution is to instead +consider at least two stars (`**`) as the threshold, in that case the risk +is 1%. If three stars (`***`) is considered the risk is 0.1%. However this +may require more runs to obtain (can be set with `--runs`). + +_For the statistically minded, the R script performs an [independent/unpaired +2-group t-test][t-test], with the null hypothesis that the performance is the +same for both versions. The confidence field will show a star if the p-value +is less than `0.05`._ + +The `compare.R` tool can also produce a box plot by using the `--plot filename` +option. In this case there are 48 different benchmark combinations, thus you +may want to filter the csv file. This can be done while benchmarking using the +`--set` parameter (e.g. `--set encoding=ascii`) or by filtering results +afterwards using tools such as `sed` or `grep`. In the `sed` case be sure to +keep the first line since that contains the header information. + +```console +$ cat compare-pr-5134.csv | sed '1p;/encoding=ascii/!d' | Rscript benchmark/compare.R --plot compare-plot.png + + improvement confidence p.value +string_decoder/string-decoder.js n=250000 chunk=1024 inlen=1024 encoding=ascii 12.46 % *** 1.165345e-04 +string_decoder/string-decoder.js n=250000 chunk=1024 inlen=128 encoding=ascii 6.70 % * 2.928003e-02 +string_decoder/string-decoder.js n=250000 chunk=1024 inlen=32 encoding=ascii 7.47 % *** 5.780583e-04 +string_decoder/string-decoder.js n=250000 chunk=16 inlen=1024 encoding=ascii 8.94 % *** 1.788579e-04 +string_decoder/string-decoder.js n=250000 chunk=16 inlen=128 encoding=ascii 10.54 % *** 4.016172e-05 +... +``` + +![compare tool boxplot](doc_img/compare-boxplot.png) + +### Comparing parameters + +It can be useful to compare the performance for different parameters, for +example to analyze the time complexity. + +To do this use the `scatter.js` tool, this will run a benchmark multiple times +and generate a csv with the results. To see how to use this script, +run `node benchmark/scatter.js`. 
+ +```console +$ node benchmark/scatter.js benchmark/string_decoder/string-decoder.js > scatter.csv +``` + +After generating the csv, a comparison table can be created using the +`scatter.R` tool. Even more useful it creates an actual scatter plot when using +the `--plot filename` option. + +```console +$ cat scatter.csv | Rscript benchmark/scatter.R --xaxis chunk --category encoding --plot scatter-plot.png --log + +aggregating variable: inlen + +chunk encoding mean confidence.interval + 16 ascii 1111933.3 221502.48 + 16 base64-ascii 167508.4 33116.09 + 16 base64-utf8 122666.6 25037.65 + 16 utf8 783254.8 159601.79 + 64 ascii 2623462.9 399791.36 + 64 base64-ascii 462008.3 85369.45 + 64 base64-utf8 420108.4 85612.05 + 64 utf8 1358327.5 235152.03 + 256 ascii 3730343.4 371530.47 + 256 base64-ascii 663281.2 80302.73 + 256 base64-utf8 632911.7 81393.07 + 256 utf8 1554216.9 236066.53 + 1024 ascii 4399282.0 186436.46 + 1024 base64-ascii 730426.6 63806.12 + 1024 base64-utf8 680954.3 68076.33 + 1024 utf8 1554832.5 237532.07 +``` + +Because the scatter plot can only show two variables (in this case _chunk_ and +_encoding_) the rest is aggregated. Sometimes aggregating is a problem, this +can be solved by filtering. This can be done while benchmarking using the +`--set` parameter (e.g. `--set encoding=ascii`) or by filtering results +afterwards using tools such as `sed` or `grep`. In the `sed` case be +sure to keep the first line since that contains the header information. + +```console +$ cat scatter.csv | sed -E '1p;/([^,]+, ){3}128,/!d' | Rscript benchmark/scatter.R --xaxis chunk --category encoding --plot scatter-plot.png --log + +chunk encoding mean confidence.interval + 16 ascii 701285.96 21233.982 + 16 base64-ascii 107719.07 3339.439 + 16 base64-utf8 72966.95 2438.448 + 16 utf8 475340.84 17685.450 + 64 ascii 2554105.08 87067.132 + 64 base64-ascii 330120.32 8551.707 + 64 base64-utf8 249693.19 8990.493 + 64 utf8 1128671.90 48433.862 + 256 ascii 4841070.04 181620.768 + 256 base64-ascii 849545.53 29931.656 + 256 base64-utf8 809629.89 33773.496 + 256 utf8 1489525.15 49616.334 + 1024 ascii 4931512.12 165402.805 + 1024 base64-ascii 863933.22 27766.982 + 1024 base64-utf8 827093.97 24376.522 + 1024 utf8 1487176.43 50128.721 +``` + +![compare tool boxplot](doc_img/scatter-plot.png) + +## Creating a benchmark + +### Basics of a benchmark + +All benchmarks use the `require('../common.js')` module. This contains the +`createBenchmark(main, configs[, options])` method which will setup your +benchmark. + +The arguments of `createBenchmark` are: + +* `main` {Function} The benchmark function, + where the code running operations and controlling timers should go +* `configs` {Object} The benchmark parameters. `createBenchmark` will run all + possible combinations of these parameters, unless specified otherwise. + Each configuration is a property with an array of possible values. + Note that the configuration values can only be strings or numbers. +* `options` {Object} The benchmark options. At the moment only the `flags` + option for specifying command line flags is supported. + +`createBenchmark` returns a `bench` object, which is used for timing +the runtime of the benchmark. Run `bench.start()` after the initialization +and `bench.end(n)` when the benchmark is done. `n` is the number of operations +you performed in the benchmark. 
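+
+For illustration, here is a minimal sketch of that timing pattern (a full,
+annotated example follows below); the `Math.sqrt` call is only a stand-in for
+the operation being measured:
+
+```js
+'use strict';
+const common = require('../common.js');
+
+const bench = common.createBenchmark(main, {
+  n: [1e6]  // number of iterations, reported with the results
+});
+
+function main(conf) {
+  bench.start();  // start the timer after any setup work
+  for (let i = 0; i < conf.n; i++) {
+    Math.sqrt(i);  // stand-in for the operation under test
+  }
+  bench.end(conf.n);  // stop the timer and report the operation count
+}
+```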
+ +The benchmark script will be run twice: + +The first pass will configure the benchmark with the combination of +parameters specified in `configs`, and WILL NOT run the `main` function. +In this pass, no flags except the ones directly passed via commands +that you run the benchmarks with will be used. + +In the second pass, the `main` function will be run, and the process +will be launched with: + +* The flags you've passed into `createBenchmark` (the third argument) +* The flags in the command that you run this benchmark with + +Beware that any code outside the `main` function will be run twice +in different processes. This could be troublesome if the code +outside the `main` function has side effects. In general, prefer putting +the code inside the `main` function if it's more than just declaration. + +```js +'use strict'; +const common = require('../common.js'); +const SlowBuffer = require('buffer').SlowBuffer; + +const configs = { + // Number of operations, specified here so they show up in the report. + // Most benchmarks just use one value for all runs. + n: [1024], + type: ['fast', 'slow'], // Custom configurations + size: [16, 128, 1024] // Custom configurations +}; + +const options = { + // Add --expose-internals if you want to require internal modules in main + flags: ['--zero-fill-buffers'] +}; + +// main and configs are required, options is optional. +const bench = common.createBenchmark(main, configs, options); + +// Note that any code outside main will be run twice, +// in different processes, with different command line arguments. + +function main(conf) { + // You will only get the flags that you have passed to createBenchmark + // earlier when main is run. If you want to benchmark the internal modules, + // require them here. For example: + // const URL = require('internal/url').URL + + // Start the timer + bench.start(); + + // Do operations here + const BufferConstructor = conf.type === 'fast' ? Buffer : SlowBuffer; + + for (let i = 0; i < conf.n; i++) { + new BufferConstructor(conf.size); + } + + // End the timer, pass in the number of operations + bench.end(conf.n); +} +``` + +### Creating an HTTP benchmark + +The `bench` object returned by `createBenchmark` implements +`http(options, callback)` method. It can be used to run external tool to +benchmark HTTP servers. 
+ +```js +'use strict'; + +const common = require('../common.js'); + +const bench = common.createBenchmark(main, { + kb: [64, 128, 256, 1024], + connections: [100, 500] +}); + +function main(conf) { + const http = require('http'); + const len = conf.kb * 1024; + const chunk = Buffer.alloc(len, 'x'); + const server = http.createServer(function(req, res) { + res.end(chunk); + }); + + server.listen(common.PORT, function() { + bench.http({ + connections: conf.connections, + }, function() { + server.close(); + }); + }); +} +``` + +Supported options keys are: +* `port` - defaults to `common.PORT` +* `path` - defaults to `/` +* `connections` - number of concurrent connections to use, defaults to 100 +* `duration` - duration of the benchmark in seconds, defaults to 10 +* `benchmarker` - benchmarker to use, defaults to +`common.default_http_benchmarker` + +[autocannon]: https://github.com/mcollina/autocannon +[wrk]: https://github.com/wg/wrk +[t-test]: https://en.wikipedia.org/wiki/Student%27s_t-test#Equal_or_unequal_sample_sizes.2C_unequal_variances +[git-for-windows]: http://git-scm.com/download/win diff --git a/test/README.md b/test/README.md index 5ed028a19631d6..8635dea3140e31 100644 --- a/test/README.md +++ b/test/README.md @@ -1,147 +1,154 @@ -# Table of Contents -* [Test directories](#test-directories) -* [Common module API](#common-module-api) - -## Test Directories - -### abort - -Tests for when the `--abort-on-uncaught-exception` flag is used. - -| Runs on CI | -|:----------:| -| No | - -### addons - -Tests for [addon](https://nodejs.org/api/addons.html) functionality along with -some tests that require an addon to function properly. - - -| Runs on CI | -|:----------:| -| Yes | - -### cctest - -C++ test that is run as part of the build process. - -| Runs on CI | -|:----------:| -| Yes | - -### debugger - -Tests for [debugger](https://nodejs.org/api/debugger.html) functionality. - -| Runs on CI | -|:----------:| -| No | - -### disabled - -Tests that have been disabled from running for various reasons. - -| Runs on CI | -|:----------:| -| No | - -### fixtures - -Test fixtures used in various tests throughout the test suite. - -### gc - -Tests for garbage collection related functionality. - -| Runs on CI | -|:----------:| -| No | - +# Node.js Core Tests -### inspector +This folder contains code and data used to test the Node.js implementation. -Tests for the V8 inspector integration. +For a detailed guide on how to write tests in this +directory, see [the guide on writing tests](../doc/guides/writing-tests.md). -| Runs on CI | -|:----------:| -| Yes | +On how to run tests in this direcotry, see +[the contributing guide](../CONTRIBUTING.md#step-5-test). -### internet +## Table of Contents -Tests that make real outbound connections (mainly networking related modules). -Tests for networking related modules may also be present in other directories, -but those tests do not make outbound connections. - -| Runs on CI | -|:----------:| -| No | - -### known_issues - -Tests reproducing known issues within the system. - -| Runs on CI | -|:----------:| -| No | - -### message - -Tests for messages that are output for various conditions (`console.log`, -error messages etc.) - -| Runs on CI | -|:----------:| -| Yes | - -### parallel - -Various tests that are able to be run in parallel. - -| Runs on CI | -|:----------:| -| Yes | - -### pummel - -Various tests for various modules / system functionality operating under load. 
- -| Runs on CI | -|:----------:| -| No | - -### sequential - -Various tests that are run sequentially. - -| Runs on CI | -|:----------:| -| Yes | - -### testpy - -Test configuration utility used by various test suites. - -### tick-processor - -Tests for the V8 tick processor integration. The tests are for the logic in -`lib/internal/v8_prof_processor.js` and `lib/internal/v8_prof_polyfill.js`. The -tests confirm that the profile processor packages the correct set of scripts -from V8 and introduces the correct platform specific logic. - -| Runs on CI | -|:----------:| -| No | - -### timers - -Tests for [timing utilities](https://nodejs.org/api/timers.html) (`setTimeout` -and `setInterval`). +* [Test directories](#test-directories) +* [Common module API](#common-module-api) -| Runs on CI | -|:----------:| -| No | +## Test Directories + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
DirectoryRuns on CIPurpose
abortNo + Tests for when the --abort-on-uncaught-exception + flag is used. +
addonsYes + Tests for addon + functionality along with some tests that require an addon to function + properly. +
cctestYes + C++ test that is run as part of the build process. +
debuggerNo + Tests for debugger functionality. +
disabledNo + Tests that have been disabled from running for various reasons. +
fixturesTest fixtures used in various tests throughout the test suite.
gcNoTests for garbage collection related functionality.
inspectorYesTests for the V8 inspector integration.
internetNo + Tests that make real outbound connections (mainly networking related + modules). Tests for networking related modules may also be present in + other directories, but those tests do not make outbound connections. +
known_issuesNoTests reproducing known issues within the system.
messageYes + Tests for messages that are output for various conditions + (console.log, error messages etc.)
parallelYesVarious tests that are able to be run in parallel.
pummelNo + Various tests for various modules / system functionality operating + under load. +
sequentialYes + Various tests that are run sequentially. +
testpy + Test configuration utility used by various test suites. +
tick-processorNo + Tests for the V8 tick processor integration. The tests are for the + logic in lib/internal/v8_prof_processor.js and + lib/internal/v8_prof_polyfill.js. The tests confirm that + the profile processor packages the correct set of scripts from V8 and + introduces the correct platform specific logic. +
timersNo + Tests for + timing utilities + (setTimeout and setInterval). +
## Common module API