Forking for every test file is slow #78

Closed
floatdrop opened this issue Oct 17, 2015 · 44 comments
Labels: blocked (waiting on something else to be resolved first), bug (current functionality does not work as desired), help wanted, priority

Comments

@floatdrop
Contributor

Update: Blocked by #696


It doesn't seem like a good idea to spawn a fork for every test file. I tried a fresh version of AVA in got and test run time got worse (not to mention that iTunes gets icky while the tests run).

Maybe it would be better to fork batched runners and/or have a concurrency flag?

@kevva
Contributor

kevva commented Oct 19, 2015

Aren't we setting some kind of limit on forks? We should probably not create more forks than there are CPUs.

@sindresorhus
Member

We're not, and tbh it's annoying that we'll probably have to. Why can't the system just pool the load for us instead of us doing it manually? I hate computers so much... Let's experiment with what kind of number is optimal. I'm thinking (require('os').cpus().length || 1) * 2;.
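
A minimal sketch of what that cap could look like (not AVA's actual implementation, just an illustration of keeping at most that many forks in flight and queueing the rest):

var os = require('os');
var childProcess = require('child_process');

var limit = (os.cpus().length || 1) * 2;
var queue = [];
var active = 0;

function runNext() {
    // Keep at most `limit` forks in flight; start another as each one exits.
    while (active < limit && queue.length > 0) {
        active++;
        childProcess.fork(queue.shift()).on('exit', function () {
            active--;
            runNext();
        });
    }
}

function run(testFiles) {
    queue = queue.concat(testFiles);
    runNext();
}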

@sindresorhus
Member

We should also look into why forking is so slow. Maybe there's something we can optimize there.

@floatdrop
Contributor Author

@sindresorhus forking is fast; spinning up Node and Babel is quite slow.

@sindresorhus
Member

Node.js isn't inherently slow to spin up; it's requiring things that's slow. Babel caches the compilation, so that should only be an issue on a cold load.

@sindresorhus
Member

We could use https://github.com/jaguard/time-require to measure whether any of our requires are slowing down the startup. Anyone interested in looking into this?
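
If I'm reading that project's README right, usage is just loading it before everything else; it then prints a per-module timing report when the process exits:

// Must be the very first require so it can instrument the module loader.
require('time-require');

// Every subsequent require is now timed and reported at exit.
var test = require('ava');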

@vadimdemedes
Contributor

I checked and babel takes the most time to require. I'm planning to migrate AVA to Babel 6 (#117), will see if there are any time wins.

@sindresorhus
Member

I ran time-require on a forked test file and here are the results:

 #  module                           time  %
 1  has-color (../...lor/index.js)   20ms  ▇ 2%
 2  chalk (../node...alk/index.js)   26ms  ▇ 3%
 3  ./util.js (../...main/util.js)   12ms  ▇ 1%
 4  bluebird (../n.../bluebird.js)  107ms  ▇▇▇▇ 11%
 5  esprima (../no...a/esprima.js)   78ms  ▇▇▇ 8%
 6  array-foreach...ach/index.js)    15ms  ▇ 1%
 7  escallmatch (....tch/index.js)  109ms  ▇▇▇▇ 11%
 8  ./lib/decorato...decorator.js)  113ms  ▇▇▇▇ 11%
 9  define-propert...ies/index.js)   12ms  ▇ 1%
10  empower (../no...wer/index.js)  127ms  ▇▇▇▇ 13%
11  ./strategies (...trategies.js)   14ms  ▇ 1%
12  stringifier (....ier/index.js)   18ms  ▇ 2%
13  esprima (../no...a/esprima.js)   53ms  ▇▇ 5%
14  ./traverse (...../traverse.js)   59ms  ▇▇ 6%
15  ./javascript/d...ompressed.js)   29ms  ▇ 3%
16  googlediff (.....iff/index.js)   29ms  ▇ 3%
17  ./udiff (../no...lib/udiff.js)   31ms  ▇ 3%
18  ./lib/create (...ib/create.js)  119ms  ▇▇▇▇ 12%
19  power-assert-f...ter/index.js)  120ms  ▇▇▇▇ 12%
20  ./assert (../n...ib/assert.js)  289ms  ▇▇▇▇▇▇▇▇▇ 29%
21  ./test (../nod.../lib/test.js)  292ms  ▇▇▇▇▇▇▇▇▇ 29%
22  ./lib/runner (...ib/runner.js)  400ms  ▇▇▇▇▇▇▇▇▇▇▇▇ 40%
23  https (https)                    27ms  ▇ 3%
24  ./lib/_stream_..._readable.js)   17ms  ▇ 2%
25  readable-strea.../readable.js)   26ms  ▇ 3%
26  once (../node_...once/once.js)   13ms  ▇ 1%
27  end-of-stream...eam/index.js)    15ms  ▇ 1%
28  duplexify (../...ify/index.js)   44ms  ▇▇ 4%
29  ./_stream_read...eam_readable)   22ms  ▇ 2%
30  ./lib/_stream_...am_duplex.js)   23ms  ▇ 2%
31  readable-strea.../readable.js)   35ms  ▇▇ 3%
32  read-all-strea...eam/index.js)   39ms  ▇▇ 4%
33  node-status-co...des/index.js)   18ms  ▇ 2%
34  ../ (../index.js)               361ms  ▇▇▇▇▇▇▇▇▇▇▇ 36%
Total require(): 327
Total time: 1s

1 second is way too much for just requiring dependencies. We should look into optimizing that somehow.

@Qix-
Contributor

Qix- commented Nov 2, 2015

Nice tool, I'll add that to my toolbox. 💃

@sindresorhus
Member

We could also cache the "require resolving", so every test file doesn't have to do that.

Maybe by using https://github.com/bahmutov/cache-require-paths by @bahmutov.
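
If that module works the way its README suggests, wiring it up would be a one-liner at the top of each forked process:

// Hypothetical wiring: must run before all other requires so the
// path-resolution cache is in place for them.
require('cache-require-paths');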

@sindresorhus
Member

Node.js 5.2.0 is also ~45% faster to start up! :) https://twitter.com/rvagg/status/674775383967858688

@jamestalmage
Contributor

Maybe by using https://github.com/bahmutov/cache-require-paths by @bahmutov.

I'm not so sure about that. npm 3's automatic flattening and deduping of the dependency tree really sped up resolve times. Also, I don't see if or how it decides to invalidate the cache. That seems iffy to me, especially for a testing framework, where it's fairly reasonable to expect the dependency tree to change frequently.

Node.js 5.2.0 is also ~45% faster to start up!

That is huge.

@jamestalmage
Contributor

I think the biggest improvement will come from moving Babel to the main thread.

@sindresorhus
Member

I think the biggest improvement will come from moving Babel to the main thread.

Definitely. Was just looking for other minor wins.

@jamestalmage
Contributor

I guess it's worth a shot. I just have doubts it is going to be effective with us spawning so many processes (it doesn't coordinate across processes), and dealing with a relatively volatile dependency tree. Seems better suited to servers than dev boxes.

@sindresorhus
Member

Agree. Ignore that.

@ariporad
Contributor

@sindresorhus: I have a really stupid idea: what if we webpack'd (or something similar) AVA and all of its dependencies (except the CLI), and had the child processes use that? Then they wouldn't have to do any requiring. If we uglified it as well, it would also greatly reduce parsing time.

@ariporad
Contributor

@sindresorhus: Or, a slightly more sane idea: make a map of every file in our dependency tree to its contents, then hook require to use those instead of the file system. Example:

var files = new Map([
    ["./lib/test.js", "'use strict';\nvar isGeneratorFn = require('is-generator-fn');\nvar maxTimeout = require('max-timeout');\nvar Promise = require('bluebird');\nvar fnName = require('fn-name');\nvar co = require('co-with-promise');\nvar observableToPromise = require('observable-to-promise');\nvar isPromise = require('is-promise');\nvar isObservable = require('is-observable');\nvar assert = require('./assert');\nvar enhanceAssert = require('./enhance-assert');\nvar globals = require('./globals');\n\nfunction Test(title, fn) {\nif (!(this instanceof Test)) {\n\t\treturn new Test(title, fn);\n\t}\n\n\tif (typeof title === 'function') {\n\t\tfn = title;\n\t\ttitle = null;\n\t}\n\n\tassert.is(typeof fn, 'function', 'you must provide a callback');\n\n\tthis.title = title || fnName(fn) || '[anonymous]';\n\tthis.fn = isGeneratorFn(fn) ? co.wrap(fn) : fn;\n\tthis.assertions = [];\n\tthis.planCount = null;\n\tthis.duration = null;\n\tthis.assertError = undefined;\n\n\tObject.defineProperty(this, 'assertCount', {\n\t\tenumerable: true,\n\t\tget: function () {\n\t\t\treturn this.assertions.length;\n\t\t}\n\t});\n\n\t// TODO(jamestalmage): make this an optional constructor arg instead of having Runner set it after the fact.\n\t// metadata should just always exist, otherwise it requires a bunch of ugly checks all over the place.\n\tthis.metadata = {};\n\n\t// store the time point before test execution\n\t// to calculate the total time spent in test\n\tthis._timeStart = null;\n\n\t// workaround for Babel giving anonymous functions a name\n\tif (this.title === 'callee$0$0') {\n\t\tthis.title = '[anonymous]';\n\t}\n}\n\nmodule.exports = Test;\n\nTest.prototype._assert = function (promise) {\n\tthis.assertions.push(promise);\n};\n\nTest.prototype._setAssertError = function (err) {\n\tif (this.assertError !== undefined) {\n\t\treturn;\n\t}\n\n\tif (err === undefined) {\n\t\terr = 'undefined';\n\t}\n\n\tthis.assertError = err;\n};\n\nTest.prototype.plan = function (count) {\n\tif (typeof count !== 'number') {\n\t\tthrow new TypeError('Expected a number');\n\t}\n\n\tthis.planCount = count;\n\n\t// in case the `planCount` doesn't match `assertCount,\n\t// we need the stack of this function to throw with a useful stack\n\tthis.planStack = new Error().stack;\n};\n\nTest.prototype.run = function () {\n\tvar self = this;\n\n\tthis.promise = Promise.pending();\n\n\t// TODO(vdemedes): refactor this to avoid storing the promise\n\tif (!this.fn) {\n\t\tthis.exit();\n\t\treturn undefined;\n\t}\n\n\tthis._timeStart = globals.now();\n\n\t// wait until all assertions are complete\n\tthis._timeout = globals.setTimeout(function () {}, maxTimeout);\n\n\tvar ret;\n\n\ttry {\n\t\tret = this.fn(this._publicApi());\n\t} catch (err) {\n\t\tthis._setAssertError(err);\n\t\tthis.exit();\n\t}\n\n\tvar asyncType = 'promises';\n\n\tif (isObservable(ret)) {\n\t\tasyncType = 'observables';\n\t\tret = observableToPromise(ret);\n\t}\n\n\tif (isPromise(ret)) {\n\t\tif (this.metadata.callback) {\n\t\t\tself._setAssertError(new Error('Do not return ' + asyncType + ' from tests declared via `test.cb(...)`, if you want to return a promise simply declare the test via `test(...)`'));\n\t\t\tthis.exit();\n\t\t\treturn this.promise.promise;\n\t\t}\n\n\t\tret\n\t\t\t.then(function () {\n\t\t\t\tself.exit();\n\t\t\t})\n\t\t\t.catch(function (err) {\n\t\t\t\tself._setAssertError(err);\n\t\t\t\tself.exit();\n\t\t\t});\n\t} else if (!this.metadata.callback) {\n\t\tthis.exit();\n\t}\n\n\treturn 
this.promise.promise;\n};\n\nObject.defineProperty(Test.prototype, 'end', {\n\tget: function () {\n\t\tif (this.metadata.callback) {\n\t\t\treturn this._end.bind(this);\n\t\t}\n\t\tthrow new Error('t.end is not supported in this context. To use t.end as a callback, you must use \"callback mode\" via `test.cb(testName, fn)` ');\n\t}\n});\n\nTest.prototype._end = function (err) {\n\tif (err) {\n\t\tthis._setAssertError(new assert.AssertionError({\n\t\t\tactual: err,\n\t\t\tmessage: 'Callback called with an error → ' + err,\n\t\t\toperator: 'callback'\n\t\t}));\n\n\t\tthis.exit();\n\t\treturn;\n\t}\n\n\tif (this.endCalled) {\n\t\tthrow new Error('.end() called more than once');\n\t}\n\n\tthis.endCalled = true;\n\tthis.exit();\n};\n\nTest.prototype._checkPlanCount = function () {\n\tif (this.assertError === undefined && this.planCount !== null && this.planCount !== this.assertions.length) {\n\t\tthis._setAssertError(new assert.AssertionError({\n\t\t\tactual: this.assertions.length,\n\t\t\texpected: this.planCount,\n\t\t\tmessage: 'Assertion count does not match planned',\n\t\t\toperator: 'plan'\n\t\t}));\n\n\t\t//this.assertError.stack = this.planStack;\n\t}\n};\n\nTest.prototype.exit = function () {\n\tvar self = this;\n\n\tthis._checkPlanCount();\n\n\tPromise.all(this.assertions)\n\t\t.catch(function (err) {\n\t\t\tself._setAssertError(err);\n\t\t})\n\t\t.finally(function () {\n\t\t\t// calculate total time spent in test\n\t\t\tself.duration = globals.now() - self._timeStart;\n\n\t\t\t// stop infinite timer\n\t\t\tglobals.clearTimeout(self._timeout);\n\n\t\t\tself._checkPlanCount();\n\n\t\t\tif (!self.ended) {\n\t\t\t\tself.ended = true;\n\n\t\t\t\tglobals.setImmediate(function () {\n\t\t\t\t\tif (self.assertError !== undefined) {\n\t\t\t\t\t\tself.promise.reject(self.assertError);\n\t\t\t\t\t\treturn;\n\t\t\t\t\t}\n\n\t\t\t\t\tself.promise.resolve(self);\n\t\t\t\t});\n\t\t\t}\n\t\t});\n};\n\nTest.prototype._publicApi = function () {\n\tvar self = this;\n\tvar api = {skip: {}};\n\n\t// Getters\n\t[\n\t\t'assertCount',\n\t\t'title',\n\t\t'end'\n\t]\n\t\t.forEach(function (name) {\n\t\t\tObject.defineProperty(api, name, {\n\t\t\t\tenumerable: name === 'end' ? 
self.metadata.callback : true,\n\t\t\t\tget: function () {\n\t\t\t\t\treturn self[name];\n\t\t\t\t}\n\t\t\t});\n\t\t});\n\n\t// Get / Set\n\tObject.defineProperty(api, 'context', {\n\t\tenumerable: true,\n\t\tget: function () {\n\t\t\treturn self.context;\n\t\t},\n\t\tset: function (context) {\n\t\t\tself.context = context;\n\t\t}\n\t});\n\n\t// Bound Functions\n\tapi.plan = this.plan.bind(this);\n\n\tfunction skipFn() {\n\t\tself._assert(Promise.resolve());\n\t}\n\n\tfunction onAssertionEvent(event) {\n\t\tvar promise;\n\t\tif (event.assertionThrew) {\n\t\t\tevent.error.powerAssertContext = event.powerAssertContext;\n\t\t\tpromise = Promise.reject(event.error);\n\t\t} else {\n\t\t\tpromise = Promise.resolve(observableToPromise(event.returnValue));\n\t\t}\n\t\tpromise = promise\n\t\t\t.catch(function (err) {\n\t\t\t\terr.originalMessage = event.originalMessage;\n\t\t\t\treturn Promise.reject(err);\n\t\t\t});\n\t\tself._assert(promise);\n\t}\n\n\tvar enhanced = enhanceAssert({\n\t\tassert: assert,\n\t\tonSuccess: onAssertionEvent,\n\t\tonError: onAssertionEvent\n\t});\n\n\t// Patched assert methods: increase assert count and store errors.\n\tObject.keys(assert).forEach(function (el) {\n\t\tapi.skip[el] = skipFn;\n\t\tapi[el] = enhanced[el].bind(enhanced);\n\t});\n\n\tapi._capt = enhanced._capt.bind(enhanced);\n\tapi._expr = enhanced._expr.bind(enhanced);\n\n\treturn api;\n};\n"],
    // etc.
]);

var Module = require('module');

var actualRequire = Module._extensions['.js'];
Module._extensions['.js'] = function (module, filename) {
    // Compile from the in-memory map when we have the file cached,
    // otherwise fall back to the real loader.
    if (files.has(filename)) {
        return module._compile(files.get(filename), filename);
    }
    return actualRequire.apply(this, arguments);
};

require('./lib/babel.js');

If we uglified the files with "whitespace only" mode, it could totally* work!

*Probably

@Qix-
Contributor

Qix- commented Dec 27, 2015

Isn't the overhead of forking the actual forking bit? The allocation the OS does to fork processes is where the overhead really comes from, unless it's also doing Babel compilation for each fork...

@jamestalmage
Contributor

unless it's also doing Babel compilation for each fork

That is currently the case; #349 fixes it. Currently, assuming a cache miss, we spend 700ms just requiring stuff in each child process. In #349 that drops to 150ms. Those numbers are from my laptop with super speedy SSDs; people using traditional hard drives will see significantly worse performance in either case, but will benefit a lot more from #349.

I don't think we should be concatenating files. That is a lot of complexity, for what I think is going to be minimal gains.

@Qix-
Contributor

Qix- commented Dec 27, 2015

Why can't the system just pool the load for us instead of us doing it manually?

Because that would put a lot of scheduling engineers out of jobs 💃 /s


Agreed with @jamestalmage. Concatenation is a fix for something that isn't broken. I wish it were that easy.

@novemberborn
Member

I don't know enough about modern scheduling regarding forks and CPU cores, but presumably we want to max out the CPU so tests run faster, and spinning up as many forks as possible achieves that. I'd be more concerned with available RAM and causing swap. That's probably heavily dependent on the code under test, though.

There might be an argument for specifying the number of concurrent forks if this is a concern, however it would limit us in other ways. Currently we gather tests from all forks before running any of them, and things like .only are supposed to apply across forks. We might have to carefully spin forks up and down to do this analysis and then start the actual test run, assuming the test files don't change in between. (I guess we could copy them to a temporary directory?)

We've worked hard to reduce Babel overhead for the test files themselves, but users probably still run into this with custom --require hooks. Conceivably we could provide a framework where even those hooks are loaded in the main process, similarly to the test files. An alternative approach would be to transpile in a separate build process (this is my preferred approach).

@novemberborn
Member

Tracking my last suggestion in #577.

@jprichardson

I have a very big project that I wanted to try AVA on. Watching Activity Monitor (Mac OS X 10.11), I saw 50+ node processes; it brought my system to its knees. It'd be cool if there were some kind of limit.

@jamestalmage jamestalmage self-assigned this Apr 1, 2016
@jamestalmage
Contributor

@spudly - I actually have a start on this already. I will post some code soon.

Basically, it needs to help identify test.only, test.serial.only, etc.

@jamestalmage
Contributor

Technically, people don't need to use test as their variable name when importing ava:

import ava from 'ava';

ava(t => t.is(...));

Also, I have done this in the past:

// now all the tests are serial:
var test = require('ava').serial;

test(t => t.is(...));

I have since decided that is bad form, and that it's just better to do test.serial each time (clarity is more important than brevity). Still, I think we could support it.


More problematic is when the use of only is wrapped in a conditional:

// fictional module that analyzes math expressions in strings:
import calculator from 'calculator-parser';
import ava from 'ava';

function test(input, expected, only) {
  const method = only ? ava.only : ava;
  method(input + ' === ' + expected, t => {
    const actual = calculator.calculate(input);
    t.is(actual, expected);
  });
}

test('3 + 4', 7);
test('5 - 2', 3);
test('6 * 3', 18, true); // causes this to be interpreted as an exclusive test
// ...

I do not see any way to reliably do static analysis on this. That is a bit of a bummer.
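
To make that concrete: the direct cases are easy to catch with an AST walk. Below is a rough sketch using esprima (parser choice is just for illustration; real test files would need a parser that handles whatever syntax Babel accepts). The `method = only ? ava.only : ava` indirection above never produces a `.only(...)` call expression, which is exactly why it slips through:

var esprima = require('esprima');

// Rough sketch: flag any `<expr>.only(...)` call. This catches
// test.only(...) and test.serial.only(...), but not calls made
// through an aliased reference like `method` above.
function hasExclusiveTest(source) {
    var found = false;

    (function walk(node) {
        if (!node || typeof node.type !== 'string' || found) {
            return;
        }

        if (node.type === 'CallExpression' &&
            node.callee.type === 'MemberExpression' &&
            !node.callee.computed &&
            node.callee.property.name === 'only') {
            found = true;
            return;
        }

        // Recurse into every child node or array of child nodes.
        Object.keys(node).forEach(function (key) {
            var child = node[key];

            if (Array.isArray(child)) {
                child.forEach(walk);
            } else if (child && typeof child.type === 'string') {
                walk(child);
            }
        });
    })(esprima.parse(source));

    return found;
}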

@jamestalmage jamestalmage added bug current functionality does not work as desired priority labels Apr 1, 2016
@jamestalmage jamestalmage added blocked waiting on something else to be resolved first and removed enhancement new functionality labels Apr 1, 2016
@novemberborn
Member

Tracking the management of the forks in #718.

@ORESoftware

ORESoftware commented May 12, 2016

Why not transpile tests first, then execute all child processes with plain node? :) maybe you already do that?

I am willing to bet that this would speed things up tremendously, and you would catch transpile errors early too. But you'd have to have some temp directory somewhere, which most people think is unsexy.

@novemberborn
Member

@ORESoftware yup, that's how it works. The problem is executing all child processes at once.

@ORESoftware

ORESoftware commented May 12, 2016

@novemberborn not from what I gather. In this thread the main complaint is that requiring Babel is slow. TMK you wouldn't need to require Babel at all if the test files were already transpiled. Am I right? What am I missing?

Forking child processes is not slow at all in my experience; it's more or less the standard ~50ms of waiting per fork. It's Babel that is really slowing you down.

@jamestalmage
Contributor

jamestalmage commented May 12, 2016

@ORESoftware

We transpile tests in the main process; the forks don't transpile tests at all. The remaining problem is how to handle transpiling sources. Currently, you have two choices:

  1. Transpile your sources before running the tests. This probably gives the best performance, but requires a good deal of boilerplate, especially to support watch mode.
  2. Use babel-register to transpile sources on the fly. This is much easier to set up (just --require babel-register; see the sketch below), but your per-fork cost is significant.
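
As a rough illustration of option 2, the package.json wiring would look something like this (hedged; check the docs for the exact config shape):

{
  "ava": {
    "require": ["babel-register"]
  }
}

Option 1 is just whatever build step you already have (e.g. babel src --out-dir dist) run before ava.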

Forking child processes is not slow at all in my experience; it's more or less the standard ~50ms of waiting per fork. It's Babel that is really slowing you down.

Agreed, but some users have hundreds of test files, and they have reached the point where they are experiencing memory/disk thrashing that is impacting performance. #791 will address that.

Also, we intend to add support for reusing forked processes at some point (#789). That will have some major downsides (mainly that it breaks test isolation), so it will be opt-in. 50ms per file is not insignificant when dealing with hundreds of files.

@Qix-
Contributor

Qix- commented May 12, 2016

So have we actually profiled AVA to determine that Babel is indeed the culprit? Or is it speculation?

Something else to consider: forking for every test file at once is going to cause problems. Not sure if that's what's happening, but if it is, it causes hella context switches between the child processes.

@jamestalmage
Contributor

So have we actually profiled AVA to determine that Babel is indeed the culprit? Or is it speculation?

Definitely not speculation. Moving Babel to the main thread for tests created significant speed improvements. It stands to reason the same improvements could be realized by moving source transpilation out of the forks.

Something else to consider: forking for every test file at once is going to cause problems.

Already considered. See #791.

@Qix-
Contributor

Qix- commented May 12, 2016

Perfect, fair enough @jamestalmage :)

@ORESoftware

ORESoftware commented May 12, 2016

@jamestalmage sorry, I don't know what you mean by transpiling sources vs transpiling tests; can you briefly explain? It seems like the best chance of getting to the speed criterion is to not have to require an iota of babelness in your child processes.

@jamestalmage
Contributor

We use Babel to transpile your test files. We do this in the main process, then write the transpilation result out to disk. When we fork a child process, we don't actually load your test file, we load that transpilation result. (We use node_modules/.cache to store the transpilation).
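
A stripped-down sketch of that caching scheme (the real code also factors compiler options into the cache key and handles source maps; babel.transform is the Babel 6 babel-core API):

var crypto = require('crypto');
var fs = require('fs');
var path = require('path');
var babel = require('babel-core');

// Assumes this directory already exists (real code would mkdirp it).
var cacheDir = path.join('node_modules', '.cache', 'ava');

// Returns the path the fork should load instead of `filename`.
function transpiledPath(filename) {
    var source = fs.readFileSync(filename, 'utf8');
    var hash = crypto.createHash('md5').update(source).digest('hex');
    var cachePath = path.join(cacheDir, hash + '.js');

    // A matching content hash means we can reuse the old transpilation.
    if (!fs.existsSync(cachePath)) {
        var result = babel.transform(source, {filename: filename});
        fs.writeFileSync(cachePath, result.code);
    }

    return cachePath;
}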

Doing the same for sources is much more difficult, because we can't tell what needs to be transpiled without attempting static analysis.

@ORESoftware

ORESoftware commented May 13, 2016

ok, by sources you mean the source code that the test file may require/reference? Sounds like it. Yeah, IDK.

@dcousineau
Contributor

@jamestalmage AVA could 'cheat' and use the --source flag as a hint to pre-compile everything that matches (even if it doesn't catch everything and it compiles more than it has to).

@novemberborn
Member

@dcousineau yup, see #577.

@jamestalmage
Contributor

jamestalmage commented May 13, 2016

I am closing this; it's just too general. We are tracking specific improvements in the following issues.
