-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Forking for every test file is slow #78
Comments
Aren't we setting some kind of limit of forks? We should probably not create more forks than there are CPUs. |
We're not and tbh it's annoying that we'll probably have too. Why can't the system just pool the load for us instead of we doing it manually? I hate computers so much... Let's experiement with what kind of number is optimal. I'm thinking |
We should also look into why forking is so slow. Maybe there's something we can optimize there. |
@sindresorhus forking is fast, spinning up Node and babel is quite slow. |
Node.js isn't inherently slow to spin up, it's requiring things that are slow. Babel caches the compilation, so that should only be an issue on the a cold load. |
We could use https://github.com/jaguard/time-require to measure if any of our |
I checked and babel takes the most time to require. I'm planning to migrate AVA to Babel 6 (#117), will see if there are any time wins. |
I ran
1 second is way too much for just requiring dependencies. We should look into optimizing that somehow. |
Nice tool, I'll add that to my toolbox. 💃 |
We could also cache the "require resolving", so every test files doesn't have to do that. Maybe by using https://github.com/bahmutov/cache-require-paths by @bahmutov. |
Node.js 5.2.0 is also ~45% faster to startup! :) https://twitter.com/rvagg/status/674775383967858688 |
I'm not so sure about that. NPM 3's automatic flattening and deduping of the dependency tree really sped up resolve times. Also, I don't see if or how it decides to invalidate the the cache. That seems iffy to me, especially for a testing framework where it's fairly reasonable to expect their dependency tree will change frequently.
That is huge. |
I think the biggest improvement will come from moving Babel to the main thread. |
Definitely. Was just looking for other minor wins. |
I guess it's worth a shot. I just have doubts it is going to be effective with us spawning so many processes (it doesn't coordinate across processes), and dealing with a relatively volatile dependency tree. Seems better suited to servers than dev boxes. |
Agree. Ignore that. |
@sindresorhus: I have a really stupid idea: What if we |
@sindresorhus: Or, a slightly more sane idea: Make a map of every file in our dependency tree mapped to it's contents, then hook require to use those instead of the file system. Example: var files = new Map([
["./lib/test.js", "'use strict';\nvar isGeneratorFn = require('is-generator-fn');\nvar maxTimeout = require('max-timeout');\nvar Promise = require('bluebird');\nvar fnName = require('fn-name');\nvar co = require('co-with-promise');\nvar observableToPromise = require('observable-to-promise');\nvar isPromise = require('is-promise');\nvar isObservable = require('is-observable');\nvar assert = require('./assert');\nvar enhanceAssert = require('./enhance-assert');\nvar globals = require('./globals');\n\nfunction Test(title, fn) {\nif (!(this instanceof Test)) {\n\t\treturn new Test(title, fn);\n\t}\n\n\tif (typeof title === 'function') {\n\t\tfn = title;\n\t\ttitle = null;\n\t}\n\n\tassert.is(typeof fn, 'function', 'you must provide a callback');\n\n\tthis.title = title || fnName(fn) || '[anonymous]';\n\tthis.fn = isGeneratorFn(fn) ? co.wrap(fn) : fn;\n\tthis.assertions = [];\n\tthis.planCount = null;\n\tthis.duration = null;\n\tthis.assertError = undefined;\n\n\tObject.defineProperty(this, 'assertCount', {\n\t\tenumerable: true,\n\t\tget: function () {\n\t\t\treturn this.assertions.length;\n\t\t}\n\t});\n\n\t// TODO(jamestalmage): make this an optional constructor arg instead of having Runner set it after the fact.\n\t// metadata should just always exist, otherwise it requires a bunch of ugly checks all over the place.\n\tthis.metadata = {};\n\n\t// store the time point before test execution\n\t// to calculate the total time spent in test\n\tthis._timeStart = null;\n\n\t// workaround for Babel giving anonymous functions a name\n\tif (this.title === 'callee$0$0') {\n\t\tthis.title = '[anonymous]';\n\t}\n}\n\nmodule.exports = Test;\n\nTest.prototype._assert = function (promise) {\n\tthis.assertions.push(promise);\n};\n\nTest.prototype._setAssertError = function (err) {\n\tif (this.assertError !== undefined) {\n\t\treturn;\n\t}\n\n\tif (err === undefined) {\n\t\terr = 'undefined';\n\t}\n\n\tthis.assertError = err;\n};\n\nTest.prototype.plan = function (count) {\n\tif (typeof count !== 'number') {\n\t\tthrow new TypeError('Expected a number');\n\t}\n\n\tthis.planCount = count;\n\n\t// in case the `planCount` doesn't match `assertCount,\n\t// we need the stack of this function to throw with a useful stack\n\tthis.planStack = new Error().stack;\n};\n\nTest.prototype.run = function () {\n\tvar self = this;\n\n\tthis.promise = Promise.pending();\n\n\t// TODO(vdemedes): refactor this to avoid storing the promise\n\tif (!this.fn) {\n\t\tthis.exit();\n\t\treturn undefined;\n\t}\n\n\tthis._timeStart = globals.now();\n\n\t// wait until all assertions are complete\n\tthis._timeout = globals.setTimeout(function () {}, maxTimeout);\n\n\tvar ret;\n\n\ttry {\n\t\tret = this.fn(this._publicApi());\n\t} catch (err) {\n\t\tthis._setAssertError(err);\n\t\tthis.exit();\n\t}\n\n\tvar asyncType = 'promises';\n\n\tif (isObservable(ret)) {\n\t\tasyncType = 'observables';\n\t\tret = observableToPromise(ret);\n\t}\n\n\tif (isPromise(ret)) {\n\t\tif (this.metadata.callback) {\n\t\t\tself._setAssertError(new Error('Do not return ' + asyncType + ' from tests declared via `test.cb(...)`, if you want to return a promise simply declare the test via `test(...)`'));\n\t\t\tthis.exit();\n\t\t\treturn this.promise.promise;\n\t\t}\n\n\t\tret\n\t\t\t.then(function () {\n\t\t\t\tself.exit();\n\t\t\t})\n\t\t\t.catch(function (err) {\n\t\t\t\tself._setAssertError(err);\n\t\t\t\tself.exit();\n\t\t\t});\n\t} else if (!this.metadata.callback) {\n\t\tthis.exit();\n\t}\n\n\treturn this.promise.promise;\n};\n\nObject.defineProperty(Test.prototype, 'end', {\n\tget: function () {\n\t\tif (this.metadata.callback) {\n\t\t\treturn this._end.bind(this);\n\t\t}\n\t\tthrow new Error('t.end is not supported in this context. To use t.end as a callback, you must use \"callback mode\" via `test.cb(testName, fn)` ');\n\t}\n});\n\nTest.prototype._end = function (err) {\n\tif (err) {\n\t\tthis._setAssertError(new assert.AssertionError({\n\t\t\tactual: err,\n\t\t\tmessage: 'Callback called with an error → ' + err,\n\t\t\toperator: 'callback'\n\t\t}));\n\n\t\tthis.exit();\n\t\treturn;\n\t}\n\n\tif (this.endCalled) {\n\t\tthrow new Error('.end() called more than once');\n\t}\n\n\tthis.endCalled = true;\n\tthis.exit();\n};\n\nTest.prototype._checkPlanCount = function () {\n\tif (this.assertError === undefined && this.planCount !== null && this.planCount !== this.assertions.length) {\n\t\tthis._setAssertError(new assert.AssertionError({\n\t\t\tactual: this.assertions.length,\n\t\t\texpected: this.planCount,\n\t\t\tmessage: 'Assertion count does not match planned',\n\t\t\toperator: 'plan'\n\t\t}));\n\n\t\t//this.assertError.stack = this.planStack;\n\t}\n};\n\nTest.prototype.exit = function () {\n\tvar self = this;\n\n\tthis._checkPlanCount();\n\n\tPromise.all(this.assertions)\n\t\t.catch(function (err) {\n\t\t\tself._setAssertError(err);\n\t\t})\n\t\t.finally(function () {\n\t\t\t// calculate total time spent in test\n\t\t\tself.duration = globals.now() - self._timeStart;\n\n\t\t\t// stop infinite timer\n\t\t\tglobals.clearTimeout(self._timeout);\n\n\t\t\tself._checkPlanCount();\n\n\t\t\tif (!self.ended) {\n\t\t\t\tself.ended = true;\n\n\t\t\t\tglobals.setImmediate(function () {\n\t\t\t\t\tif (self.assertError !== undefined) {\n\t\t\t\t\t\tself.promise.reject(self.assertError);\n\t\t\t\t\t\treturn;\n\t\t\t\t\t}\n\n\t\t\t\t\tself.promise.resolve(self);\n\t\t\t\t});\n\t\t\t}\n\t\t});\n};\n\nTest.prototype._publicApi = function () {\n\tvar self = this;\n\tvar api = {skip: {}};\n\n\t// Getters\n\t[\n\t\t'assertCount',\n\t\t'title',\n\t\t'end'\n\t]\n\t\t.forEach(function (name) {\n\t\t\tObject.defineProperty(api, name, {\n\t\t\t\tenumerable: name === 'end' ? self.metadata.callback : true,\n\t\t\t\tget: function () {\n\t\t\t\t\treturn self[name];\n\t\t\t\t}\n\t\t\t});\n\t\t});\n\n\t// Get / Set\n\tObject.defineProperty(api, 'context', {\n\t\tenumerable: true,\n\t\tget: function () {\n\t\t\treturn self.context;\n\t\t},\n\t\tset: function (context) {\n\t\t\tself.context = context;\n\t\t}\n\t});\n\n\t// Bound Functions\n\tapi.plan = this.plan.bind(this);\n\n\tfunction skipFn() {\n\t\tself._assert(Promise.resolve());\n\t}\n\n\tfunction onAssertionEvent(event) {\n\t\tvar promise;\n\t\tif (event.assertionThrew) {\n\t\t\tevent.error.powerAssertContext = event.powerAssertContext;\n\t\t\tpromise = Promise.reject(event.error);\n\t\t} else {\n\t\t\tpromise = Promise.resolve(observableToPromise(event.returnValue));\n\t\t}\n\t\tpromise = promise\n\t\t\t.catch(function (err) {\n\t\t\t\terr.originalMessage = event.originalMessage;\n\t\t\t\treturn Promise.reject(err);\n\t\t\t});\n\t\tself._assert(promise);\n\t}\n\n\tvar enhanced = enhanceAssert({\n\t\tassert: assert,\n\t\tonSuccess: onAssertionEvent,\n\t\tonError: onAssertionEvent\n\t});\n\n\t// Patched assert methods: increase assert count and store errors.\n\tObject.keys(assert).forEach(function (el) {\n\t\tapi.skip[el] = skipFn;\n\t\tapi[el] = enhanced[el].bind(enhanced);\n\t});\n\n\tapi._capt = enhanced._capt.bind(enhanced);\n\tapi._expr = enhanced._expr.bind(enhanced);\n\n\treturn api;\n};\n"],
// etc.
]);
var actualRequire = require._extensions['.js'];
require._extensions['.js'] = function (module, filename) {
if (files.has(filename)) return module._compile(filename);
return actualRequire.apply(this, arguments);
};
require('./lib/babel.js'); If we uglified the files with "whitespace only" mode, it could totally* work! *Probobly |
Isn't the overhead of forking the actual forking bit? Like the allocation the OS does to fork processes is what the overhead really brings, unless it's also doing babel compilation for each fork... |
That is currently the case, #349 fixes it. Currently, assuming a cache miss - we spend I don't think we should be concatenating files. That is a lot of complexity, for what I think is going to be minimal gains. |
Because that would put a lot of scheduling engineers out of jobs 💃 /s Agreed with @jamestalmage. Concatenation is a fix for something that isn't broken. I wish it were that easy. |
I don't know enough about modern scheduling regarding forks and CPU cores, but presumably we want to max out CPU so tests run faster and spinning up as many forks as possible achieves that. It'd be more concerned with available RAM and causing swap. That's probably heavily dependent on the code under test though. There might be an argument for specifying the number of concurrent forks if this is a concern, however it would limit us in other ways. Currently we gather tests from all forks before running any of them, and things like We've worked hard to reduce Babel overhead for the test files themselves but users probably still run into this themselves using custom |
Tracking my last suggestion in #577. |
I have a very big project that I wanted to try ava on. After having the Activity Monitor (Mac OS X 10.11) open and seeing 50+ |
@spudly - I actually have a start on this already. I will post some code soon. Basically, it needs to help identify |
Technically, people don't need to use import ava from 'ava';
ava(t => t.is(...)); Also, I have done this in the past: // now all the tests are serial:
var test = require('ava').serial;
test(t => t.is(...)); I have since decided that is bad form, and that it's just better to do More problematic is if use of // fictional module that analyzes math expressions in strings:
import calculator from 'calculator-parser';
import ava from 'ava';
function test(input, expected, only) {
const method = only ? ava.only : ava;
method(input + ' === ' + expected, t => {
const actual = calculator.calculate(input);
t.is(actual, expected);
});
}
test('3 + 4', 7);
test('5 - 2', 3);
test('6 * 3', 18, true); // causes this to be interpreted as an exclusive test
// ... I do not see any way to reliably do static analysis on this. That is a bit of a bummer. |
Tracking the management of the forks in #718. |
Why not transpile tests first, then execute all child processes with plain node? :) maybe you already do that? I am willing to bet that this would speed things up tremendously and you would catch transpile errors early too. But you'd have to have some temp directory somewhere which most people think is unsexy. |
@ORESoftware yup, that's how it works. The problem is executing all child processes at once. |
@novemberborn not from what I gather. On this thread the main complaint is that requiring babel is slow. TMK you wouldn't need to require babel if all test files were already transpiled. Am I right? What am I missing? forking cps is not slow at all in my experience, it's more or less the standard ~50ms of waiting per fork, it's babel that is really slowing you down. |
We transpile tests in the main process, the forks don't transpile tests at all. The remaining problem is how to handle transpiling sources. Currently, you have two choices:
Agreed, but some users have hundreds of test files, and they have reached the point where they are experiencing memory/disk thrashing that is impacting performance. #791 will address that. Also, we intend to add support for reusing forked processes at some point (#789). That will have some major downsides (mainly that it breaks test isolation), so it will be opt-in. |
So have we actually profiled AVA to determine that babel is indeed the culprit? Or is it speculation? Something else to consider - forking for every test at once is going to cause problems. Not sure if that's what's happening but that causes hella context switches between the child processes if it is. |
Definitely not speculation. Moving Babel to the main thread for tests created significant speed improvements. It stands to reason the same improvements could be realized by moving source transpilation out of the forks.
Already considered. See #791. |
Perfect, fair enough @jamestalmage :) |
@jamestalmage sorry I don't know what you mean by transpiling sources vs transpiling tests - I frankly have no idea what that means - can you briefly explain? it seems like the best chance of getting towards the speed criterion is not have to require an iota of babelness in your child_processes. |
We use Babel to transpile your test files. We do this in the main process, then write the transpilation result out to disk. When we fork a child process, we don't actually load your test file, we load that transpilation result. (We use Doing the same for sources is much more difficult, because we can't tell what needs to be transpiled without attempting static analysis. |
ok, by sources you mean the source code that the test file may require/reference? Sounds like it. Yeah, IDK. |
@jamestalmage ava could 'cheat' and use the |
@dcousineau yup, see #577. |
I am closing this, it's just too general. We are tracking specific improvements in the following issues.
|
Update: Blocked by #696
Seems like it is not quite good idea to spawn a fork for every test file. I have try'd fresh version of
ava
in got and tests run time get worse (not to mention, that iTunes is getting icky while tests).May be it is better to fork batched runners and/or have a
concurrency
flag?The text was updated successfully, but these errors were encountered: