Better-scheduled parallel tests #18462
Conversation
Couple of notes without looking at the code yet:
Now that you have my old 24-thread machine up and running again, it would be interesting to compare results from it as well.
Also, I don't care about how many tests have passed so far, just how many have failed.
With the message-batching change, the PR is as fast (or maybe a little faster) than the current
Seems ok to me. My only comment is that we are stuffing a lot into runner.ts; it might make sense to split the worker and orchestrator pieces into separate files so they're more easily maintained.
@rbuckton done.
On my laptop, the time went up from 2m 41s to 3m 10s. My laptop has 8 cores like my desktop, but only 8GB of memory while my desktop has 16GB.
@andy-ms How many workers does your machine use? It is possible that the batch ratio is a bit suboptimal for a given thread count (i.e., pre-batching too many or too few tests, resulting in idle threads).
On both desktop and laptop it says it's using 8 threads / groups. |
Hmmm... I'll make the batch parameter tuneable. The worst case (pre-batching all tests) should be approximately as fast as the old parallel runner (since there will be many idle threads); right now it is hardcoded to batch 90% of the available file size into the first messages. Of course, since file size isn't a perfect approximation of test time, this isn't perfect... I have some ideas on how to improve it a bit more, though.
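The size-weighted pre-batching idea can be sketched roughly like this (illustrative TypeScript; the `Task` shape, the `partition` name, and the 0.9 ratio are assumptions for the sketch, not the actual runner.ts code):

```typescript
interface Task {
    file: string;
    size: number; // file size in bytes, used as a proxy for test duration
}

// Split tasks (sorted largest-first) into an up-front batch covering
// `batchRatio` of the total file size; the remainder is handed out one
// task at a time as workers go idle.
function partition(tasks: Task[], batchRatio: number): { batched: Task[]; queue: Task[] } {
    const sorted = [...tasks].sort((a, b) => b.size - a.size);
    const total = sorted.reduce((sum, t) => sum + t.size, 0);
    const batched: Task[] = [];
    const queue: Task[] = [];
    let accumulated = 0;
    for (const task of sorted) {
        if (accumulated < total * batchRatio) {
            batched.push(task);
            accumulated += task.size;
        } else {
            queue.push(task);
        }
    }
    return { batched, queue };
}
```

With `batchRatio` at 1.0 every test is assigned up front (the old behavior); lowering it leaves more work in the shared queue for late-run load balancing.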
@rbuckton mentioned that if he were to rewrite the parallel test runner, he would make it use a work-stealing system rather than partitioning work in advance, to help improve thread utilization near the end of a test run. This is effectively that.
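Why a central queue beats up-front partitioning can be shown with a small scheduling simulation (an illustrative sketch, not the PR's code; the durations and function names are made up):

```typescript
// Finish time when tasks are pre-partitioned round-robin across workers:
// a worker that happens to receive the long tasks becomes the bottleneck.
function roundRobinMakespan(durations: number[], workers: number): number {
    const load = new Array(workers).fill(0);
    durations.forEach((d, i) => { load[i % workers] += d; });
    return Math.max(...load);
}

// Finish time with a central queue sorted largest-first, where each task
// goes to whichever worker frees up next (least accumulated work).
function centralQueueMakespan(durations: number[], workers: number): number {
    const sorted = [...durations].sort((a, b) => b - a);
    const load = new Array(workers).fill(0);
    for (const d of sorted) {
        const idx = load.indexOf(Math.min(...load));
        load[idx] += d;
    }
    return Math.max(...load);
}
```

With a skewed mix like `[10, 1, 1, 1, 10, 1, 1, 1]` on 2 workers, round-robin finishes in 22 time units while the central queue finishes in 13, because the two long tasks land on different workers and the short tasks fill the gaps.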
This pulls the important parts of our parallel test script into the test runner itself, and then reworks the parallel runner to use a central queue for distributing work to the worker threads, while also sorting the work by file size (approximating how long each test will take to execute; the largest tests execute first). The improved utilization and load balancing take the end-to-end time of
gulp runtests-parallel --tests=rwc
(now that it is functional) down from 13 minutes to 8 minutes on my workstation. I can only assume the effect is even more pronounced with more cores at the test runner's disposal.

The primary downside to this system is that the tests aren't actually run under mocha, but rather under a shim which polyfills the behaviors we use. This lets it immediately execute the tests on discovery (since we run one file at a time, on demand), but the downside is that timeouts aren't currently supported in my shim (so a malfunctioning test can simply hang).
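A minimal version of such a shim might look like this (a sketch of the eager-execution idea only; the names and the `TestResult` shape are hypothetical, not the PR's actual shim):

```typescript
interface TestResult {
    name: string;
    passed: boolean;
    error?: unknown;
}

const results: TestResult[] = [];

// `describe` just runs its body immediately, so nested `it` calls
// execute as soon as the test file is loaded -- there is no separate
// discovery pass followed by scheduled execution as in real mocha.
function describe(name: string, body: () => void): void {
    body();
}

// `it` executes the test synchronously and records the outcome.
// Note the lack of timeout handling: a test that never returns
// will hang the worker, matching the limitation described above.
function it(name: string, body: () => void): void {
    try {
        body();
        results.push({ name, passed: true });
    } catch (error) {
        results.push({ name, passed: false, error });
    }
}
```

The trade-off is simplicity: eager execution keeps the one-file-at-a-time worker loop trivial, at the cost of mocha features like timeouts, retries, and async scheduling.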