[3.2] run nonparallel tests in parallel via separate docker containers #83
When running the nonparallelizable_tests & long_running_tests, run the tests safely in parallel by running each in a separate docker container.
I elected to implement this magic in a JavaScript action, as I wasn't entirely sure I could get everything tracked correctly in a shell script. It's only ~60 lines of JavaScript, so it should be easy enough to study, I hope. There are some hardcoded uglies, in particular the baked-in assumption that the workdir is `/__w/leap/leap`. It's hard to imagine the runner changing that. Also, the population of `np-tests` and `lr-tests` via ctest should probably just be done when the tests are actually run; placing them in their current location only makes sense if we're going to populate a matrix with them, which we aren't. Trying to spawn a VM per test, as a matrix would, was just too overwhelming to the fleet: it'd be 200+ simultaneous VMs per build once we get tests on all platforms, pinned & ARM, clutched in.

But this approach seems good enough to last us until the changes in eosnetworkfoundation/product#36 are ready, in my opinion.
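For readers who want the shape of the idea without opening the action, here is a minimal sketch of running each test in its own container concurrently and collecting the failures. It is not the action's actual code: the function names (`dockerArgs`, `runCommand`, `runAll`), the image name, the `build` subdirectory, and the exact ctest invocation are all assumptions made for illustration.

```javascript
// Illustrative sketch only; names, image and paths are hypothetical.
const { spawn } = require('child_process');

// Build the `docker run` argument list for a single test.
// Assumes the build tree lives in `${workdir}/build` (an assumption).
function dockerArgs(testName, image, workdir) {
  return [
    'run', '--rm',
    '-v', `${workdir}:${workdir}`,   // mount the checkout at the same path
    '-w', `${workdir}/build`,        // run ctest from the build directory
    image,
    'ctest', '--output-on-failure', '-R', `^${testName}$`,
  ];
}

// Run one command and resolve with its exit code.
// Never rejects, so one failing test does not abort the others.
function runCommand(cmd, args) {
  return new Promise((resolve) => {
    const child = spawn(cmd, args, { stdio: 'inherit' });
    child.on('error', () => resolve(1));
    child.on('close', (code) => resolve(code === null ? 1 : code));
  });
}

// Start every test at once, each in its own container,
// and return the names of the tests that exited non-zero.
async function runAll(testNames, { image, workdir, run = runCommand }) {
  const results = await Promise.all(
    testNames.map(async (name) => ({
      name,
      code: await run('docker', dockerArgs(name, image, workdir)),
    }))
  );
  return results.filter((r) => r.code !== 0).map((r) => r.name);
}

// Example use (not executed here):
//   runAll(names, { image: 'builder-image', workdir: '/__w/leap/leap' })
//     .then((failed) => process.exit(failed.length ? 1 : 0));
```

The `run` parameter is injectable so the orchestration can be exercised without docker; a real action would also need to collect each container's logs for upload on failure.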
The ENF fleet has been modified to support a new runner type, `enf-x86-midtier`, which these tests run on. This is currently a 16 vCPU / 24 GB VM. The NP tests generally idle about, so being a bit "over subscribed" will hopefully not be a problem. Of course, we can bump this up if need be. `enf-x86-midtier` is also running on the "give me whatever you got" tier, which seems to be getting rather old Broadwell CPUs.

There is a failing workflow run at https://github.com/AntelopeIO/leap/actions/runs/2965522882 to verify that the logs from a failing test are uploaded correctly.