test-runner: Run tests isolated by default in headless mode #2831
Hi. I wanted to propose some changes to the wasm test runner. I'm currently working on some internal projects that use the `yew` frontend framework, and I came across some problems with the current state of the test runner. I'm trying to address those with this PR, but please note that I'm only focusing on running frontend tests in a headless browser. I may be missing some points about other forms of testing that this tool offers, as I simply haven't used it in other ways. I tried not to modify the behavior of any other forms of testing. If you think that this proposal could also be expanded to `deno`/`node` tests, I can try to handle this either in the scope of this PR or in a future one.
Problem
Currently, running a binary containing `wasm_bindgen_test`s results in creating a JS script that executes all tests sequentially, one after another, in a headless browser. This provides no isolation of the environment in which the tests run, which means that each test can affect every test executed after it. Consider this simple example:
Assume we are testing some website which has such content:
And some tests (pseudocode warning):
Those test cases look pretty deterministic; however, the test results will not be. By the time a later test runs, we may already have redirected to some other page and rendered `MyComponent` a second time, so depending on the implementation of `somehow_render` it may lead to some hidden complications.
Solution
Of course, being aware of this behavior, we can try to include some form of cleanup in tests. In this (and possibly other) simple cases this may not be a problem: e.g. redirect back, remove inserted elements, and so on. However, restoring the DOM to its original state with all default listeners is not that simple, and restoring JavaScript state seems pretty much impossible. We can have some `Promise`s running in the void, some mutated global state, etc.
With this in mind, I thought that the only possible way to make running such tests deterministic is to run a clean webdriver session for each test case. This seems to be the simplest (and only?) solution that delivers.
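To make the leak concrete, here is a minimal sketch in plain Rust. It does not use wasm-bindgen or a browser: `Page`, `test_redirect`, `test_render`, `run_shared` and `run_isolated` are all hypothetical names that only model "state a test can mutate" and the two ways of running tests against it.

```rust
// Hypothetical stand-in for a browser page. None of these names come from
// wasm-bindgen or yew; they only model state that tests can mutate.
#[derive(Debug, Default)]
struct Page {
    // An empty string stands for the start page.
    location: String,
    mounted: Vec<String>,
}

type TestCase = fn(&mut Page) -> bool;

// Redirects and never navigates back, like a test without cleanup.
fn test_redirect(page: &mut Page) -> bool {
    page.location = "/other".to_string();
    page.location == "/other"
}

// Expects to run on the start page with exactly one mounted component.
fn test_render(page: &mut Page) -> bool {
    page.mounted.push("MyComponent".to_string());
    page.location.is_empty() && page.mounted.len() == 1
}

// Shared environment: every test sees what earlier tests left behind.
fn run_shared(tests: &[TestCase]) -> Vec<bool> {
    let mut page = Page::default();
    tests.iter().map(|t| t(&mut page)).collect()
}

// Isolated environment: each test starts from a fresh page.
fn run_isolated(tests: &[TestCase]) -> Vec<bool> {
    tests.iter().map(|t| t(&mut Page::default())).collect()
}

fn main() {
    let tests: [TestCase; 2] = [test_redirect, test_render];
    println!("shared:   {:?}", run_shared(&tests)); // shared:   [true, false]
    println!("isolated: {:?}", run_isolated(&tests)); // isolated: [true, true]
}
```

In the shared run, `test_render` fails only because `test_redirect` ran first; with a fresh `Page` per test, the order no longer matters.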
Problems of solution :)
Running a new webdriver session is a huge time overhead. When tests are run sequentially, starting the session is often orders of magnitude slower than just executing the test, and we pay for it with each test case. The obvious solution is to execute all tests in parallel. I've implemented this, and it seems to do a good job of shortening the time the tests take to run.
The implementation is probably not the smartest one, as I run not only a webdriver for each test case but also a separate rouille server. This doesn't seem to be a heavy overhead, but it is definitely not the lightest in terms of resources either. I know that this could be handled by spawning the server only once and modifying the requests that we make to it; however, I decided to go with the simplest thing for now. The implementation uses rayon to manage resources, as I didn't want to bother with manual calculations of how many threads to spawn depending on available cores and so on. Rayon handles all of this by itself.
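The scheduling idea can be sketched as follows. This is not the PR's code: to stay dependency-free it uses `std::thread::scope` and `std::thread::available_parallelism` as a stand-in for rayon's thread pool, and `run_in_fresh_session` is a hypothetical placeholder for "start a server and a webdriver session, run one test, tear everything down".

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::thread;
use std::time::Duration;

// Hypothetical placeholder for running one test in its own session.
// The sleep models the session start-up cost; a real runner would talk
// to a webdriver here.
fn run_in_fresh_session(test_name: &str) -> bool {
    thread::sleep(Duration::from_millis(10));
    !test_name.contains("fail")
}

// Runs every test in its own session using at most `workers` threads.
// Results come back in the same order as `tests`.
fn run_all(tests: &[&str], workers: usize) -> Vec<bool> {
    let next = AtomicUsize::new(0);
    let mut results = vec![false; tests.len()];
    thread::scope(|scope| {
        let mut handles = Vec::new();
        for _ in 0..workers.max(1) {
            handles.push(scope.spawn(|| {
                // Each worker repeatedly claims the next unclaimed test index.
                let mut done = Vec::new();
                loop {
                    let i = next.fetch_add(1, Ordering::Relaxed);
                    if i >= tests.len() {
                        break;
                    }
                    done.push((i, run_in_fresh_session(tests[i])));
                }
                done
            }));
        }
        for handle in handles {
            for (i, passed) in handle.join().expect("worker panicked") {
                results[i] = passed;
            }
        }
    });
    results
}

fn main() {
    // Rayon picks a pool size on its own; here it is queried explicitly.
    let workers = thread::available_parallelism().map(|n| n.get()).unwrap_or(1);
    let tests = ["renders", "redirects", "fails_on_purpose"];
    println!("{:?}", run_all(&tests, workers));
}
```

With a worker pool, the per-session start-up cost is paid concurrently rather than once per test in sequence, which is the effect the parallel runner relies on.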
In my opinion, resource usage during testing is not as much of a problem these days as test execution time, so I consider this a reasonable tradeoff.
Environment configuration
In this proposal, headless tests run isolated by default. This can be controlled by the `NO_ISOLATED` flag. I think that this should be the default behavior, as it makes tests deterministic. I've also limited the output printed by `wasm-bindgen-test-runner` by hiding most of its prints that are not directly related to the tests being run behind the debugging flag. I renamed it from `WASM_BINDGEN_NO_DEBUG` to `WASM_BINDGEN_DEBUG`, which means that debug output will be off by default. I think it is reasonable to print only an absolute minimum when everything goes well.
Breaking API
I consider these changes to be API-breaking, so I think that if this is merged we should release a new version of wasm-bindgen with a note about them. They are also worth documenting in the official book or wherever you think they should be; however, I haven't provided anything yet, as I'm not sure what kind of response this will get.
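For anyone documenting the change, the new defaults could be summarized by a sketch like the one below. It assumes both variables are simple "set or unset" switches; the PR may parse their values differently. `RunnerConfig` and `config_from` are hypothetical names used only for illustration.

```rust
// Assumption: both variables act as presence switches. The actual runner
// may inspect their values instead.
#[derive(Debug, PartialEq)]
struct RunnerConfig {
    isolated: bool,
    debug: bool,
}

// `is_set` abstracts the environment lookup so the defaults can be checked
// without touching real process state.
fn config_from(is_set: impl Fn(&str) -> bool) -> RunnerConfig {
    RunnerConfig {
        // Isolation is on unless NO_ISOLATED is present.
        isolated: !is_set("NO_ISOLATED"),
        // Debug output is off unless WASM_BINDGEN_DEBUG is present.
        debug: is_set("WASM_BINDGEN_DEBUG"),
    }
}

fn main() {
    let config = config_from(|name| std::env::var_os(name).is_some());
    println!("{:?}", config);
}
```

Under this reading, an existing setup that sets nothing would switch from verbose, shared-session runs to quiet, isolated ones, which is why the change is worth a release note.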
Testing done
I've tested this PR on my project containing a handful of wasm tests, where I've driven some test cases to either fail or time out, and everything worked smoothly. I tried running the tests of wasm-bindgen-cli itself, and it seems to pass some tests and then hang somewhere unless I pass `NO_ISOLATED=1`. I haven't investigated it yet, but I'll try to take care of this in the near future.