We are seeing intermittent "fatal error: runtime: out of memory" errors in Travis CI, apparently caused by leaky tests.
These leaks seem tied specifically to the pulumi/eks test surface rather than to the number of tests: pulumi/eks currently has fewer than 15 tests, while pulumi/examples has ~90.
One theory is that this repo uses dynamic providers more heavily than any other repo. Another is that test failures compound: each failure makes further failures more likely, leading to further resource starvation.
Tests run in parallel, currently capped at 20 jobs.
We've started testing in a slimmed-down EC2 VM to mimic the Travis CI runtime, with fewer resources than Travis provides (a t2.medium).
We repro'd the starvation issue on the test VM with no test failures occurring. It seems that just running all tests concurrently is enough to do the machine in. Node processes noticeably shot up to consume most of the CPU; as of now, kswapd0, snapd, and a couple of pulumi-language processes together account for over 150% CPU usage (see pics below for data).
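To put a number on the memory side of this, the per-process view can be condensed into a per-command total. The following is a minimal sketch, not part of the repo: `sum_rss_by_command` is a hypothetical helper name, and it assumes the interesting processes have command names starting with `node` or `pulumi` (on Linux, `ps` truncates `comm` to 15 characters, so `pulumi-language-nodejs` shows up as `pulumi-language`).

```shell
#!/bin/sh
# Hypothetical helper: reads "RSS_KIB COMMAND" pairs on stdin and prints the
# total resident memory per command, in MiB, for commands whose name starts
# with "node" or "pulumi". All other processes are ignored.
sum_rss_by_command() {
  awk '$2 ~ /^(node|pulumi)/ { kib[$2] += $1 }
       END { for (c in kib) printf "%s %d MiB\n", c, kib[c] / 1024 }' | sort
}

# Demo on fixed sample data (values are illustrative, not measurements).
printf '%s\n' \
  '1048576 node' \
  '524288 node' \
  '2048 pulumi-language' \
  '100 bash' | sum_rss_by_command

# Live usage on the test VM (empty "=" headers suppress the header line):
#   ps -e -o rss= -o comm= | sum_rss_by_command
```

Running the live variant every few seconds during a test run would show whether the node totals keep climbing, which is what a leak would look like, or plateau once all 20 jobs are active.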
OTOH, in a separate Travis run I've set TESTPARALLELISM=3 instead of the current default of 20, and that is humming along so far with no failures, but at this pace it will inevitably hit Travis's 2-hour limit on test runs.
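Between 3 and 20 there may be a setting that fits in memory and still finishes in time. As a rough sketch only: the helper below derives a job count from available memory and an assumed per-job memory budget. `pick_parallelism` is a hypothetical name, the 512 MiB per-job figure is a placeholder rather than a measurement, and the last comment assumes the Makefile picks up `TESTPARALLELISM` from the environment.

```shell
#!/bin/sh
# Hypothetical helper: choose a parallelism level that fits in memory.
#   $1 = available memory in KiB
#   $2 = assumed memory budget per test job in KiB (a guess, not measured)
#   $3 = upper bound on jobs (the current default is 20)
pick_parallelism() {
  n=$(( $1 / $2 ))
  if [ "$n" -lt 1 ]; then n=1; fi        # always run at least one job
  if [ "$n" -gt "$3" ]; then n=$3; fi    # never exceed the configured cap
  echo "$n"
}

# 3 GiB available, 512 MiB per job, cap of 20.
pick_parallelism 3145728 524288 20   # prints 6

# Possible live usage on Linux (MemAvailable is reported in KiB):
#   avail=$(awk '/^MemAvailable:/ { print $2 }' /proc/meminfo)
#   TESTPARALLELISM=$(pick_parallelism "$avail" 524288 20) make test_all
```

The per-job budget is the number that actually needs measuring; until then this only makes the trade-off between memory pressure and the 2-hour limit explicit.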
Problem description
Test VM: t2.medium instance in us-west-2
Errors & Logs
Output of /var/log/kern.log: kern.log
Output of ps aux | grep node && ps aux | grep pulumi after repro: (screenshot)
top after repro: (screenshot)
Reproducing the issue
Run make test_all in the nodejs/eks directory.
Related Issues