Github Actions Limitations #4876

ariskotsomitopoulos · 2022-01-07T10:37:05Z

Following up on the "Trying to fix integration tests" PR: after the fixes, the tests run most of the time and the results are published manually as a PR comment. Splitting the tests into smaller chunks also helped.

Running an emulator on a Linux GitHub Actions runner has a lot of problems and limitations (a macOS runner is much better):
ReactiveCircus/android-emulator-runner#62

This is a solution, but not a stable one. It would help if we could get more powerful runner hardware.
Alternatively, we could use another CI/CD tool such as Jenkins specifically for the integration tests, so the flow would be:
GitHub Actions triggers Jenkins -> Jenkins runs the integration tests and posts the results back to GitHub

ouchadam · 2022-01-13T17:13:20Z

To add some more ideas from previous discussions:

- Run the device tests on Firebase Test Lab
- Create abstractions to avoid needing the Android runtime

michaelkaye · 2022-02-09T15:11:53Z

We are using the macos environment for the element-ios builds so just turning that on (with the hardware acceleration that comes with it) seems reasonable; it's more expensive but if it works we just need to pay for it.

How was it not working? Unreliable tests, timeouts, or similar? Do we have an example of a GHA-based test that is failing at the moment? A lot seem to be green in the Actions tab.

We're also using Buildkite as our main non-GHA CI tool. We could see whether running on its Linux instances is better than the GHA ones, but (I believe) they still won't have hardware acceleration available, so they will likely still run slowly. We might, however, be able to run the tests for an extended period.

Last option is to use a local farm (or single machine) of real machines with hardware acceleration available to them to run these actions on as custom runners, but that's a bit of an investment.

ouchadam · 2022-02-09T15:40:54Z

Mainly unreliable: https://github.com/vector-im/element-android/actions/workflows/sanity_test.yml. These same tests pass consistently locally without issue.

we're also using the osx runner for the nightly UI test suite, the android emulator is notoriously picky when running headless without a gpu

my recommendation would be to avoid using VMs altogether and use a dedicated service like Firebase Test Lab, but it would require an externally accessible synapse instance

michaelkaye · 2022-02-09T16:38:42Z

Yeah; i was going to see how easy it would be to move the synapse outside of the build process first, in case the synapse itself is causing some overheads. If that works then moving further onto firebase or elsewhere would be fairly easy.

ariskotsomitopoulos · 2022-02-09T21:40:41Z

The main problem is something like this:


> Task :app:validateSigningDevDebugAndroidTest UP-TO-DATE
> Task :app:packageDevDebugAndroidTest UP-TO-DATE
[PropertyFetcher]: ShellCommandUnresponsiveException getting properties for device emulator-5554: null
[PropertyFetcher]: ShellCommandUnresponsiveException getting properties for device emulator-5554: null

> Task :app:connectedDevDebugAndroidTest
Skipping device 'test(AVD)' for 'app:DEV': Unknown API Level

DEV > : No compatible devices connected.[TestRunner] FAILED 
Found 1 connected device(s), 0 of which were compatible.

This is mainly caused by missing hardware acceleration. I believe the macOS runners used for iOS will work much better. Can you verify whether we can also use macOS runners for the Android builds? If that's not possible, maybe we should look at your other suggested solutions.
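To make this kind of failure easier to tell apart from a genuine test failure, the CI output could be scanned for the messages in the log above. This is only a rough sketch: the function name `emulator_failure_lines` is hypothetical, the patterns are copied from the excerpt, and treating them as "emulator problems" is an assumption rather than something the Gradle plugin guarantees.

```python
import re

# Signatures copied from the log excerpt above. Classifying them as emulator
# problems (rather than test failures) is an assumption of this sketch.
EMULATOR_FAILURE_PATTERNS = [
    re.compile(r"ShellCommandUnresponsiveException"),
    re.compile(r"Unknown API Level"),
    re.compile(r"No compatible devices connected"),
    re.compile(r"\d+ connected device\(s\), 0 of which were compatible"),
]


def emulator_failure_lines(log_text):
    """Return the log lines that match a known emulator-failure signature."""
    return [
        line.strip()
        for line in log_text.splitlines()
        if any(pattern.search(line) for pattern in EMULATOR_FAILURE_PATTERNS)
    ]
```

A run whose output yields a non-empty list could then be retried or reported as an infrastructure problem instead of being counted as a failing test.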

michaelkaye · 2022-02-11T09:26:35Z

https://github.com/vector-im/element-android/pull/5193/files

So i tried this; which was to take the settings for the integration test (which seems to reliably start the emulator) and move them across to the sanity test (i also forced the sanity test to run each time we push to my branch, so don't merge).

It seems to be OK, other than some flaky tests, I haven't seen an emulator start error here. Perhaps the problem was the android level 29 or the exact version of the pixel etc - is that something we explicitly wanted to test with or is it independent?

ariskotsomitopoulos · 2022-02-11T10:08:21Z

Thanks for your update Michael, nice changes! The main issue is that the errors are not even consistent, so we can't rely on the results. For example, in your branch here there are 3 failures (I guess that is after your changes). With the previous settings, the emulator error happens to me in about 20% of runs.

The Android API level is not that important. I tried a lot of different settings, API levels and emulator builds, and settled on the current settings because they produced the fewest errors. But I am still not sure GHA is made for this kind of run; maybe using macOS runners with hardware acceleration will help.

Maybe we can apply your changes and see whether our everyday builds improve.

michaelkaye · 2022-02-11T11:23:17Z

Yeah, this fixes the "it reliably starts an emulator and runs" - we need to do more changes to make it actually fail the build on a failure (for instance the integration tests also don't fail the build).

I'll tidy the branch up into a real PR and offer it for review
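On failing the build when a test fails, one possible approach is a small post-test step that reads the test reports and returns a non-zero exit code. The sketch below assumes the reports are JUnit-style XML with `failures` and `errors` attributes on each `<testsuite>` element; the helper names `count_failures` and `exit_code_for` are hypothetical, and where the reports live in this project is not covered here.

```python
import xml.etree.ElementTree as ET


def count_failures(junit_xml):
    """Sum the `failures` and `errors` attributes of every <testsuite>.

    Element.iter() also yields the root element, so this works whether the
    report root is <testsuite> or a wrapping <testsuites>.
    """
    root = ET.fromstring(junit_xml)
    return sum(
        int(suite.get("failures", 0)) + int(suite.get("errors", 0))
        for suite in root.iter("testsuite")
    )


def exit_code_for(reports):
    """Return 1 if any report contains a failure or error, otherwise 0."""
    return 1 if any(count_failures(report) > 0 for report in reports) else 0
```

The CI step would read each report file, pass the contents to `exit_code_for`, and hand the result to `sys.exit` so the job is marked as failed.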

michaelkaye · 2022-02-11T11:25:30Z

https://github.com/vector-im/element-android/actions/workflows/sanity_test.yml is what i was actually trying to get working, btw.

I think the integration tests might fail because the synapse that demo.sh starts does not have the additional configuration to enable the threading logic on the server side, but there are tests that use threading, and those are failing.

michaelkaye · 2022-02-14T17:32:13Z

Received this in a sanity test run:

/bin/sh -c adb root
adb: unable to connect for root: closed
Error: The process '/bin/sh' failed with exit code 1

Adding a loop around adb root to see if it just needs retrying.
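A retry loop of this kind could look roughly like the sketch below. The helper name `run_with_retry`, the attempt count and the delay are all assumptions chosen for illustration, not the values used in the workflow; the `runner` parameter exists only so the loop can be exercised without a real device.

```python
import subprocess
import time


def run_with_retry(command, attempts=5, delay_seconds=2.0, runner=subprocess.run):
    """Run `command` until it exits with code 0 or the attempts run out.

    Returns the 1-based number of the attempt that succeeded and raises
    RuntimeError if every attempt fails.
    """
    for attempt in range(1, attempts + 1):
        result = runner(command)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Give the emulator a moment before asking again.
            time.sleep(delay_seconds)
    raise RuntimeError(f"{command!r} still failing after {attempts} attempts")
```

With adb on the PATH and an emulator booting, the call would be `run_with_retry(["adb", "root"])`. Whether retrying actually helps with the `unable to connect for root: closed` error is exactly what still needs to be observed.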

michaelkaye · 2022-02-15T15:28:23Z

So there are various manifestations of the unreliability on the runners recently:

- [macos] The emulator starts (adb set-property completes) but then adb root fails a little later. Occurred in 1 of 9 attempts; last seen 2022-02-14. Haven't yet seen whether the adb root loop helps.
- [ubuntu] The emulator starts, including adb root, but halfway through a test the emulator just dies. Occurs irregularly. Could be the same symptom as the macos failure above, but hasn't been seen on macos so far.

michaelkaye · 2022-03-01T11:49:49Z

The emulator dying is possibly due to a C++-level failure in (e.g.) the Realm code, which causes a signal 9 and makes the emulator stop responding mid-run. This has only become apparent now that the logcat logs are visible.

So it's possible that a bunch of the errors that we thought were the emulator failing mid-run, are actually the tests doing the right thing and highlighting a real code failure.

ariskotsomitopoulos · 2022-03-02T11:57:33Z

hmm interesting, I wonder why this is not happening locally

manuroe added the X-DevOps Issues that require some infrastructure support label Jan 7, 2022

ouchadam added the T-Task Refactoring, enabling or disabling functionality, other engineering tasks label Jan 13, 2022

michaelkaye self-assigned this Feb 9, 2022

michaelkaye removed their assignment Dec 4, 2023
