Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix part of #973: Introduce coroutine executor service #1764

Merged
merged 15 commits into from
Sep 3, 2020

Conversation

BenHenning
Copy link
Member

@BenHenning BenHenning commented Sep 2, 2020

Fix part of #973 (peripherally since this will help stability for all UI-facing tests that rely on Glide--see #1630).

This introduces a new Java ExecutorService that's compatible with coroutines such that it will respect Oppia's test coroutine dispatchers for thread synchronization and Espresso idling. This is essential for #1630 since it allows coordinating background Glide requests with Oppia's test dispatcher utility to ensure Espresso doesn't try to interact with UI that depends on image loading before the image is actually available.

Long-term, we should avoid relying on network requests in our unit tests. However even if we did that, we should still ensure proper coordination between Glide & the UI layer since even local image loading could have unexpected delays that results in flakes.

Note that some tests are disabled and will be re-enabled in #1763. I suggest reading the KDocs of both CoroutineExecutorService and its test for specific clarity on why these tests are particularly difficult to stabilize. I've already spent multiple weeks trying to get this utility working, and since it's blocking writing tests for an alpha blocker (concept card integration), I want to make sure we get a reasonably working implementation checked in to unblock that work.

Finally, this PR also introduces timeouts for the test dispatcher utility. This will only apply to Robolectric runs of tests (though Espresso already enforces timeouts for idling resources), and it will prevent infinite while loops (such as seen in #1680 and many of my own investigations) and instead will cause fast failure if background tasks don't finish in time (the default timeout is 10 seconds). If a debugger is attached, the timeout will extend to infinity to avoid long debugging sessions from triggering the timeout.

A runUntilIdle was also added to the TestCoroutineDispatcher class since it helps with stabilization in the new executor service test suite. It really shouldn't be used in normal situations since TestCoroutineDispatchers handles idleness properly across all relevant threads for most tests.

coroutines, plus a test to verify that it fundamentally follows one
interpretation of ExecutorService's API.
invokeAny() and provide longer timeouts for tests that are
CPU-sensitive.
for cancellation, and increase timeout times in tests to reduce
flakiness for time-sensitive tests. Some tests are remaining flaky, so
ignoring those.

Re-add maybeWithTimeoutOrNull since it actually was needed.
@BenHenning
Copy link
Member Author

@aggarwalpulkit596 adding you as an optional reviewer.

@BenHenning
Copy link
Member Author

This is also highly related to #1741: I learned during this effort that coroutine scopes need to be carefully managed since one task failure in a scope will cause all other current and future tasks to fail in that scope. This has resulted in some additional complexity throughout the new implementation & in the test dispatcher utility.

@BenHenning BenHenning changed the title Introduce coroutine executor service Fix part of #973: Introduce coroutine executor service Sep 2, 2020
BenHenning and others added 4 commits September 2, 2020 02:04
…t + Landscape - Buttons (#1653)"

This reverts commit 1bb1ffa.
should happen individually for each task during termination). This fixes
a failure observed in StateFragmentLocalTest in #1630.
@BenHenning BenHenning changed the base branch from develop to revert-1653-state-recycler-view-buttons-highfi September 2, 2020 10:03
@BenHenning BenHenning changed the title Fix part of #973: Introduce coroutine executor service Fix part of #973: Introduce coroutine executor service [Blocked: #1768] Sep 2, 2020
Copy link
Contributor

@anandwana001 anandwana001 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks @BenHenning
put 2 points

Test result
Screenshot 2020-09-02 at 18 48 37

@anandwana001 anandwana001 removed their assignment Sep 2, 2020
Copy link
Contributor

@rt4914 rt4914 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks Good. Left some questions.

@rt4914 rt4914 removed their assignment Sep 2, 2020
Base automatically changed from revert-1653-state-recycler-view-buttons-highfi to develop September 2, 2020 19:09
@BenHenning BenHenning changed the title Fix part of #973: Introduce coroutine executor service [Blocked: #1768] Fix part of #973: Introduce coroutine executor service Sep 2, 2020
@BenHenning
Copy link
Member Author

BenHenning commented Sep 2, 2020

@anandwana001 & @rt4914 I addressed/responded your comments, PTAL.

@BenHenning
Copy link
Member Author

Also @anandwana001 thanks for verifying the tests pass for you!

@rt4914
Copy link
Contributor

rt4914 commented Sep 2, 2020

Resolved all comments (including that of @anandwana001 )

@BenHenning
Copy link
Member Author

BenHenning commented Sep 2, 2020

Thanks! @anandwana001 please follow up if you have additional work suggested--happy to follow up after this PR.

@BenHenning BenHenning merged commit 000e91e into develop Sep 3, 2020
@BenHenning BenHenning deleted the introduce-coroutine-executor-service branch September 3, 2020 03:55
BenHenning added a commit that referenced this pull request Sep 3, 2020
)

* Introduce test coroutine dispatchers support in Espresso.

This piggybacks off of the solution introduced in #1276 for Robolectric.
That PR allows Robolectric tests in app module to share dependencies
with production components by creating a test application & telling
Robolectric to use it instead of OppiaApplication via a @config
annotation.

This PR achieves the same thing by using a custom test runner that reads
the same annotation and forces Espresso & instrumentation to use the
test application instead of the default OppiaApplication. This setup
will be easier once #59 is finished since we can specify the application
in per-test manifests that both Robolectric & Espresso will respect.

Note that while this lets the same test coroutine dispatchers work in
both production & test code for Espresso, some additional work was
needed to ensure the coroutines behave correctly. In particular, the
coroutines use a fake system clock in Robolectric that requires explicit
synchronization points in the tests to allow the clock to move forward
& properly coordinate execution between background & main thread
operations. However, in Espresso, since everything is using a real clock
an idling resource is the preferred way to synchronize execution: it
allows the background coroutines to operate in real-time much like they
would in production, and then notify Espresso when completed. The test
dispatchers API is now setup to support both synchronization mechanisms
for both Robolectric & Espresso (the idling resource does nothing on
Robolectric and the force synchronization effectively does nothing on
Espresso).

The first test being demonstrated as now stable is SplashActivityTest
(as part of downstream work in #1397.

* Revert "Fixes #941: Add radar effect in Hints and solution (#1475)"

This reverts commit 41eb10b.

* Stabilize StateFragmentTest such that it passes on both Robolectric and
Espresso.

Note that some issues were found during this: #1612 (#1611 was found a
few weeks ago, but it also affects these tests). To ensure the tests can
still be run, a @runon annotation was added to allow tests to target
specific test platforms. The tests that currently fail on Robolectric
due to #1611 and #1612 are disabled for that platform. The test suite as
a whole has been verified to pass in its current state on both
Robolectric and Espresso (on a Pixel XL).

The aim of this PR is to actually enable critical state fragment tests
in CI, so both StateFragmentTest and StateFragmentLocalTest are being
enabled in GitHub actions.

* Enable StateFragmentTest (Robolectric) & StateFragmentLocalTest for CI.

* Add thorough documentation for new dispatchers.

* Clean up comments & add additional documentation.

* Fix lint errors.

* Fix broken test after changes to FakeSystemClock.

* Fix linter errors.

* Use a custom executor service for Glide requests that coordinates with
Oppia's test dispatchers. Note that this does not actually introduce the
service--that will happen in a new branch.

* Introduce new executor service which allows interop with Kotlin
coroutines, plus a test to verify that it fundamentally follows one
interpretation of ExecutorService's API.

* Fix flaky timeout tests by improving cancellation cooperation for
invokeAny() and provide longer timeouts for tests that are
CPU-sensitive.

* Add documentation & clean up unused code.

* Lint fixes.

* Significantly reorganize invokeAll() to try and make it more cooperative
for cancellation, and increase timeout times in tests to reduce
flakiness for time-sensitive tests. Some tests are remaining flaky, so
ignoring those.

Re-add maybeWithTimeoutOrNull since it actually was needed.

* Lint fixes.

* Post-merge module fixes.

* Post-merge fixes with ratio input & add a TODO to improve speed of the
new coroutine executor service.

* Revert "Fixes part of #40 & #42: Generalisation Highfi Mobile Portrait + Landscape - Buttons (#1653)"

This reverts commit 1bb1ffa.

* Ensure terminated tasks do not interfere with one another (timeouts
should happen individually for each task during termination). This fixes
a failure observed in StateFragmentLocalTest in #1630.

* Ignore failing tests until #1769 is resolved.

* Fix awaitTermination & improve test. Improve stack trace for test
dispatcher timeouts.

* Fix slow & broken tests in Robolectric for StateFragmentLocalTest.

* Add missing deps for StateFragmentLocalTest.

* Address reviewer comments.
prayutsu pushed a commit to prayutsu/oppia-android that referenced this pull request Sep 3, 2020
* Introduce new executor service which allows interop with Kotlin
coroutines, plus a test to verify that it fundamentally follows one
interpretation of ExecutorService's API.

* Fix flaky timeout tests by improving cancellation cooperation for
invokeAny() and provide longer timeouts for tests that are
CPU-sensitive.

* Add documentation & clean up unused code.

* Lint fixes.

* Significantly reorganize invokeAll() to try and make it more cooperative
for cancellation, and increase timeout times in tests to reduce
flakiness for time-sensitive tests. Some tests are remaining flaky, so
ignoring those.

Re-add maybeWithTimeoutOrNull since it actually was needed.

* Lint fixes.

* Revert "Fixes part of oppia#40 & oppia#42: Generalisation Highfi Mobile Portrait + Landscape - Buttons (oppia#1653)"

This reverts commit 1bb1ffa.

* Ensure terminated tasks do not interfere with one another (timeouts
should happen individually for each task during termination). This fixes
a failure observed in StateFragmentLocalTest in oppia#1630.

* Fix awaitTermination & improve test. Improve stack trace for test
dispatcher timeouts.

* Address reviewer comments.
prayutsu pushed a commit to prayutsu/oppia-android that referenced this pull request Sep 3, 2020
…#1764] (oppia#1630)

* Introduce test coroutine dispatchers support in Espresso.

This piggybacks off of the solution introduced in oppia#1276 for Robolectric.
That PR allows Robolectric tests in app module to share dependencies
with production components by creating a test application & telling
Robolectric to use it instead of OppiaApplication via a @config
annotation.

This PR achieves the same thing by using a custom test runner that reads
the same annotation and forces Espresso & instrumentation to use the
test application instead of the default OppiaApplication. This setup
will be easier once oppia#59 is finished since we can specify the application
in per-test manifests that both Robolectric & Espresso will respect.

Note that while this lets the same test coroutine dispatchers work in
both production & test code for Espresso, some additional work was
needed to ensure the coroutines behave correctly. In particular, the
coroutines use a fake system clock in Robolectric that requires explicit
synchronization points in the tests to allow the clock to move forward
& properly coordinate execution between background & main thread
operations. However, in Espresso, since everything is using a real clock
an idling resource is the preferred way to synchronize execution: it
allows the background coroutines to operate in real-time much like they
would in production, and then notify Espresso when completed. The test
dispatchers API is now setup to support both synchronization mechanisms
for both Robolectric & Espresso (the idling resource does nothing on
Robolectric and the force synchronization effectively does nothing on
Espresso).

The first test being demonstrated as now stable is SplashActivityTest
(as part of downstream work in oppia#1397.

* Revert "Fixes oppia#941: Add radar effect in Hints and solution (oppia#1475)"

This reverts commit 41eb10b.

* Stabilize StateFragmentTest such that it passes on both Robolectric and
Espresso.

Note that some issues were found during this: oppia#1612 (oppia#1611 was found a
few weeks ago, but it also affects these tests). To ensure the tests can
still be run, a @runon annotation was added to allow tests to target
specific test platforms. The tests that currently fail on Robolectric
due to oppia#1611 and oppia#1612 are disabled for that platform. The test suite as
a whole has been verified to pass in its current state on both
Robolectric and Espresso (on a Pixel XL).

The aim of this PR is to actually enable critical state fragment tests
in CI, so both StateFragmentTest and StateFragmentLocalTest are being
enabled in GitHub actions.

* Enable StateFragmentTest (Robolectric) & StateFragmentLocalTest for CI.

* Add thorough documentation for new dispatchers.

* Clean up comments & add additional documentation.

* Fix lint errors.

* Fix broken test after changes to FakeSystemClock.

* Fix linter errors.

* Use a custom executor service for Glide requests that coordinates with
Oppia's test dispatchers. Note that this does not actually introduce the
service--that will happen in a new branch.

* Introduce new executor service which allows interop with Kotlin
coroutines, plus a test to verify that it fundamentally follows one
interpretation of ExecutorService's API.

* Fix flaky timeout tests by improving cancellation cooperation for
invokeAny() and provide longer timeouts for tests that are
CPU-sensitive.

* Add documentation & clean up unused code.

* Lint fixes.

* Significantly reorganize invokeAll() to try and make it more cooperative
for cancellation, and increase timeout times in tests to reduce
flakiness for time-sensitive tests. Some tests are remaining flaky, so
ignoring those.

Re-add maybeWithTimeoutOrNull since it actually was needed.

* Lint fixes.

* Post-merge module fixes.

* Post-merge fixes with ratio input & add a TODO to improve speed of the
new coroutine executor service.

* Revert "Fixes part of oppia#40 & oppia#42: Generalisation Highfi Mobile Portrait + Landscape - Buttons (oppia#1653)"

This reverts commit 1bb1ffa.

* Ensure terminated tasks do not interfere with one another (timeouts
should happen individually for each task during termination). This fixes
a failure observed in StateFragmentLocalTest in oppia#1630.

* Ignore failing tests until oppia#1769 is resolved.

* Fix awaitTermination & improve test. Improve stack trace for test
dispatcher timeouts.

* Fix slow & broken tests in Robolectric for StateFragmentLocalTest.

* Add missing deps for StateFragmentLocalTest.

* Address reviewer comments.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants