fix(tracer, metrics): use polling instead of fixed wait in e2e tests #654
Description of your changes
This PR changes the way the E2E tests for metrics and tracer perform their checks.
Instead of waiting for a fixed amount of time (and hoping that the expected data will be available once the wait elapses), the tests now actively and periodically poll the corresponding services (CloudWatch for metrics, X-Ray for traces). Once the expected number of result items arrives, the tests perform their validation as usual. If too few (or too many) items arrive, the polling stops after a timeout and the tests fail as intended.
The implementation relies on polling by unique IDs. For metrics, the namespaces already serve as such unique IDs in the form of UUIDs. For traces, instead of introducing additional UUID annotations, it turned out to be easier to use a unique Lambda function name for each test run.
This PR adds a new dev dependency: the promise-retry package (and its corresponding type declaration package).
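A minimal sketch of the polling pattern for traces, assuming AWS SDK v2 (`aws-sdk`) and the `promise-retry` API; the helper name, the X-Ray filter expression, and the retry settings below are illustrative only, not the exact test code:

```ts
// Illustrative sketch only: poll X-Ray until the expected number of trace
// summaries for the uniquely named Lambda function shows up, or give up
// once the retry budget is exhausted (the options below are placeholders).
import promiseRetry from 'promise-retry';
import { XRay } from 'aws-sdk';

const xRay = new XRay();

const getTraceSummaries = (
  functionName: string,
  startTime: Date,
  expectedCount: number
): Promise<XRay.TraceSummaryList> =>
  promiseRetry(async (retry, attempt) => {
    console.log(`Polling X-Ray for ${functionName}, attempt ${attempt}`);

    const result = await xRay
      .getTraceSummaries({
        StartTime: startTime,
        EndTime: new Date(),
        // The unique function name acts as the lookup key for this test run.
        FilterExpression: `service("${functionName}")`,
      })
      .promise();

    const summaries = result.TraceSummaries ?? [];
    if (summaries.length !== expectedCount) {
      // Too few (or too many) items so far: retry until the timeout elapses.
      retry(new Error(`Expected ${expectedCount} trace summaries, got ${summaries.length}`));
    }

    return summaries;
  }, { retries: 10, minTimeout: 5_000, maxTimeout: 10_000 });
```

The metrics tests follow the same pattern, except that they poll CloudWatch (e.g. `listMetrics`) using the UUID namespace as the lookup key.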
How to verify this change
1. Run the E2E tests for tracer and metrics multiple times.
2. Explore the test results.
NOTE: The tests might fail for a different (legitimate) reason, so a red test result doesn't necessarily indicate an issue with this PR. If a test fails, compare the actual stored trace or metric data with the expected values using the AWS CLI or the Management Console. If the values differ, the failure is not caused by the way the tests gather the data and must be investigated separately.
Related issues, RFCs
#644
PR status
Is this ready for review?: YES
Is it a breaking change?: NO
Checklist
Breaking change checklist
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.