ci: fix flakiness in update-check in Manage integration check workflow #9787
Codecov Report: All modified and coverable lines are covered by tests ✅
ab634f2 to a7de897
a7de897 to c2a351e
I need to think through this. In particular, what happens if a check had already been created by a previous run (e.g. a failed run such as a flake), and the integration workflow is re-run? In that case I believe we need to create a new check instead of reusing the previous one. Is there a mergify-experiment PR on which we can test these cases first?
@mhofman As you suggested, that issue is present. I resolved it in a new commit by adding the run attempt number to the external id and filtering on that. I tested this in this repo, https://github.com/frazarshad/agoric-sdk, where I added a 60s sleep to the create check job, forcing the update check job to run first (simulating the situation described in the PR description). I have also tested out repeating the as you can see in these two runs (which are triggered by the 3rd re-run of
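A minimal sketch of the approach described in this comment, assuming the check's `external_id` combines the workflow run id with its attempt number. The helper names (`makeExternalId`, `planCheckRunAction`) and the exact `external_id` format are illustrative assumptions, not the PR's actual code; `id` and `run_attempt` are fields of the `workflow_run` webhook payload.

```javascript
// Hypothetical helper: build an external_id that is unique per run AND attempt,
// so a re-run never matches the check created by an earlier attempt.
function makeExternalId(workflowRun) {
  return `${workflowRun.id}-${workflowRun.run_attempt}`;
}

// Hypothetical helper: decide whether to update an existing check run or create one.
// `checkRuns` is assumed to be the list of check runs already fetched for the commit.
function planCheckRunAction(checkRuns, workflowRun) {
  const externalId = makeExternalId(workflowRun);
  const existing = checkRuns.find((run) => run.external_id === externalId);
  return existing
    ? { action: "update", check_run_id: existing.id, external_id: externalId }
    : { action: "create", external_id: externalId };
}
```

With this shape, the job triggered by `completed` would not depend on an earlier job having already created the check: if no check matches the current attempt, it falls back to creating one.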
This is pretty awesome! This makes me think of another scenario: what happens if the initial attempt takes longer to create the result than the second attempt? Or, for that matter, if we have multiple runs but the first one executes this workflow after the second one? I think we would end up with the synthetic result of the first one taking precedence over the second one. Arguably this is an unrelated issue, so out of scope for this PR.
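To make the scenario concrete, here is a hedged sketch of one possible guard, which is not part of this PR. It assumes each synthetic result records the `run_id` and `run_attempt` it came from, and that a larger `run_id` means a later run; the function name `pickAuthoritativeResult` and the record shape are hypothetical.

```javascript
// Hypothetical guard: choose the result that should win for a commit based on
// run id and attempt number, rather than on whichever result was written last.
function pickAuthoritativeResult(results) {
  return results.reduce((best, candidate) => {
    if (best === null) return candidate;
    if (candidate.run_id !== best.run_id) {
      // Assumption: a larger run id corresponds to a more recent run.
      return candidate.run_id > best.run_id ? candidate : best;
    }
    return candidate.run_attempt > best.run_attempt ? candidate : best;
  }, null);
}
```

Under a plain "last write wins" policy, a slow first attempt reporting after the second would override it; ordering by attempt number avoids depending on arrival order.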
Lgtm with one small tweak
external_id,
output: {
  title: "Integration Test Aggregate Result",
  summary: `Synthetic check capturing the result of the <a href="${context.payload.workflow_run.html_url}">integration-test workflow run</a>`,
Since we're now explicitly taking the attempt number into consideration, let's include that detail in the summary, at least if it's not the 1st attempt. Both here and in the normal creation place.
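One way the suggested tweak could look, as a sketch only: the helper name `buildSummary` and the wording of the attempt suffix are assumptions, while `html_url` and `run_attempt` are fields of the `workflow_run` payload already used in the snippet under review.

```javascript
// Hypothetical helper: build the check summary, mentioning the attempt number
// only when this is not the first attempt.
function buildSummary(workflowRun) {
  const attemptNote =
    workflowRun.run_attempt > 1 ? ` (attempt ${workflowRun.run_attempt})` : "";
  return `Synthetic check capturing the result of the <a href="${workflowRun.html_url}">integration-test workflow run</a>${attemptNote}`;
}
```

Sharing a single helper between the update path and the normal creation path would keep the two summaries consistent.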
c768823 to f6b26a3
refs: #8937
Description
Addresses the flakiness in the `manage-integration-check.yml` workflow. It is caused by a race condition: the workflow is executed 3 times in quick succession and is expected to run in order of execution.
The "Integration tests" workflow triggers `manage-integration-check.yml` 3 times: once each for the `requested`, `in_progress`, and `completed` statuses of the calling workflow.
In cases where the `completed` status might happen at the same time as the other two (such as when the "Integration tests" workflow is skipped), the 3 calls of the `manage-integration-check.yml` workflow might run simultaneously and cause a race condition.
The race condition occurs because the final call of `manage-integration-check.yml` (the one triggered by the `completed` status of "Integration tests") expects a GitHub check run to already exist. Since this might not be the case in a race condition, it fails.
To fix this, two things have been done:
`completed` status
`completed` status and accidentally create a check run with `in_progress` status. Their filter check has been removed entirely so that they consider `completed` check runs as well.
Security Considerations
Scaling Considerations
Documentation Considerations
Testing Considerations
Upgrade Considerations