[HOLD for payment 2024-06-06] [$250] preDeploy.yml should not skip on default branch if another run is queued #41936

Closed
blimpich opened this issue May 9, 2024 · 32 comments
Assignees
Labels
Awaiting Payment Auto-added when associated PR is deployed to production Bug Something is broken. Auto assigns a BugZero manager. Daily KSv2 External Added to denote the issue can be worked on by a contributor

Comments

@blimpich
Contributor

blimpich commented May 9, 2024

If you haven’t already, check out our contributing guidelines for onboarding and email contributors@expensify.com to request to join our Slack channel!


Issue reported by: @blimpich
Slack conversation: https://expensify.slack.com/archives/C01GTK53T8Q/p1715015588472179?thread_ts=1713554811.796439&cid=C01GTK53T8Q
cc: @rayane-djouah

Action Performed:

  1. merge a PR into main
  2. immediately merge another PR into main

Expected Result:

Both runs should run to completion

Actual Result:

The first run is skipped/cancelled.

Screenshots/Videos

Example of skipped job

Go to https://github.com/Expensify/App/actions/workflows/preDeploy.yml and you will find many examples of skipped jobs by just looking through the recent merges.

Further context:

This was caused by this PR. We don't want to revert it, because it saves resources on pull request builds; we just don't want it to apply when merging into main.

Upwork Automation - Do Not Edit
  • Upwork Job URL: https://www.upwork.com/jobs/~013f9368ae63177d49
  • Upwork Job ID: 1788618186197295104
  • Last Price Increase: 2024-05-09
  • Automatic offers:
    • rayane-djouah | Reviewer | 0
    • badeggg | Contributor | 0
@blimpich blimpich added External Added to denote the issue can be worked on by a contributor Daily KSv2 Bug Something is broken. Auto assigns a BugZero manager. labels May 9, 2024
@blimpich blimpich self-assigned this May 9, 2024
@melvin-bot melvin-bot bot added the Help Wanted Apply this label when an issue is open to proposals by contributors label May 9, 2024

melvin-bot bot commented May 9, 2024

Triggered auto assignment to Contributor-plus team member for initial proposal review - @rayane-djouah (External)


melvin-bot bot commented May 9, 2024

Triggered auto assignment to @bfitzexpensify (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details. Please add this bug to a GH project, as outlined in the SO.

@blimpich blimpich added External Added to denote the issue can be worked on by a contributor and removed External Added to denote the issue can be worked on by a contributor labels May 9, 2024
@melvin-bot melvin-bot bot changed the title preDeploy.yml should not skip on default branch if another run is queued [$250] preDeploy.yml should not skip on default branch if another run is queued May 9, 2024

melvin-bot bot commented May 9, 2024

Job added to Upwork: https://www.upwork.com/jobs/~013f9368ae63177d49


melvin-bot bot commented May 9, 2024

Current assignee @rayane-djouah is eligible for the External assigner, not assigning anyone new.

@amgenene

amgenene commented May 9, 2024

Proposal:

Please re-state the problem that we are trying to solve in this issue.

When multiple pull requests are merged into the main branch in quick succession, the workflow runs triggered by the earlier merges are cancelled because of the cancel-in-progress: true flag in the workflow concurrency configuration.

What is the root cause of that problem?

The root cause is the cancel-in-progress: true setting, which cancels any currently running workflow or job in the same concurrency group when a new workflow run is triggered. This behavior is desirable for pull requests, where we want to save resources and avoid running redundant workflows or jobs when a new commit is pushed. For merges into the main branch, however, we want every workflow run to complete without being cancelled, even if another merge happens shortly after.

What changes do you think we should make in order to solve the problem?

We can introduce conditional logic in the workflow files so that cancel-in-progress: true applies only to pull requests and not to merges into the main branch.
For the preDeploy.yml workflow file (or any other workflow file that handles merges to the main branch), we can set cancel-in-progress to false.

What alternative solutions did you explore? (Optional)

An alternative would be to introduce a delay or a manual trigger for workflow runs triggered by merges to the main branch. This would prevent the second workflow run from cancelling the first one, but it might add complexity or manual intervention to the deployment process.
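
As a rough sketch of what this proposal would amount to, the fragment below turns cancellation off for a workflow that runs on pushes to main. The trigger block and the group expression are assumptions made for illustration; they are not copied from the repository.

```yaml
# Hypothetical excerpt of a workflow that runs on pushes to main.
on:
  push:
    branches: [main]

concurrency:
  # Assumed group name: one group per workflow and ref.
  group: ${{ github.workflow }}-${{ github.ref }}
  # Do not cancel a run that is already in progress.
  cancel-in-progress: false
```

With a shared group like this one, the setting only protects the run that is already in progress. As later comments in this thread point out, a pending run in the same group can still be cancelled when a newer run is queued.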


melvin-bot bot commented May 9, 2024

📣 @amgenene! 📣
Hey, it seems we don’t have your contributor details yet! You'll only have to do this once, and this is how we'll hire you on Upwork.
Please follow these steps:

  1. Make sure you've read and understood the contributing guidelines.
  2. Get the email address used to login to your Expensify account. If you don't already have an Expensify account, create one here. If you have multiple accounts (e.g. one for testing), please use your main account email.
  3. Get the link to your Upwork profile. It's necessary because we only pay via Upwork. You can access it by logging in, and then clicking on your name. It'll look like this. If you don't already have an account, sign up for one here.
  4. Copy the format below and paste it in a comment on this issue. Replace the placeholder text with your actual details.
    Format:
Contributor details
Your Expensify account email: <REPLACE EMAIL HERE>
Upwork Profile Link: <REPLACE LINK HERE>

@amgenene

amgenene commented May 9, 2024

Contributor details
Your Expensify account email: alazar.genene@gmail.com
Upwork Profile Link: https://www.upwork.com/freelancers/~010c7773782b08158e


melvin-bot bot commented May 9, 2024

✅ Contributor details stored successfully. Thank you for contributing to Expensify!

@ShridharGoel
Contributor

Proposal

Please re-state the problem that we are trying to solve in this issue.

preDeploy.yml should not skip on default branch if another run is queued.

What is the root cause of that problem?

As of now, cancel-in-progress is true for all cases, so even workflows for merges to main get cancelled if a new merge happens.

Existing code:

# Workflow using the -lint group
concurrency:
  group: "${{ github.ref }}-lint"
  cancel-in-progress: true

# Workflow using the -jest group
concurrency:
  group: "${{ github.ref }}-jest"
  cancel-in-progress: true

We don't want to cancel the older runs for main.

What changes do you think we should make in order to solve the problem?

We should add a condition to make cancellation happen only for non-main branches.

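# Workflow using the -lint group: cancel in-progress runs only off main
concurrency:
  group: "${{ github.ref }}-lint"
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}

# Workflow using the -jest group: cancel in-progress runs only off main
concurrency:
  group: "${{ github.ref }}-jest"
  cancel-in-progress: ${{ github.ref != 'refs/heads/main' }}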

If the branch is not main, older runs are cancelled; otherwise they are not.

@rayane-djouah
Contributor

rayane-djouah commented May 11, 2024

@ShridharGoel's proposal looks good to me.

🎀👀🎀 C+ reviewed

Edit: based on the GitHub docs, the proposal will not work correctly.

Looking for better proposals.


melvin-bot bot commented May 11, 2024

Current assignee @blimpich is eligible for the choreEngineerContributorManagement assigner, not assigning anyone new.

@badeggg
Contributor

badeggg commented May 11, 2024

@rayane-djouah @blimpich, I think @ShridharGoel's proposal is not correct, and I am writing a proposal of my own.

Setting cancel-in-progress: false does not keep all concurrent workflows. cancel-in-progress only controls whether running workflows are cancelled; a pending (queued) workflow in the same group is always cancelled when a newer one is queued.

Based on the GitHub docs here:

When a concurrent job or workflow is queued, if another job or workflow using the same concurrency group in the repository is in progress, the queued job or workflow will be pending. Any pending job or workflow in the concurrency group will be canceled. This means that there can be at most one running and one pending job in a concurrency group at any time.

To also cancel any currently running job or workflow in the same concurrency group, specify cancel-in-progress: true. To conditionally cancel currently running jobs or workflows in the same concurrency group, you can specify cancel-in-progress as an expression with any of the allowed expression contexts.
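
To make the quoted behavior concrete, here is a minimal sketch of what it implies for three pushes in quick succession. The group name `main-lint` is made up for illustration, and the timeline in the comments is my reading of the documentation quoted above, not output from a real run.

```yaml
# Illustrative only: "main-lint" is a hypothetical group name.
concurrency:
  group: main-lint
  cancel-in-progress: false

# Timeline for three runs queued one after another in this group:
#   run 1: starts and keeps running (cancel-in-progress is false)
#   run 2: queued while run 1 is in progress, so it becomes pending
#   run 3: queued while run 2 is still pending; run 2 is cancelled
#          and run 3 takes its place as the single pending run
# Outcome: runs 1 and 3 execute, run 2 never does.
```

This is why switching the flag to false on main would not be enough on its own: at most one running and one pending run can exist per group.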

@rayane-djouah
Contributor

Yeah, that makes sense @badeggg.
Looking for better proposals!

@badeggg
Contributor

badeggg commented May 11, 2024

Proposal

Please re-state the problem that we are trying to solve in this issue.

Workflow runs on the main branch should not be skipped.

What is the root cause of that problem?

We have several concurrency configs like this, which skip concurrent workflows. Two of them matter for this issue.

What changes do you think we should make in order to solve the problem?

We should change this to:

concurrency:
  group: ${{ github.ref == 'refs/heads/main' && format('{0}-{1}', github.ref, github.sha) || github.ref }}-lint
  cancel-in-progress: true

A similar change needs to be applied here.

What alternative solutions did you explore? (Optional)

N/A

@badeggg
Contributor

badeggg commented May 11, 2024

A short explanation of my proposal:

  • Check whether the workflow is running on the main branch with github.ref == 'refs/heads/main'.
  • If it is not, the group is based on github.ref (the fully-formed ref of the branch or tag).
  • If it is the main branch, we form a unique group name for every workflow run by combining github.ref and github.sha. This guarantees that every main-branch workflow run has a unique group name, so nothing is skipped.
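
As a sketch of how the proposed expression would evaluate, the comments below show the group name it should produce in each case. The commit SHA and the pull request number are placeholders, not values from this repository.

```yaml
concurrency:
  group: ${{ github.ref == 'refs/heads/main' && format('{0}-{1}', github.ref, github.sha) || github.ref }}-lint
  cancel-in-progress: true

# Expected group names (placeholder values):
#   push to main at commit <sha>  ->  refs/heads/main-<sha>-lint
#   pull request number <n>       ->  refs/pull/<n>/merge-lint
#
# Each commit on main gets its own group, so no run there shares a
# group with another. All runs for one pull request still share a
# group, so a new push to the PR cancels the older run as before.
```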

@badeggg
Contributor

badeggg commented May 11, 2024

I have updated my proposal and explanation:

  • GitHub workflow YAML does not support the condition ? 'A' : 'B' syntax; use condition && 'A' || 'B' instead (doc here).
  • GitHub workflow YAML does not support the + operator; use the format function.
  • Use github.ref and github.sha to form the unique group name; I think this makes more sense.
  • Remove the surrounding " on the group name to align with the other concurrency config (though this is negligible).
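
The two expression idioms mentioned above can be shown in isolation. This is a minimal sketch; the variable names `DEPLOY_ENV` and `RUN_LABEL` and the values 'production' and 'staging' are hypothetical and exist only to show the syntax.

```yaml
env:
  # condition && 'A' || 'B' acts like a ternary as long as 'A' is truthy.
  # If 'A' were an empty string, the expression would fall through to 'B'.
  DEPLOY_ENV: ${{ github.ref == 'refs/heads/main' && 'production' || 'staging' }}

  # format() replaces {0}, {1}, ... with the arguments that follow,
  # which stands in for string concatenation with +.
  RUN_LABEL: ${{ format('{0}-{1}', github.ref, github.sha) }}
```

In the proposal, the "A" branch is the output of format(), which is never empty for a push to main, so the ternary idiom should be safe there.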

@rayane-djouah
Contributor

@badeggg's proposal looks good to me.

🎀👀🎀 C+ reviewed


melvin-bot bot commented May 12, 2024

Current assignee @blimpich is eligible for the choreEngineerContributorManagement assigner, not assigning anyone new.

@melvin-bot melvin-bot bot removed the Help Wanted Apply this label when an issue is open to proposals by contributors label May 13, 2024
@badeggg
Contributor

badeggg commented May 14, 2024

I have opened a PR.

@blimpich
Contributor Author

@rayane-djouah @badeggg #42134 just got merged. Since this can only really be tested in production, can you please test that it works by confirming that (A) PR builds aren't affected by this change and (B) builds that would have been skipped without this change are not being skipped?

@badeggg
Contributor

badeggg commented May 28, 2024

Looks fine.

Evidence for (A) PR builds aren't affected by this change:

lint:

run 1
run 2
to view more

typecheck:

run 1
to view more

test:

run 1
to view more

e2ePerformanceTests

The trigger conditions for this workflow are rarely met: it is triggered manually or called from another workflow. Only ./workflows/preDeploy.yml calls it, and the trigger condition for ./workflows/preDeploy.yml is 'push to main branch'. This makes it hard to capture evidence for e2ePerformanceTests.
I may have come across one case.
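
For readers unfamiliar with this setup, the arrangement described above could look roughly like the sketch below. It shows trigger blocks only and is not copied from the repository; the job id is hypothetical, and the `.github/` prefix in `uses:` is the path form GitHub expects for a reusable workflow in the same repository.

```yaml
# e2ePerformanceTests.yml (sketch): no push or pull_request trigger.
on:
  workflow_dispatch:   # manual trigger
  workflow_call:       # callable from another workflow
---
# preDeploy.yml (sketch): runs on pushes to main and calls the workflow above.
on:
  push:
    branches: [main]

jobs:
  e2ePerformanceTests:   # hypothetical job id
    uses: ./.github/workflows/e2ePerformanceTests.yml
```

Because the called workflow only ever runs as part of a push to main or a manual dispatch, it would rarely appear in pull request runs, which matches the difficulty described above.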

Evidence for (B) builds that would have been skipped without this change are not being skipped:

Those three commits were pushed at almost the same time (less than 1 second apart), and no typecheck/lint/test/e2ePerformanceTests job was skipped.
commit e430c2d --> workflow
commit cabf71d --> workflow
commit 3599850 --> workflow


@blimpich
Contributor Author

Fantastic! Thank you @badeggg for the QA. Very happy with this 🥳

@melvin-bot melvin-bot bot added Weekly KSv2 Awaiting Payment Auto-added when associated PR is deployed to production and removed Weekly KSv2 labels May 30, 2024
@melvin-bot melvin-bot bot changed the title [$250] preDeploy.yml should not skip on default branch if another run is queued [HOLD for payment 2024-06-06] [$250] preDeploy.yml should not skip on default branch if another run is queued May 30, 2024
@melvin-bot melvin-bot bot removed the Reviewing Has a PR in review label May 30, 2024

melvin-bot bot commented May 30, 2024

Reviewing label has been removed, please complete the "BugZero Checklist".


melvin-bot bot commented May 30, 2024

The solution for this issue has been 🚀 deployed to production 🚀 in version 1.4.77-11 and is now subject to a 7-day regression period 📆. Here is the list of pull requests that resolve this issue:

If no regressions arise, payment will be issued on 2024-06-06. 🎊

For reference, here are some details about the assignees on this issue:


melvin-bot bot commented May 30, 2024

BugZero Checklist: The PR fixing this issue has been merged! The following checklist (instructions) will need to be completed before the issue can be closed:

  • [@blimpich] The PR that introduced the bug has been identified. Link to the PR:
  • [@blimpich] The offending PR has been commented on, pointing out the bug it caused and why, so the author and reviewers can learn from the mistake. Link to comment:
  • [@blimpich] A discussion in #expensify-bugs has been started about whether any other steps should be taken (e.g. updating the PR review checklist) in order to catch this type of bug sooner. Link to discussion:
  • [@badeggg / @rayane-djouah] Determine if we should create a regression test for this bug.
  • [@badeggg / @rayane-djouah] If we decide to create a regression test for the bug, please propose the regression test steps to ensure the same bug will not reach production again.
  • [@bfitzexpensify] Link the GH issue for creating/updating the regression test once above steps have been agreed upon:

@bfitzexpensify bfitzexpensify removed their assignment May 30, 2024
@bfitzexpensify bfitzexpensify added Bug Something is broken. Auto assigns a BugZero manager. and removed Bug Something is broken. Auto assigns a BugZero manager. labels May 30, 2024

melvin-bot bot commented May 30, 2024

Triggered auto assignment to @garrettmknight (Bug), see https://stackoverflow.com/c/expensify/questions/14418 for more details. Please add this bug to a GH project, as outlined in the SO.

@melvin-bot melvin-bot bot added Daily KSv2 and removed Weekly KSv2 labels May 30, 2024
@bfitzexpensify bfitzexpensify self-assigned this May 30, 2024
@bfitzexpensify
Contributor

Adding a BZ buddy for payment/regression test update - I will be OOO until June 11th

@melvin-bot melvin-bot bot added the Overdue label Jun 3, 2024
@garrettmknight garrettmknight added Weekly KSv2 and removed Daily KSv2 Overdue labels Jun 3, 2024
@rayane-djouah
Contributor

This doesn't need a BZ checklist, as it's a GitHub Actions improvement.

@melvin-bot melvin-bot bot added Daily KSv2 and removed Weekly KSv2 labels Jun 6, 2024
@garrettmknight
Contributor

All paid out, closing!

7 participants