Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Keep enqueued promotion in pending state while analysis template is running for the previous promotion #2128

Closed
shabbirsaifee92 opened this issue Jun 6, 2024 · 5 comments · Fixed by #2762

Comments

@shabbirsaifee92
Copy link

shabbirsaifee92 commented Jun 6, 2024

Checklist

  • [ x ] I've searched the issue queue to verify this is not a duplicate feature request.
  • [ x ] I've pasted the output of kargo version, if applicable.
  • [ x ] I've pasted logs, if applicable.

kargo version: 0.6.0

Proposed Feature

Current Behavior
When a Promotion (lets call it PromotionA) creates a PR, and another Promotions (PromotionB) is enqueued, it stays in pending state while the PR is open.
When the PR is merged for PromotionA, a new PR is created for PromotionB while analysis templates are running for PromotionA

Screenshot 2024-06-06 at 2 12 07 PM Screenshot 2024-06-06 at 2 10 36 PM

Suggested/Desired behavior
PromotionB should be in pending state and only be moved to Running once the verification for previous Promotion is completed and successful.

Motivation

Verifications is an important part of the the delivery process. In many scenarios these templates would be reconciling with argo to ensure that the deployment is actually successful. With the current behavior a user can merge another change while the analysis for the previous promotion is still not complete which makes things confusing.

Suggested Implementation

@krancour
Copy link
Member

krancour commented Jun 6, 2024

I agree completely.

I think @hiddeco and I will have to put our heads together on this one.

@hiddeco
Copy link
Contributor

hiddeco commented Jun 7, 2024

I would argue this could mean that verification becomes part of the Promotion. This does not have to mean verification can't also be run for a Stage, but that they can be triggered from the Promotion (to confirm it was promoted successfully) and (manually) for the Stage (to confirm things are still as they should be).

The easier way, however, would be to check if there are any AnalysisRuns in a non-terminated state before actually starting to work on the Promotion.

@shabbirsaifee92
Copy link
Author

I'd probably also suggest that any enqueued promotions should be cancelled if the verification fails for the ongoing promotion.

@krancour
Copy link
Member

krancour commented Jun 7, 2024

I would argue this could mean that verification becomes part of the Promotion.

I almost wrote a comment yesterday suggesting we not do this, or at least that we not rush into deciding that. Instead, I decided we should just talk it through. Since I don't think we'll get to that before the weekend, here's an abridged version of my thoughts, mainly so I don't lose track of them myself:

I think there are fundamentally two separate questions that can be asked after a Promotion is complete:

  1. Did the transition to the new state succeed?
  2. Is the new state good?

It's entirely possible, perhaps even a frequent occurrence, that 1 succeeds and 2 fails. I believe our verifications only answer the second of those two questions.

I still agree that new Promotions should not start mid-verification, but we need to put a lot of careful thought into whether integrating verifications into the Promotion process is the path we want to take.

Also, for visibility, whatever we decide has consequences for #2069.

@krancour
Copy link
Member

krancour commented Jul 6, 2024

This came up in an offline conversation today and something that occurred to me is that when a Stage doesn't explicitly define an Analysis-based verification process, there is still am implicit verification process that is "verified once healthy."

Ultimately I think that blocking a Promotion because the target Stage has an explicit verification process that is still in-progress vs blocking a Promotion because the target Stage is still trying to reach a healthy state -- which it conceivably might never -- are two different things.

Perhaps we need to have distinct phases for "waiting for healthy" and "verifying". (Or our intention to trend toward using conditions instead of phases may help here as well.)

Promoting into a Stage that is "waiting for healthy" and has not yet started "verifying" yet seems ok?

Promoting into a Stage that is actively verifying seems like the thing we ought not do. A user can abort the verification if they need to promote that badly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants