Propagate provisioning status of a ProvReq into the Workload status #2007

Merged
merged 7 commits into from
Apr 23, 2024

Conversation

@pajakd pajakd commented Apr 18, 2024

What type of PR is this?

/kind feature

What this PR does / why we need it:

It propagates the message from the Provisioning Request (condition Provisioned = false) into the Workload. The message will contain the expected time for the resources to be provisioned. Exposing this information to the users will allow them to understand why the job is still pending and how long until the resources will be available.
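A minimal sketch of the propagation described above, for readers who want to see the mechanics. The `condition` and `checkState` types and both helper functions are simplified, hypothetical stand-ins, not the actual Kueue or autoscaler API; the real change may differ in detail.

```go
package main

import "fmt"

// condition is a simplified stand-in for a Kubernetes status condition
// on the ProvisioningRequest (assumption: only these fields matter here).
type condition struct {
	Type    string
	Status  string // "True", "False" or "Unknown"
	Message string
}

// checkState is a simplified stand-in for the admission check entry
// stored in the Workload status.
type checkState struct {
	Name    string
	Message string
}

// findCondition returns a pointer to the condition of the given type,
// or nil when no such condition exists.
func findCondition(conds []condition, condType string) *condition {
	for i := range conds {
		if conds[i].Type == condType {
			return &conds[i]
		}
	}
	return nil
}

// propagateMessage copies the message of the Provisioned condition into
// the check state and reports whether the stored message changed.
func propagateMessage(cs *checkState, conds []condition) bool {
	c := findCondition(conds, "Provisioned")
	if c == nil || cs.Message == c.Message {
		return false
	}
	cs.Message = c.Message
	return true
}

func main() {
	cs := checkState{Name: "hypothetical-check"}
	conds := []condition{{Type: "Provisioned", Status: "False", Message: "Provisioning Request wasn't provisioned."}}
	fmt.Println(propagateMessage(&cs, conds), cs.Message)
	fmt.Println(propagateMessage(&cs, conds), cs.Message)
}
```

Returning a boolean lets the caller skip a status write when nothing changed, and the nil guard avoids dereferencing a missing condition.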

Which issue(s) this PR fixes:

Fixes #1749

Special notes for your reviewer:

Does this PR introduce a user-facing change?

ProvisioningRequest: Propagate the message for a ProvisioningRequest being provisioned (which might include an ETA, depending on the implementation) to the Workload status

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. labels Apr 18, 2024
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Apr 18, 2024
@k8s-ci-robot

Hi @pajakd. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.


@k8s-ci-robot k8s-ci-robot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 18, 2024

netlify bot commented Apr 18, 2024

Deploy Preview for kubernetes-sigs-kueue canceled.

Name Link
🔨 Latest commit 273498f
🔍 Latest deploy log https://app.netlify.com/sites/kubernetes-sigs-kueue/deploys/6627b4d97f7b2f00081365ef

@alculquicondor

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 18, 2024
@pajakd pajakd marked this pull request as draft April 18, 2024 13:11
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 18, 2024
@gabesaba

About 6 weeks back, we were discussing putting this status in Conditions rather than Admission Checks. But looking at it again, I'm not coming up with any drawbacks of Admission Checks. If anything, I'd argue it's nicer to have all ProvReq info in one place (as is implemented in this PR). Are there any reasons to prefer Conditions over Admission Checks that I'm missing @yaroslava-serdiuk @PBundyra?

@pajakd

pajakd commented Apr 22, 2024

Tested the feature end-to-end. I observed that the workloads get the placeholder message "Provisioning Request wasn't provisioned." while the Provisioned condition is False. The message later changes to "successfully provisioned" when Provisioned changes to True.

As far as I know, this is the desired behavior (given that the ETA messages are not provided by DWS yet).

The question remains whether the message should be in Conditions or Admission Checks -- see the above comment.

@pajakd pajakd marked this pull request as ready for review April 22, 2024 08:28
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 22, 2024
@yaroslava-serdiuk

> About 6 weeks back, we were discussing putting this status in Conditions rather than Admission Checks. But looking at it again, I'm not coming up with any drawbacks of Admission Checks. If anything, I'd argue it's nicer to have all ProvReq info in one place (as is implemented in this PR). Are there any reasons to prefer Conditions over Admission Checks that I'm missing Yaroslava Serdiuk Patryk Bundyra?

I think users (data scientists) would like to check the Workload's status to understand when it will be admitted or why it's pending, so I would update the Workload conditions as well.

@PBundyra

I don't have a strong preference, but as long as we do not lose any information about ProvReqs, I would lean towards keeping it in the Workload.Status.AdmissionChecks field.

@pajakd

pajakd commented Apr 22, 2024

So, we have two options:

  1. Create a new condition (something like Workload.Conditions.ProvisioningRequestProvisioned) and put the ETA message both there and in Workload.Status.AdmissionChecks.
  2. Keep the ETA message only in Workload.Status.AdmissionChecks. This is how it is currently implemented; the messages are visible to users running, for example, kubectl describe workloads -- @yaroslava-serdiuk do you think this is not enough and we should also add it to Workload.Conditions?

Comment on lines 536 to 537
if prAccepted && !prFailed {
updated = updated || updateCheckMessage(&checkState, apimeta.FindStatusCondition(pr.Status.Conditions, autoscaling.Provisioned).Message)
Contributor

Can this be part of the switch statement for consistency? Could you add a comment explaining that this case in fact covers the Provisioned=false condition?

Contributor Author

Adding to the switch statement -- see my other reply #2007 (comment).

Added a comment explaining the feature. In fact, we also update the message for Provisioned=True (otherwise the Workload would keep showing the "wasn't provisioned" message, which would be confusing). When Provisioned=True, the message changes to "successfully provisioned".

@yaroslava-serdiuk

@pajakd I think it's enough for now, let's leave it as it is.

@alculquicondor

/release-note-edit

The message for a ProvisioningRequest being provisioned (which might include an ETA, depending on the implementation) is now propagated to workloads.

updated = updateCheckState(&checkState, kueue.CheckStatePending) || updated
}
if prAccepted && !prFailed {
updated = updateCheckMessage(&checkState, apimeta.FindStatusCondition(pr.Status.Conditions, autoscaling.Provisioned).Message) || updated
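A side note on operand order: the earlier quoted version of this line read `updated = updated || updateCheckMessage(...)`, while this one calls the function first. In Go, `||` short-circuits, so with the flag on the left the update is skipped whenever `updated` is already true. The sketch below illustrates the difference; `setMessage`, `flagFirst` and `callFirst` are hypothetical names, and the message text is a placeholder.

```go
package main

import "fmt"

// setMessage is a hypothetical stand-in for updateCheckMessage: it writes
// msg into dst and reports whether the stored value changed.
func setMessage(dst *string, msg string) bool {
	if *dst == msg {
		return false
	}
	*dst = msg
	return true
}

// flagFirst mirrors `updated = updated || update(...)`: when updated is
// already true, || short-circuits and setMessage is never called.
func flagFirst(updated bool, dst *string, msg string) bool {
	return updated || setMessage(dst, msg)
}

// callFirst mirrors `updated = update(...) || updated`: setMessage always
// runs, and the previous value of the flag is preserved.
func callFirst(updated bool, dst *string, msg string) bool {
	return setMessage(dst, msg) || updated
}

func main() {
	var a, b string
	fmt.Println(flagFirst(true, &a, "placeholder ETA"), a == "")
	fmt.Println(callFirst(true, &b, "placeholder ETA"), b)
}
```

With the flag on the left, `a` stays empty even though the function reports an update, which is the situation the call-first order avoids.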
Contributor

maybe add a case for prAccepted in the switch so you can do the message update there?

Contributor Author

Yeah, it seems like it would fit into the switch statement. But adding a case for prAccepted would cause the default case not to be executed, and this is not what we want (because we update a state there). As I understand it, the switch is mainly for updating the state of the admission check. Since my feature is only about updating the message, I don't want to mess with the switch statement. I did consider it and thought about using fallthrough, but I could not find a good way that would not make this code more brittle.

Contributor

Could you elaborate on why the switch is mainly for updating the state of the admission check? I imagine it could be along the lines:

switch {
case prProvisioned: // updates state and message (ETA -> successfully provisioned)
case prFailed: // retries or updates state and message (ETA -> failed)
case prAccepted && !prFailed: // updates message (ETA)
default: // for now covers cases !prAccepted, prBookingExpired, prCapacityRevoked
}
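A runnable sketch of the switch layout outlined above, under stated assumptions: the `provReq` and `checkState` types, the state names, and the helpers are all hypothetical simplifications, and the failed case only marks a retry instead of implementing real retry logic. The accepted case also sets the pending state, which is a choice of this sketch rather than something the thread settles.

```go
package main

import "fmt"

// Hypothetical state names; the real controller uses its own constants.
const (
	statePending = "Pending"
	stateReady   = "Ready"
	stateRetry   = "Retry"
)

// provReq carries the boolean flags discussed in the thread plus the
// message of the Provisioned condition (simplified stand-in).
type provReq struct {
	Provisioned, Failed, Accepted bool
	Message                       string
}

// checkState is a simplified stand-in for the Workload's admission check entry.
type checkState struct {
	State, Message string
}

// setState updates the state and reports whether it changed.
func setState(cs *checkState, s string) bool {
	if cs.State == s {
		return false
	}
	cs.State = s
	return true
}

// setMessage updates the message and reports whether it changed.
func setMessage(cs *checkState, m string) bool {
	if cs.Message == m {
		return false
	}
	cs.Message = m
	return true
}

// syncCheck applies exactly one case per call and reports whether the
// check state was modified.
func syncCheck(cs *checkState, pr provReq) bool {
	updated := false
	switch {
	case pr.Provisioned:
		updated = setState(cs, stateReady) || updated
		updated = setMessage(cs, pr.Message) || updated
	case pr.Failed:
		// Sketch only: a real controller may retry or reject here.
		updated = setState(cs, stateRetry) || updated
		updated = setMessage(cs, pr.Message) || updated
	case pr.Accepted:
		// Reached only when Provisioned and Failed are both false.
		updated = setState(cs, statePending) || updated
		updated = setMessage(cs, pr.Message) || updated
	default:
		// Not accepted yet: keep the check pending, leave the message alone.
		updated = setState(cs, statePending) || updated
	}
	return updated
}

func main() {
	cs := checkState{}
	fmt.Println(syncCheck(&cs, provReq{Accepted: true, Message: "placeholder ETA"}), cs)
	fmt.Println(syncCheck(&cs, provReq{Accepted: true, Provisioned: true, Message: "successfully provisioned"}), cs)
}
```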

Contributor Author

I see. I wanted to avoid having the ETA update in multiple places, but perhaps you are right. One more thing is that in the new prAccepted case I would have to also run everything that is in the default case. But, I'll revisit this idea.

@PBundyra PBundyra Apr 23, 2024

Is prAccepted a different case than prAccepted && !prFailed? I thought prFailed is by default set to false

Contributor Author

I might be missing something, but a switch only runs a single case: the topmost one that evaluates to true, right? So if we reach case prAccepted, we are sure that prFailed = False and prProvisioned = False, right?
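That reading matches Go's semantics: an expressionless switch evaluates its cases top to bottom, runs the first one that is true, and does not fall through unless `fallthrough` is written explicitly. The small check below compares the two variants over all eight flag combinations; the function names and return strings are illustrative only.

```go
package main

import "fmt"

// withGuard uses the explicit `accepted && !failed` condition.
func withGuard(provisioned, failed, accepted bool) string {
	switch {
	case provisioned:
		return "provisioned"
	case failed:
		return "failed"
	case accepted && !failed:
		return "accepted"
	default:
		return "default"
	}
}

// withoutGuard relies on case order: the accepted case is only reached
// when the provisioned and failed cases above it were false.
func withoutGuard(provisioned, failed, accepted bool) string {
	switch {
	case provisioned:
		return "provisioned"
	case failed:
		return "failed"
	case accepted:
		return "accepted"
	default:
		return "default"
	}
}

func main() {
	bools := []bool{false, true}
	for _, p := range bools {
		for _, f := range bools {
			for _, a := range bools {
				if withGuard(p, f, a) != withoutGuard(p, f, a) {
					fmt.Println("mismatch for", p, f, a)
					return
				}
			}
		}
	}
	fmt.Println("both switches agree on all 8 combinations")
}
```

Because the failed case sits above the accepted case, the extra `!failed` guard is redundant in this ordering.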

@alculquicondor alculquicondor left a comment

just a nit

@alculquicondor alculquicondor left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Apr 23, 2024
@k8s-ci-robot

LGTM label has been added.

Git tree hash: b103dd85cf527f06f26a1782c56b2663f7897996

@k8s-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, pajakd


@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 23, 2024
@k8s-ci-robot k8s-ci-robot merged commit e8fc9b7 into kubernetes-sigs:main Apr 23, 2024
15 checks passed
@k8s-ci-robot k8s-ci-robot added this to the v0.7 milestone Apr 23, 2024
@pajakd pajakd deleted the update_eta_message branch April 24, 2024 05:55
}
case prAccepted:
// we propagate the message from the provisioning request status into the workload
// this happens for provisioned = false (ETA updates) and also for provisioned = true
Contributor

Could you please update the comment, as Provisioned=true is covered by the case above?

@alculquicondor

/release-note-edit

ProvisioningRequest: Propagate the message for a ProvisioningRequest being provisioned (which might include an ETA, depending on the implementation) to the Workload status
