-
Notifications
You must be signed in to change notification settings - Fork 250
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix the suspend=true added to the job by the default job webhook has … #758
Conversation
✅ Deploy Preview for kubernetes-sigs-kueue ready!
To edit notification comments on pull requests, go to your Netlify site settings. |
Hi @fjding. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
/ok-to-test |
@fjding Thanks for your contribution. Could you fix the MPIJob as well? |
Sorry, I missed it, I will re-update the pr |
} | ||
|
||
func (j *Job) Object() client.Object { | ||
return &j.Job | ||
return j.Job |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should return a pointer here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a pointer already
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I found the receiver is a pointer. Thanks!
@@ -121,11 +121,11 @@ func (h *parentWorkloadHandler) queueReconcileForChildJob(object client.Object, | |||
} | |||
|
|||
type Job struct { | |||
batchv1.Job | |||
*batchv1.Job |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uhm... Instead of making this a pointer, adding a new function, SetObject(obj client.Object)
might be better.
@alculquicondor @mimowo @kerthcet WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pointer sounds better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of func Object()
was incorrect. So, a pointer sounds better.
/kind bug |
tests are failing, could you investigate? |
…not taken effect Signed-off-by: fjding <dingfangjie@bytedance.com>
274779e
to
a3e923c
Compare
/ok-to-test |
done |
@fjding This PR is related to a user-facing change. So instead of adding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fjding Thank you!
/lgtm
/lgtm thanks for fixing. Given the importance of the scenario I suggest we add an e2e test. Either in this or a follow up PR. WDYT @alculquicondor ? |
+1 |
-1 on e2e, but we should have an integration test for this. |
@fjding could you add one integration test for job in this PR? /lgtm You can do it in a follow up, but ideally we cherry-pick that one too. |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alculquicondor, fjding The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/cherry-pick release-0.3 |
@alculquicondor: once the present PR merges, I will cherry-pick it on top of release-0.3 in a new PR and assign it to you. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
@alculquicondor: new pull request created: #765 In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Through more investigation, I found that it is hard to test this scenario. The job reconciler sets kueue/pkg/controller/jobframework/reconciler.go Lines 347 to 351 in 8733524
So, we should run the integration test only for webhook. But I'm not sure how to start the webhook server without a manager using the envtest. @alculquicondor Any thoughts? |
Maybe we must start a simple http server to start a webhook server without a manager. testEnv := &envtest.Environment{}
hookServer := &webhook.Server{
Port: testEnv.WebhookInstallOptions.LocalServingPort,
Host: testEnv.WebhookInstallOptions.LocalServingHost,
}
httpServer := &http.Server{
Addr: hookServer.Address(),
Handler: hookServer.ServeMux,
}
... |
I guess we would face the same problem with e2e tests. Maybe it's not worth doing this? |
That's right.
As another approach, we can add a unit test for the
WDYT? @alculquicondor |
That wouldn't help, because we where passing the wrong object to that function. The unit test needs to be for the Job webhook function (and MPIJob), if there isn't one already. |
That makes sense.
Yes, we don't have any unit tests for the Defaulter. |
I suggested e2e because I thought we have webhooks running there. This comment suggests that: Line 67 in 8733524
|
Uhm, adding a unit test for the Defaulter might be better. |
Yes, but the e2e test would have the same problem than this #758 (comment) |
Yes, it would be hard to verify the Defaulter in E2E :( |
In the integration test, as a hacky approach, we might be able to verify the Defaulter in the following:
Although, I'm not sure whether the hacky test is worth it. |
For e2e, I don't have any good ideas. |
not worth, IMO. unit tests at the webhook level should be enough |
Agree. Let's go ahead :) |
What type of PR is this?
/kind bug
What this PR does / why we need it:
Fix the suspend=true added to the job by the default job webhook has not taken effect.
Which issue(s) this PR fixes:
Fixes #757
Special notes for your reviewer:
Does this PR introduce a user-facing change?