[RayCluster] Make headpod name deterministic #3028
Signed-off-by: owenowenisme <mses010108@gmail.com>
…adpod name is fixed now Signed-off-by: owenowenisme <mses010108@gmail.com>
Tests that fail because of this change will also be fixed in this PR.
…add error log when multiple headpod is found Signed-off-by: owenowenisme <mses010108@gmail.com>
1b991ed to 713ff51
}
r.rayClusterScaleExpectation.ExpectScalePod(extraHeadPodToDelete.Namespace, instance.Name, expectations.HeadGroup, extraHeadPodToDelete.Name, expectations.Delete)
}
logger.Info("reconcilePods", fmt.Sprintf("Multiple head pods found, it should only exist one head pod. Please delete extra head pods and leave only the head pod with name `%s-head`.", instance.Name), headPods.Items)
- Please don't use fmt.Sprintf inside the log function. See https://github.com/nginx/nginx-gateway-fabric/blob/main/docs/developer/logging-guidelines.md#message-guidelines for more details.
- Please don't print headPods.Items. Instead, print the names of the head Pods.
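The two review points above can be sketched as follows. This is a minimal illustration, not the operator's code: the `pod` type and the `podNames` helper are hypothetical stand-ins, the cluster name is made up, and the standard library's `log/slog` is used only so the example runs without external dependencies (the operator's own logger may differ). The message is a constant string, and the variable data travels as key/value pairs.

```go
package main

import (
	"log/slog"
	"os"
)

// pod is a hypothetical stand-in for a Kubernetes Pod object;
// only the name matters for this illustration.
type pod struct {
	Name string
}

// podNames collects just the pod names so the log entry stays small
// instead of dumping every field of every pod.
func podNames(pods []pod) []string {
	names := make([]string, 0, len(pods))
	for _, p := range pods {
		names = append(names, p.Name)
	}
	return names
}

func main() {
	logger := slog.New(slog.NewJSONHandler(os.Stdout, nil))

	clusterName := "raycluster-sample" // hypothetical cluster name
	headPods := []pod{{Name: "mytest-head"}, {Name: clusterName + "-head"}}

	// Constant message, no fmt.Sprintf; details go into key/value pairs.
	logger.Info("Multiple head pods found, only one head pod should exist",
		"foundPods", podNames(headPods),
		"expectedHeadPod", clusterName+"-head",
	)
}
```

Keeping the message constant makes log lines easy to search for, and logging names rather than whole objects keeps each entry short.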
kuberay/ray-operator/controllers/ray/raycluster_controller.go
Lines 557 to 560 in 713ff51
if len(services.Items) > 1 {
	logger.Info("reconcileHeadService", "Duplicate head service found", services.Items)
	return fmt.Errorf("%d head service found %v", len(services.Items), services.Items)
}
I was following the head service code above, which also prints the Items. Should I change that as well?
The current error message looks like this:
{"level":"info","ts":"2025-02-13T04:16:20.592Z","logger":"controllers.RayCluster",
"msg":"Multiple head pods found, it should only exist one head pod. Please delete extra head pods.",
"RayCluster":{"name":"rayservice-sample-raycluster-9wl9m","namespace":"test-ns-6zktp"},
"reconcileID":"6d93964f-00a4-4d78-8db6-0d40c53804e2",
"found pods":["mytest-head","rayservice-sample-raycluster-9wl9m-head"],
"should only leave":"rayservice-sample-raycluster-9wl9m-head"}
{"level":"info","ts":"2025-02-13T04:16:20.592Z",
"logger":"controllers.RayCluster",
"msg":"Reconciliation error",
"RayCluster":{"name":"rayservice-sample-raycluster-9wl9m","namespace":"test-ns-6zktp"},
"reconcileID":"6d93964f-00a4-4d78-8db6-0d40c53804e2",
"error":"2 head pods found [mytest-head rayservice-sample-raycluster-9wl9m-head]. Please delete extra head pods and leave only the head pod with name rayservice-sample-raycluster-9wl9m-head"}
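As a rough sketch of how the error text in the log above could be assembled from pod names alone: the helpers `headPodName` and `multipleHeadPodsError` below are hypothetical names introduced for illustration and are not taken from the PR. The only things assumed from the conversation are the `<cluster name>-head` naming pattern and the wording of the error message.

```go
package main

import (
	"fmt"
)

// headPodName returns the deterministic head pod name for a cluster,
// following the "<cluster name>-head" pattern visible in the log output.
// Hypothetical helper, not the operator's actual function.
func headPodName(clusterName string) string {
	return clusterName + "-head"
}

// multipleHeadPodsError builds an error that lists only the names of the
// head pods that were found, plus the one name that should remain.
// Hypothetical helper, not the operator's actual function.
func multipleHeadPodsError(clusterName string, found []string) error {
	return fmt.Errorf(
		"%d head pods found %v. Please delete extra head pods and leave only the head pod with name %s",
		len(found), found, headPodName(clusterName),
	)
}

func main() {
	cluster := "rayservice-sample-raycluster-9wl9m"
	found := []string{"mytest-head", headPodName(cluster)}

	// Prints the same error text as the "error" field in the log above.
	fmt.Println(multipleHeadPodsError(cluster, found))
}
```

Because the expected name is derived from the cluster name alone, the message can tell the user exactly which pod to keep.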
kuberay/ray-operator/controllers/ray/raycluster_controller.go Lines 750 to 764 in fa0aeaa
@kevin85421 BTW, I think the worker reconcile for-loop in reconcilePod is pretty hard to read and understand: it is 150 lines long, and following it means tracing many functions across several files. Maybe we should refactor this part?
Why are these changes needed?
Related issue number
Closes #3013
Checks
(raycluster-autoscaler is the RayCluster name from a random YAML file I used for testing; it is not related to the issue.)