[Serve][k8s] K8s replica ports not detected #3798
Comments
It will take a while for the port to become ready on a newly created k8s pod. We added a fix in #3634, but it seems there is still an issue here. Could you help check why this happens?
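To illustrate the "port takes a while to be ready" point above, here is a minimal sketch of polling a TCP port until it accepts connections. It is not the fix from #3634, whose contents are not shown in this thread; the function name `wait_for_port` and its parameters are hypothetical and chosen only for illustration.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=60.0, interval_s=1.0):
    """Poll (host, port) until a TCP connection succeeds or timeout_s elapses.

    Hypothetical helper for illustration only; not part of SkyPilot.
    Returns True once the port accepts a connection, False on timeout.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # create_connection raises OSError (refused, unreachable, timed
            # out) while the pod's port is not accepting connections yet.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

A caller would run something like `wait_for_port(replica_ip, 8081, timeout_s=120)` before starting readiness probes, where `replica_ip` and the port number are placeholders. Note that this only helps if the host being polled is reachable from where the check runs.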
Hmm, I'm not able to replicate this on GKE. Where was your Kubernetes cluster running?
On a local kind cluster created by
Ah, I think I know what's going on. When using
skypilot/sky/provision/kubernetes/network_utils.py, lines 234 to 238 in aea7322
This won't work when the endpoint is fetched from within the cluster (e.g., from the controller) since localhost will not point to the ingress service. Hence the readiness probe fails:
We probably need to special-case ingress endpoint fetching when running inside the controller in ingress mode, so that it uses the ingress controller service directly.
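The special case proposed above could look roughly like the following sketch. It is not SkyPilot's implementation: the function names are hypothetical, and the default service name and namespace assume a stock ingress-nginx install (`ingress-nginx-controller` in the `ingress-nginx` namespace), which may differ on a given cluster. The in-cluster check relies on the `KUBERNETES_SERVICE_HOST` environment variable that Kubernetes sets in pods.

```python
import os


def running_in_cluster(environ=None):
    """Return True when the process appears to run inside a Kubernetes pod."""
    env = os.environ if environ is None else environ
    # Kubernetes injects KUBERNETES_SERVICE_HOST into each pod's environment.
    return bool(env.get("KUBERNETES_SERVICE_HOST"))


def pick_ingress_host(external_host, in_cluster,
                      service_name="ingress-nginx-controller",
                      namespace="ingress-nginx"):
    """Choose the host used to reach the ingress.

    Hypothetical helper: service_name and namespace are assumptions based on
    a default ingress-nginx install, not values taken from SkyPilot.
    """
    if in_cluster and external_host in ("localhost", "127.0.0.1"):
        # Inside a pod, localhost is the pod itself, so address the ingress
        # controller through its cluster-internal Service DNS name instead.
        return f"{service_name}.{namespace}.svc.cluster.local"
    return external_host
```

Outside the cluster, the externally visible host is returned unchanged, so the behavior for a user's laptop would stay the same; only a caller inside the cluster, such as the controller, is redirected to the Service name.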
This issue is stale because it has been open 120 days with no activity. Remove stale label or comment or this will be closed in 10 days.
Commit: 34f13a33d4c4036017ea3f8edb43bd1fa4e89eb8
On latest master, examples/serve/http_server/task.yaml with a k8s controller and a k8s replica fails to detect the port on the k8s replica. It seems that get_endpoints is the reason.
Controller log: