We're testing the KubeRay autoscaler, using this guide and this chart as references.
The sidecar fails with this error:
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPSConnectionPool(host='kubernetes.default', port=443): Max retries exceeded with url: /apis/ray.io/v1/namespaces/ml-platform-model-v6/rayclusters/ml-platform-model-v6-raycluster-wzg8l (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7f16c1646b90>: Failed to establish a new connection: [Errno -2] Name or service not known'))
It seems the Ray autoscaler needs to call the Kubernetes API to do the equivalent of kubectl get rayclusters. The referenced part of the code assumes the Kubernetes API host is kubernetes.default, which does not resolve in our case.
Versions / Dependencies
Ray version: 2.34.0
Python: 3.10, 3.11
OS: Ubuntu 22.04
Reproduction script
I have a suggested fix instead of a repro script. The assumed host does not resolve in our cluster:
curl https://kubernetes.default
curl: (6) Could not resolve host: kubernetes.default
If the autoscaler used the environment variables that Kubernetes sets in the pod, or accepted the Kubernetes host as a parameter, we wouldn't face this issue.
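A minimal sketch of the environment-variable approach, not the actual Ray code. It assumes the pod has the standard `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` variables that Kubernetes injects into containers, and falls back to the `kubernetes.default:443` host seen in the error above. The helper names `kubernetes_api_base_url` and `raycluster_url` are hypothetical; the URL path is copied from the traceback.

```python
import os

# Fallback values matching the host and port shown in the error message.
DEFAULT_HOST = "kubernetes.default"
DEFAULT_PORT = "443"


def kubernetes_api_base_url(env=None):
    """Hypothetical helper: build the API server base URL from the pod environment."""
    env = os.environ if env is None else env
    host = env.get("KUBERNETES_SERVICE_HOST", DEFAULT_HOST)
    port = env.get("KUBERNETES_SERVICE_PORT", DEFAULT_PORT)
    # IPv6 literals must be wrapped in brackets inside a URL.
    if ":" in host and not host.startswith("["):
        host = f"[{host}]"
    return f"https://{host}:{port}"


def raycluster_url(namespace, name, env=None):
    """Hypothetical helper: URL of a RayCluster resource, using the path from the traceback."""
    base = kubernetes_api_base_url(env)
    return f"{base}/apis/ray.io/v1/namespaces/{namespace}/rayclusters/{name}"
```

With this shape, a cluster where `kubernetes.default` does not resolve would still reach the API server through the injected service address, and clusters without those variables would keep the current behavior.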
ltbringer added the bug (Something that is supposed to be working; but isn't) and triage (Needs triage, e.g. priority, bug/not-bug, and owning component) labels on Nov 4, 2024.
jjyao added the P1 (Issue that should be fixed within a few weeks) label and removed the triage label on Nov 25, 2024.
Issue Severity
High: It blocks me from completing my task.