-
Notifications
You must be signed in to change notification settings - Fork 539
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SkyServe] Fix sky serve up
with docker login config
#2983
Conversation
I must be missing something. I'm trying to run this branch locally, but |
Hey @benbot, could you share the version of |
Thanks for submitting the PR @cblmemo! Could we also test GCP private docker repo? |
@Michaelvll The skypilot-nightly from pip works fine. I'm trying to use this branch, so I can test out deploying a docker image from a private registry. I'm seeing version |
Oops, sorry I misunderstood it. If you would like to try this branch. You could try |
Not sure if this is related, but on this branch i'm not able to run my docker images. I'm being met with this error:
|
Humm, could you share the whole log so we could identify the problem? |
@cblmemo Here are the full logs from the failure before retrying. Nothing jumps out at me as particularly useful :( https://pastebin.com/untZJcK4 <--- the same logs, but scrolled up more |
Humm could you share the whole launch log to me? This seems strange and iiuc missing some lines... BTW, just want to make sure, are you at the latest commit of this branch? |
Yeah it does seem like it's missing lines :/ The actual error is the 127 return code from the setup command, which is some kind of "Not Found" error. I was on the latest branch last I tried, but I'll run it again today and send you the full logs. It's mostly retrying the ssh connection until it connects then this failure error shows up moments later |
@cblmemo Here is another attempt, different container image, same private registry, same issue. There's nothing else in the log, really. Above this log is the provision config which contains secrets, so I don't want to paste it here. Before that it's just skypilot attempting to find a region where I haven't hit quota yet |
Oh weirdly enough, when it retries it has a different error log, but ends with the same 127 error. |
Oh I might know the problem... From the |
No this is a rockylinux container. Is there any way / are there any plans to use non Debian base images? Also, I may be blind, but I don't see where on that doc page it says that only Debian images are supported |
The challenging to support non-Debian based image is to infer the package manager in the container. We will try to resolve this 🫡 Here is a related issue: #2673 For the documentation, I attached a screenshot for your reference. Though arguably it is hard to find... I just raised a PR to highlight it. #3021 |
I confirm that this PR works properly when loading a private image from GHCR. Thanks! |
@Michaelvll this is ready for another look 👀 |
Sorry again if this is unrelated, but I'm still seeing some errors with an ubuntu container. Every
I'd put this on pastebin again, but it seems to be down. |
Hey @benbot - Thanks for reporting this! The error message is actually refers to the service controller, not replicas. Are you reusing some existing controller which have some failure history? If so, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for fixing this @cblmemo! LGTM.
Previously, when we
sky serve up
a yaml w/ docker login config, it will throw an error sayingDockerLoginConfig is not JSON serializable
, which is because we accidentally added the docker login to resources'to_yaml_config
function. This PR addressed the problem and fixed #2982 .Tested (run the relevant ones):
bash format.sh
sky serve up docker_login_service.yaml
and no error is thrown; also the service is working (one replica turns to READY).pytest tests/test_smoke.py
pytest tests/test_smoke.py::test_fill_in_the_name
bash tests/backward_comaptibility_tests.sh