Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CLI] Support --ip option in sky status to output cluster IP only #2563

Merged
merged 11 commits into from
Sep 17, 2023

Conversation

cblmemo
Copy link
Collaborator

@cblmemo cblmemo commented Sep 15, 2023

Resolved #2439 .

To discuss: Do we want to enable this feature for multiple clusters? If so, what delimiter shall we use? space or newline?

Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • Any manual or new tests for this PR (please specify below)
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: bash tests/backward_comaptibility_tests.sh

@romilbhardwaj romilbhardwaj added this to the 0.4 milestone Sep 16, 2023
Copy link
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cblmemo! Tried it out and it works well. I think we need to think about multi-node clusters too (e.g., how to get IPs of all workers), but for this PR supporting only head node should be fine.

Also I found it slightly confusing when I had just one cluster and sky status --ip simply returned the IP of that cluster without requiring a cluster name argument. Maybe its ok? Will let others chime in here.

sky/cli.py Outdated Show resolved Hide resolved
sky/cli.py Outdated Show resolved Hide resolved
sky/cli.py Outdated Show resolved Hide resolved
sky/cli.py Outdated
Comment on lines 1750 to 1751
raise RuntimeError('Failed to fetch IP address. '
'Please try again later.')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if there is some better hint we can provide here than just asking to try again later. Brainstorming - what are the reasons IP fetching would fail? Are there any actionable suggestions we can provide to fix those reasons?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One possible reason could be network connection issues 🤔or some ray errors

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see... I am trying to find a good way to surface those errors for debugging (i.e., print them when SKYPILOT_DEBUG=1), but looks like any errors are not propagated through handle.external_ips()... any other ideas?

sky/cli.py Outdated Show resolved Hide resolved
cblmemo and others added 2 commits September 16, 2023 14:28
Co-authored-by: Romil Bhardwaj <romil.bhardwaj@berkeley.edu>
Co-authored-by: Romil Bhardwaj <romil.bhardwaj@berkeley.edu>
Co-authored-by: Romil Bhardwaj <romil.bhardwaj@berkeley.edu>
@cblmemo
Copy link
Collaborator Author

cblmemo commented Sep 16, 2023

A quick thought: Is sky status --head-ip better in terms of avoiding confusion? The Drawback is tedious though 🤔

Copy link
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cblmemo! LGTM with some minor ux nits. Since @Michaelvll is the author of the source issue, I'll let him comment on this too if the interface/UX makes sense to him.

sky/cli.py Outdated Show resolved Hide resolved
sky/cli.py Outdated Show resolved Hide resolved
sky/cli.py Outdated Show resolved Hide resolved
sky/cli.py Outdated
Comment on lines 1750 to 1751
raise RuntimeError('Failed to fetch IP address. '
'Please try again later.')
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see... I am trying to find a good way to surface those errors for debugging (i.e., print them when SKYPILOT_DEBUG=1), but looks like any errors are not propagated through handle.external_ips()... any other ideas?

cblmemo and others added 3 commits September 16, 2023 17:04
Co-authored-by: Romil Bhardwaj <romil.bhardwaj@berkeley.edu>
sky/cli.py Outdated Show resolved Hide resolved
sky/cli.py Outdated Show resolved Hide resolved
sky/cli.py Outdated Show resolved Hide resolved
Comment on lines +1725 to +1735
if ip:
if len(clusters) != 1:
with ux_utils.print_exception_no_traceback():
plural = 's' if len(clusters) > 1 else ''
cluster_num = (str(len(clusters))
if len(clusters) > 0 else 'No')
raise ValueError(
_STATUS_IP_CLUSTER_NUM_ERROR_MESSAGE.format(
cluster_num=cluster_num,
plural=plural,
verb='specified'))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we just handle this in L1751? May not be necessary to separate the two cases.

Copy link
Collaborator Author

@cblmemo cblmemo Sep 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are some special cases...
For example:

  • Say I have 3 clusters test, cluster-1, cluster-2 in local database
  • If we only check the length of cluster_records (the return value of core.status):
    • sky status --ip test abc will output the ip of cluster test, since the _get_glob_clusters doesn't find any clusters with abc, which is not desirable (we transparently ignored one cluster)
  • If we only check the length of clusters:
    • sky status --ip 'cluster-*' needs to throw an error since it will match two clusters

cblmemo and others added 3 commits September 17, 2023 12:15
Co-authored-by: Zhanghao Wu <zhanghao.wu@outlook.com>
Co-authored-by: Zhanghao Wu <zhanghao.wu@outlook.com>
Copy link
Collaborator

@romilbhardwaj romilbhardwaj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @cblmemo - took another look, LGTM.

@cblmemo cblmemo merged commit 626d8a9 into master Sep 17, 2023
@cblmemo cblmemo deleted the sky-status-ip branch September 17, 2023 22:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[CLI] Easier way to get cluster IP
3 participants