Remote Jupyter server and kernel picker UX #11077

Closed
minsa110 opened this issue Aug 9, 2022 · 17 comments
Labels
notebook-execution Kernels issues (start/restart/switch/execution, install ipykernel) notebook-remote Applies to remote Jupyter Servers polish Cleanup and polish issue

Comments

@minsa110
Contributor

minsa110 commented Aug 9, 2022

Define idealized UX for connecting to remote Jupyter servers

Open questions

1. If we were to improve current UX/UI...

What does the user expect to see in the kernel drop down list when...

  1. They're on VS Code desktop / vscode.dev / Codespaces?
  2. They have N Python environments?
    a. Python not installed
    b. Only global Python envs (is "Global Env" descriptive enough?)
    c. Multiple Python environment manager types
  3. Opening a notebook for the N-th time (i.e. kernel suggestion)?
  4. The user is connected to a local Jupyter kernel, but has connected to a remote Jupyter server via the Specify Jupyter Server for Connections command? Current:
    image

How well are these remote server connection entry-points served?

  1. Specifying a currently running Jupyter URI with password authentication does not work
  2. Running a cell without selecting a kernel ❔
  3. Using SSH to start & connect to a Jupyter server (e.g. access remote notebook via port forwarding) ❔
    a. Command line SSH
    b. Remote-SSH extension
  4. Going backwards, clicking on 'Jupyter Server: <Local/Remote>' button doesn't give option to disconnect / "connect" back to local server

What does the user expect to see in the remote connection server drop down list when...

  1. There are multiple connection tokens for the same URI? E.g. how do we represent the notion of multiple tokens for the same server, and how would we save tokens?
  2. There are multiple older connections for the same URI? Current: image
  3. There are one or more local Jupyter servers running (i.e. http://localhost:8888/?token=<...>)?
  4. Some existing connections are active / reachable and some are inactive / unreachable?

2. If we were to "reimagine" the UX/UI...

What does the user expect to see in the VS Code window when connected to a kernel / Jupyter server?

  1. File system in the context of the kernel / server?
  2. Integrated terminal in the context of the kernel / server?
  3. Should connecting to a remote Jupyter server be treated as a Remote Window? i.e. a part of this (the >< indicator on bottom left):
    image
    a. How about when user types jupyter notebook in the integrated terminal? Should this open a Remote Window instance on VS Code?
@minsa110 minsa110 self-assigned this Aug 9, 2022
@github-actions github-actions bot added the triage-needed Issue needs to be triaged label Aug 9, 2022
@minsa110 minsa110 added notebook-execution Kernels issues (start/restart/switch/execution, install ipykernel) notebook-remote Applies to remote Jupyter Servers and removed triage-needed Issue needs to be triaged labels Aug 9, 2022
@alexkyllo

Here are my thoughts on this UX:

  1. If we were to improve current UX/UI...

When I click to select a kernel, I see a very long list (this isn't all, I need to scroll down) that I think includes every Python binary in every virtual environment for every project I've ever worked on, on this machine. It's hard and I'm conflicted about this--I do like how it shows them alphabetically by name, shows the file location of the Python binary, and sub-groups them by environment manager. But it's a little confusing to me how it mixes the concept of a virtual environment with the concept of a Jupyter kernel, and it's not clear to me how to clean up the list to remove the ones I don't use anymore. It's also unclear to me why one of them is sometimes marked "Suggested" and I feel like the suggestion is usually not the one I want.

image

I'm not sure why I need to click this separate button, far away in the bottom tray, to connect to a remote Jupyter server. For a long time, I didn't even notice it was there:

image

When I'm connected to a remote server I can see an option to "Connect to Local Kernels" but when I'm connected to a local kernel there's no corresponding "Connect to Remote Kernels" option.

image

  2. If we were to "reimagine" the UX/UI...

I don't think treating a remote Jupyter server as a Remote Window (like WSL / SSH / Dev Container) makes sense to me because my mental model is that only the notebook itself is talking to the remote Jupyter server, not all of VS Code is talking to it. And I don't know if it's the case that remote Jupyter servers always provide shell and filesystem access.

Thinking about what I would want: I would like to have a pane on the left sidebar for the extension where I can manage a list of local and remote Jupyter kernels (maybe similar to how I would manage a list of database connections in a query editor tool), and then I could select one from that list when I run a notebook. I feel like this would give me much more control and enable me to keep my snake pit more organized. I think the discovery of existing Pythons on my machine is super useful, but it also leads to clutter and I want a better way to manage that clutter.

@minsa110
Contributor Author

minsa110 commented Aug 10, 2022

Adding some additional comments from our users on specific pain points to address--

Need better indication of remote connection, including updating the folder view in the context of remote

  • "It is very unclear to me if a remote connection is established. Would love to have it like in the browser, ie you can paste the jupyter link somewhere and then it directly opens the folder view everyone knows from the browser. At least with my setup, nothing happens when pasting the link in Jupyter: Specify Jupyter Server for Connections, therefore I manually have to do Create: New Jupyter Notebook which is very unintuitive. At first, I thought I am doing something wrong and the connection to the remote host doesn't work."
  • "Can not remap remote files to local when working with remote kernel."

Remote connection needs to be persisted

  • "Running remotely (macOS client, Windows desktop WSL2 host w/ RTX3090) can be frustrating as whenever my Mac goes to sleep the VS Code connection to the host is lost and must be reestablished (by clicking the button to reload the window, it takes like 30 seconds before the SSH connection is seemingly established)."
  • "cannot run notebooks in SSH remote "over night""
  • "It would be nice to keep command history across window reloads. This happens often for remote SSH session and can cause loss of complex experimental commands."
  • "[I would like to be] Staying connected to kernels on remotes ."

Want my kernel / notebook states to be persisted between remote sessions

  • "When running on a remote , i want the notebook to continue running if I am disconnected then update my ui on reconnection."
  • "need to reload previous cells incase of remote notebooks"
  • "Once the connection is back (this may or may not apply anymore, just going off memory) the notebook state is reset so I have to run all my cells again."
  • "I use notebooks together with vscode remote SSH. When I disconnect from the session and then reconnect to it, the notebook kernel must be restarted because it has lost all the previous state."
  • "When disconnect and reconnect to remote containers, I want to keep running kernels alive and keep outputs. For codes that requires long execution time."
  • "Remote notebooks require to be saved manually. This is frustrating and error prone. The work could be lost."

Connecting to remote servers with varying auth methods should be supported

  • "Cannot connect to development servers with self-signed certificates"

Experience for switching between remote servers needs to be improved

  • "[I would like to see] Setup up and choose a remote coding environment and choose it as easily as choosing a new Conda environment."
  • "[I would like to see] Smoother switching between jupyter servers (without need to reload window etc.), and especially between remote servers. "

I'd like to manage kernel persistence

  • "can't share a notebook kernel with an interactive kernel - would like to have an interactive window for the notebook"
  • "It could be very useful to be able to easily connect a jupyter interactive window to a current running notebook. It is currently possible using a server but this is not easy to initialize. An interactive window sharing the same kernel could be useful to test things before copying them properly in the final notebook."
  • " it is not clear how to manage the persistence of jupyter kernels after the ipynb file has been closed. I noticed that the kernels stick around and still consume resources. There should be a panel to manage the shutdown of these kernels (like in jupyter notebook), and there should be an option to enable/disable auto-shutdown of kernels when the attached ipynb is closed"
  • "Because for some reason, my computer restarts regularly, this will shut down my existing jupyter kernels and I have to restart the kernel from the start. Something like PyCharm that allows me to use a server started by the user will be great."
  • "i cannot run a separate notebook or script with the same kernel"
  • "persistent notebook kernel for long running jobs even after closing vscode"
  • "When disconnect and reconnect to remote containers, I want to keep running kernels alive and keep outputs. For codes that requires long execution time."

I'd like to manage kernels in general

  • "set a default kernel for all notebooks (on a system level)"
  • "Setting a system-wide default kernel for all notebooks"
  • "I would like to set a default kernel . I have to change the kernel every single time I open a file."
  • "Please make it easier to add own paths to python executables."
  • "I'm often confused about whether a kernel is running or not. It would be better to have a clear indicator of that."
  • "There is not much feedback if I for example restart a kernel . What is it doing? Don't even see a loading indicator, so I'm in the dark on what's going on during the restart."
  • "Kernel Status isn't displayed (or I can't find it). When restarting a kernel , there isn't a visual cue that the restart was successful. Whether the kernel is up, idle, working, dead - it isn't clear. Jupyter NB's show kernel status prominently in upper right"
  • "issue here is that when it is very slow to load it's hard to tell if the kernel hanged or not. So you're left wondering if you should just kill VSCode and try again. It may be useful to output some kind of terminal message or something that says, "we know this is loading slowly but Jupyter is still processing, please don't kill the process." Just so the user knows that progress is being made to load the damn thing."
  • "Support manual managing notebook kernels so that I don't have to worry that my long-running remote training session gets killed after my laptop sleeps for a few hours"
  • "[I would like to see] Working kernel control and status"
  • "Capability to shut down a kernel independently of notebook"
  • "hard to debug kernel crashes memory management"

UX/UI needs to be improved (general need for improvements, including buggy-ness)

Using notebooks with remote SSH is buggy
Note: it seems like this is the primary way VS Code users connect to remote servers

  • "I'm working with remote ssh mostly so every few minutes it has to reconnect and the process of finding the right kernel venv starts again.."
  • "it regularly crashes when remote developing using ssh due to oom."
  • "There are some issues using jupyter notebooks on a remote pc with ssh connection. The kernel installation was faulty recently, the workflow changed apparently. Accessing another pc via ssh requires a bit of configuration with certain network user-name-conventions It'd be cool, if there was an option to switch the user for a known host while connecting."
  • "[I would like to see] Better integration with Remote SSH connections and shared remote programming."

Using notebooks with remote Jupyter server connection is buggy

  • "It always takes a bit of effort to connect to a kernel. Especially if I'm connecting to a remote kernel, I always seem to run into some sort of issue. The latest thing I've found is that if I forget to start the Jupyter server before I open VSCode, I get into a state where I can't get it to connect until I restart VSCode with the server already running. But it just always seems like something causes a headache trying to get it to connect."
  • "For remote connections, (I think) disconnection can result in a failure mode where commands stop working but have no real error indicator."
  • "Setting/reconnecting to remote servers seems to be buggy at times. However, that is probably one of the most challenging subjects."
  • "Also I find it a bit slow and when working remotely it become unusable if the internet connection is poor."

Kernel keeps dying

  • "The kernel keeps disconnecting and it's really difficult to force the notebook to reconnect."
  • "Mostly I get popup about restarting the kernel ."
  • "Also, the kernels keep dying"

Kernel keeps hanging

  • "Sometimes the kernel cannot be interrupted when a cell is running for a long time. Restarting the kernel sometimes needs a double/triple click to register."
  • "Restarting the kernel does not work in most cases. Have to reopen the Editor to restart kernel . Kernel does not start once it is killed. "
  • "it too often happens that the python kernel hangs/stalls without any way to stop / restart it. ( 5-8 times yesterday) only option is to completely re-start vscode , which makes me switch back to debugging regular python"
  • "The main issue is that the kernel takes too long to restart and sometimes fails to restart at all. The taking too long is an annoyance. I used to run my own jupyter server before switching to VSCode and restart was quick. I used it frequently. Not starting is a very frustrating problem. It is inconsistent and hard to troubleshoot. Sometimes restarting VSCode doesn't even fix it. "
  • "When I want to interrupt an execution, usually I have to restart the kernel ."
  • "I cannot interrupt the kernel and it fails to restart with some python versions."

Kernel loading takes too long

  • "Something about waiting for the kernel to load"
  • "It takes forever to load the kernel and start the execution."
  • Also lots of comments about how long it takes to CONNECT to / SWITCH kernels

Learning curve for using remote servers in VS Code is high

  • "Learning curve to using extensions, sshing into remote servers was confusing initially"
  • "I need to use Cloud computing resources like Google Colab Pro or MS Azure. But, even with the extensions remote development is still to usable. It is very easy to pick/switch environments. Why isn't it that easy to set up a remote environment and switch to that while coding from VS Code?"
  • "Connecting to and using notebooks on remote servers is very hard, and the UI is not very nice"
  • "[I would like to see] connect to remote server, attach/detach to a running notebook."

@kieferrm
Member

kieferrm commented Aug 11, 2022

Taking a step back. We show potentially a lot of information to users and it feels to me that we need to help users in understanding what they see and how to decide. Here's my mental model on what decisions a user has to take:

graph TD
   A{Do I have a running Jupyter Server?}
   B(Select Server)
   B1(Select Kernel)
   C(Select environment from the system my window runs on)
   D{Do I have environments with installed kernels?}
   E(Select environment)
   E1(Select kernel)
   F{Reuse environment?}
   G(Select environment)
   H(Create environment)

   A --> |yes| B
   B --> B1
   A --> |no | C
   C --> D
   D --> |yes| E
   E --> E1
   D --> |no| F
   F --> |yes| G
   F --> |no| H

From my point of view it would make sense to translate that decision-making process into a UI that makes sure that the user understands what decisions they make.
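The diagram above can also be read as a small function. The following is only a sketch of that reading: the function name `decision_path` and its three boolean parameters are hypothetical, and the step labels are copied from the diagram nodes rather than from any real extension code.

```python
# Hypothetical sketch: walks the decision diagram and returns the steps
# a user would go through. Labels mirror the diagram nodes.
SELECT_FROM_SYSTEM = "Select environment from the system my window runs on"


def decision_path(has_running_server, has_envs_with_kernels, reuse_environment):
    """Return the ordered list of steps for one combination of answers."""
    if has_running_server:
        # A --yes--> B --> B1
        return ["Select Server", "Select Kernel"]

    # A --no--> C --> D
    path = [SELECT_FROM_SYSTEM]
    if has_envs_with_kernels:
        # D --yes--> E --> E1
        return path + ["Select environment", "Select kernel"]
    if reuse_environment:
        # D --no--> F --yes--> G
        return path + ["Select environment"]
    # D --no--> F --no--> H
    return path + ["Create environment"]


if __name__ == "__main__":
    print(decision_path(False, False, False))
```

Enumerating the paths this way makes it visible that a user is asked at most three questions, which may help when deciding how many UI steps the flow needs.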

There are a few questions that go along with that:

  • How common is it that notebooks in the same workspace connect to different kernels?
  • Has the user made the decisions for this workspace before?
  • Does that decision making flow make sense for non-python users? I.e. when I'm a Julia user do I know what Python environment to pick that has my Julia kernel?
  • When you select an environment, is the kind of Python environment such a major decision factor that selecting the kind of Python environment should be an explicit step?
  • When I start VS Code within an activated environment does that change the decision making process?

On a smaller point, I'm 100% with you that different connection tokens should be mapped to the same URL so that the user does not have the confusion of seeing multiple.
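One way to picture mapping several tokens to a single server entry is to strip the token from each saved URI and group by what remains. This is a minimal sketch under assumptions: the function name `group_tokens_by_server` is hypothetical, the token is assumed to arrive as a `token` query parameter (as in the `http://localhost:8888/?token=<...>` form mentioned earlier), and how tokens would actually be stored is left open.

```python
from urllib.parse import parse_qs, urlsplit, urlunsplit


def group_tokens_by_server(uris):
    """Map each server base URL to the distinct tokens seen for it.

    Hypothetical helper: the base URL is the URI without query or fragment,
    with the host lowercased and any trailing slash on the path normalized.
    """
    servers = {}
    for uri in uris:
        parts = urlsplit(uri)
        path = parts.path.rstrip("/") or "/"
        base = urlunsplit((parts.scheme, parts.netloc.lower(), path, "", ""))
        # Create the entry even when the URI carries no token at all.
        tokens = servers.setdefault(base, [])
        for token in parse_qs(parts.query).get("token", []):
            if token not in tokens:
                tokens.append(token)
    return servers


if __name__ == "__main__":
    saved = [
        "http://localhost:8888/?token=abc",
        "http://localhost:8888/?token=def",
    ]
    # One entry for the server, with both tokens attached to it.
    print(group_tokens_by_server(saved))
```

With this grouping, the server list would show one row per base URL, and the choice between tokens (if it needs to be shown at all) becomes a detail of that row.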

@minsa110
Contributor Author

minsa110 commented Aug 12, 2022

Thanks for structuring the conversation with the flow diagram for user's decision-making process!

Quick follow-up question: what does "reusing" an environment mean in the diagram? That the user wants to use the same environment / kernel across different files in the workspace?


How common is it that notebooks in the same workspace connect to different kernels?

From my understanding, this is not abnormal for data scientists working in teams. For example, a data scientist would open a repo in a workspace that contains multiple data analysis projects nested in different folders. They then would have kernels for each project / notebook. Also, I believe that by default, notebooks each get their own kernel on Jupyter classic.

Has the user made the decisions for this workspace before?
When I start VS Code within an activated environment does that change the decision making process?

I like where this is going! This decision-making process makes sense, but I've been thinking on a similar note: how much of this can / should be streamlined for different user / notebook / workspace states? A few other ones (in addition to the two you mention above) include:

  • Has the user made the decisions for this notebook before? (i.e. would this decision-making process be notebook-based or workspace-based?)
  • How much of this decision-making process should we bypass if a user doesn't care (e.g. new users who just want to run their code cell for the first time)?
  • Does the user want new notebooks / interactive windows to connect to the kernel that is already connected to an existing notebook in the same workspace (e.g. to share variables)?
  • Before the first decision on the diagram above: does the user have a notebook open?

When you select an environment, is the kind of Python environment such a major decision factor that selecting the kind of Python environment should be an explicit step?

My intuition here is that this doesn't need to be an explicit step--though folks do value seeing their environments bucketed / organized by different Python environment types.

Does that decision making flow make sense for non-python users? I.e. when I'm a Julia user do I know what Python environment to pick that has my Julia kernel?

I don't have much experience with Julia, but the current way we display this is the same as how Jupyter classic does (ish), which means that a user who knows to select a Julia kernel in their Jupyter kernel will be able to do so by clicking on it (either #1 or #2 🤔):
image

How it shows on Jupyter classic:
image

@minsa110
Contributor Author

Ref to prototyping env/kernel creation: #9640

@miguelsolorio

Will be posting various scenario demos that I've mocked up.

Scenario 1: Running a notebook for the first time with no environment/extensions installed

This mirrors what we do right now out of the box and then leverages the "auto-create environment" concept that @IanMatthewHuff is currently working on.

CleanShot.2022-08-17.at.15.00.46.mp4

@IanMatthewHuff
Member

I really like it. I suppose my main question is how much we can get away with installing on users' computers without a prompt / message? I love the basically zero notification flow, but I'm a tiny bit unsure if it's ok to drop 80 MB in the workspace without a confirmation. I think so? But we might have to be careful about making sure that it gets added to .gitignore and things like that.
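For the .gitignore concern, a minimal sketch of what "making sure it gets added" could look like is below. Everything here is an assumption for illustration: the helper name `ensure_gitignored` is hypothetical, and the environment folder name is passed in by the caller because the thread does not say where an auto-created environment would live.

```python
from pathlib import Path


def ensure_gitignored(workspace, env_dir_name):
    """Append `env_dir_name/` to the workspace .gitignore if it is missing.

    Returns True when the file was changed, False when an entry for the
    folder was already present. Hypothetical helper, not extension code.
    """
    gitignore = Path(workspace) / ".gitignore"
    name = env_dir_name.strip("/")
    text = gitignore.read_text() if gitignore.exists() else ""

    # Treat "name", "name/" and "/name" as already covering the folder.
    if any(line.strip().strip("/") == name for line in text.splitlines()):
        return False

    # Keep existing content intact and make sure it ends with a newline.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    gitignore.write_text(text + separator + name + "/\n")
    return True
```

Returning whether the file changed would also give a confirmation dialog or an "undo" action something concrete to report to the user.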

@minsa110
Contributor Author

I actually had a very similar comment! 😅 When we talked to users about this, they really like the ability to "confirm" what VS Code was going to do on their machine. They appreciated a message like this below, and they (to my surprise) wanted to click on "learn more" where they expected it to show them exactly what we'd install on their system so that they were confident about the changes VS Code was making to their system / repo.
image

With that, if we were to provide this true zero-notification flow, maybe it warrants an "undo" option of sorts?

@miguelsolorio

I think we can include this dialog when installing the environment, and maybe with a "don't ask again" checkbox?

@miguelsolorio

Scenario 2: Running a notebook and using local environments

This is aimed at users who already have the Python/Jupyter extensions installed and have local environments. We should do our best to show only the environments they are most likely to use, and if we have low certainty, then we show the full kernel list. I'm also proposing showing only the environments they have used in the past, and allowing them to remove those and access them through a "Show more kernels" menu item.
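As a rough sketch of the short-list idea: the function name `build_kernel_picker` is hypothetical, and "low certainty" is modeled here, purely as an assumption, as "none of the discovered kernels has been used before". The real cut line would need a better signal than this.

```python
SHOW_MORE = "Show more kernels"


def build_kernel_picker(all_kernels, used_kernels):
    """Return the entries for the main kernel quick pick.

    all_kernels: every discovered kernel, in display order.
    used_kernels: kernels the user has picked before (a set of names).
    """
    short_list = [kernel for kernel in all_kernels if kernel in used_kernels]
    if not short_list:
        # Assumed stand-in for "low certainty": fall back to the full list.
        return list(all_kernels)
    # Otherwise show only previously used kernels plus the escape hatch.
    return short_list + [SHOW_MORE]
```

Removing a kernel from the short list would then amount to dropping it from `used_kernels`; it stays reachable through the full list.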

CleanShot.2022-08-18.at.11.45.35.mp4

@miguelsolorio

Scenario 3: Running a notebook and using many local environments

This is for when we don't have great certainty that we'll pick the right kernel and directly jump to the "Select local kernel" quick pick. Once the user has selected a kernel, it'll get added to the main kernel quick pick and will also be stored for the workspace when switching to a different notebook. There are other edge cases, such as a user selecting a different kernel in another notebook, that we can discuss.

CleanShot.2022-08-18.at.14.24.46.mp4

@miguelsolorio

In Scenarios 2 & 3, I wonder if we should be more opinionated and default to auto-creating environments as opposed to asking the user. On one hand, reducing the number of actions the user needs to do would be preferred but you also don't want to auto-create environments they didn't expect.

@miguelsolorio

Scenario 4: Running a notebook on a remote jupyter server

This is to make sure we account for the remote server connections and how we show that in this new list. There will always be a "Connect to jupyter server" command from the kernel picker and then once you are connected to that remote, the kernels will be listed in the "Show more kernels" list. Showing multiple remote kernels is TBD.

CleanShot.2022-08-18.at.15.09.25.mp4

@minsa110
Contributor Author

Thank you for making the clean shot videos for these scenarios! I have a few comments / questions.

Scenario 2

  • Is the main kernel quick pick based on the context of the workspace or is this a shared "suggested" list for VS Code in general? IOW, would the list of kernels in the main kernel quick pick change based on which workspace I'm in?
  • If the user was previously connected to a kernel from the workspace, should we just skip all clicks and directly connect them to that kernel (even after a VS Code reload)?
  • I see that the global Python environments are not listed in the "show more" list. Is this intentional?
  • I also see that the Python environment types are not categorized in the "show more" list. Is this intentional?
  • During the sync, we talked about caching the kernel list (for perf sake) and then adding a way to "refresh" the list. I wonder if the "show more" button can double as that "refresh" button?
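To make the caching question in the last bullet concrete, here is a small sketch of a cached kernel list with an explicit refresh. The class name `KernelListCache` and the `discover` callback are hypothetical; the callback stands in for whatever (potentially slow) discovery the extension performs.

```python
class KernelListCache:
    """Caches the result of a slow kernel discovery call."""

    def __init__(self, discover):
        # `discover` is any zero-argument callable returning kernel names.
        self._discover = discover
        self._kernels = None

    def get(self):
        """Return the cached list, running discovery only on first use."""
        if self._kernels is None:
            self.refresh()
        return list(self._kernels)

    def refresh(self):
        """Re-run discovery, replace the cache, and return the new list."""
        self._kernels = list(self._discover())
        return list(self._kernels)
```

If "show more" doubled as the refresh button, it would call `refresh()` while the main quick pick keeps calling `get()`.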

Scenario 3

  • Thinking out loud here. It seems like this is what most users will see when they first try to run notebooks in VS Code. Then, they will subsequently see the main kernel quick pick with the shortened list based on what VS Code knows about the workspace. I wonder if that will cause any confusion? It seems like a "minor" confusion if at all, but still wanted to leave this comment as a food for thought.
  • The initial list makes it hard for the user to find the "connect to a Jupyter server" option, because they'd have to know to click on the "back" arrow. I wonder if we should add that option to the bottom of this list? Or if there is a better way to surface that?
  • We've worked to reduce clicks in the "getting started" effort last iteration, where we'd automate some parts of the user flow where they'd have to install missing extensions. Would that experience flow into this experience where for example: a user would click "Run all" --> they click "Install necessary extensions (Python + Jupyter)" --> we'd immediately show them this list (without requiring the user to click the "Run all" or "Select kernel" button again)?
  • A few other entry points to consider for selecting kernels:
    • Attempting to activate an environment via integrated terminal with a notebook open (e.g. conda activate <env>)
    • Attempting to open a notebook from the integrated terminal (e.g. cd-ing into a folder and running jupyter notebook)

Scenario 4

  • One of the user confusion points was seeing remote & local kernels in the same list. Since the main kernel quick pick is now more of our somewhat strong suggestion / guesstimate, and going off of the assumption that the main kernel quick pick is workspace based (not like an MRU list for VS Code), I wonder if we should only include the remote kernel in the main kernel quick pick?
  • A few other entry points to consider for connecting to a remote server are:
    • Jupyter Server connection indication on the bottom right image
    • Attempting to connect to a remote Jupyter server via integrated terminal (e.g. ssh -L <port on remote server>:localhost:<port to map to> <remote_user>@<remote_host>)

Other food for thought...

  • What would this look like for users who don't have Python and/or a Python env manager installed?

@IanMatthewHuff
Member

@minsa110 I think that #11159 is going to address many of the questions that you asked regarding Scenario 2. I don't think the UX walkthroughs will have the full info on what kernels make the "short-list" cut line, that will be up to us to figure out.

I have to admit that I didn't quite grok Scenario 3 the same way that Scenario 2 made sense to me. I think the main reason might be that the back button wasn't clear to me at that point, just as Soojin wasn't sure about it. Adding the connect button to the bottom might help there.

I would say that based on how we figure out the cut-line for scenario two we could make scenario 3 way less common. If we have nothing above the cut line then it might be good to just "raise up" something like the highest version global python or the "create environment" command to be above the cut line and just do scenario #2.

I'm interested in what scenario 4 might look like with multiple remote connections. At some point we are pushing against the bounds of what we can do with the Quick Pick UI, but wondering about something like this:

  • Local Kernel A
  • Local Kernel B
  • Connect Remote Server

Then connect a remote server

  • Remote Kernel A (server 1)
  • Remote Kernel B (server 1)
  • Show Local Kernels
  • Connect Remote Server

Then connect a second remote server

  • Remote Kernel C (server 2)
  • Remote Kernel D (server 2)
  • Show Server 1 Kernels
  • Show Local Kernels
  • Connect Remote Server

Like the quick pick is always only ever showing one set of kernels, but it's easy to swap between multiple connected servers and local kernel views.
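The three lists above follow one rule, which can be sketched as a function. This is only an illustration of that rule: the name `quick_pick_items` and its parameters are hypothetical, and the item labels are plain strings modeled on the examples rather than real Quick Pick items.

```python
def quick_pick_items(local_kernels, servers, active=None):
    """Build the quick pick entries for one view.

    local_kernels: list of local kernel names.
    servers: dict mapping a connected server name to its kernel names.
    active: name of the server whose kernels are shown, or None for local.
    """
    if active is None:
        items = list(local_kernels)
    else:
        # Tag remote kernels with the server they belong to.
        items = [f"{kernel} ({active})" for kernel in servers[active]]

    # One entry per other connected server to switch the view to it.
    items += [f"Show {name} Kernels" for name in servers if name != active]
    if active is not None:
        items.append("Show Local Kernels")
    # Connecting another server is always available as the last entry.
    items.append("Connect Remote Server")
    return items


if __name__ == "__main__":
    servers = {
        "Server 1": ["Remote Kernel A", "Remote Kernel B"],
        "Server 2": ["Remote Kernel C", "Remote Kernel D"],
    }
    for item in quick_pick_items(["Local Kernel A"], servers, active="Server 2"):
        print(item)
```

The number of non-kernel entries grows by one per connected server, which is where the limits of the Quick Pick UI would start to show.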

@daviddossett daviddossett added the polish Cleanup and polish issue label Dec 6, 2022
@minsa110
Contributor Author

minsa110 commented Aug 4, 2023

Closing in favor of the changes we made, grouping the kernel types by categories

@minsa110 minsa110 closed this as completed Aug 4, 2023
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Sep 19, 2023