Handle errors in `jupyter_client.provisioning.provisioner_base::KernelProvisionerBase.launch_kernel` #960

dchirikov · 2023-07-25T12:34:07Z

I'm currently working on a new custom kernel provisioner, but I'm finding it tricky to handle errors properly. It seems that `KernelProvisionerBase.launch_kernel` should always return a `KernelConnectionInfo` structure (that's what the type hints say). But sometimes, especially when the kernel needs to run on another computer (a remote host), this is not possible because of infrastructure or scheduling issues. In that case the kernel can't start, and no connection info will be returned from the call.

From what I can tell, the only way to let the system know something's gone wrong is to raise a `RuntimeError()` exception. This kind of works, but it also makes a mess, filling JupyterLab's error output (stderr) with tons of scary and confusing text. I'd really prefer just a couple of lines of clean `log.error()` messages in stdout instead, with a copy of them sent to the JupyterLab frontend the user is looking at.

But that's not the biggest problem I'm facing. When I try to shut down JupyterLab, an error appears for the kernel_id of a kernel that doesn't exist. Here's what that looks like:

      File "/w/.tox/py311/lib/python3.11/site-packages/jupyter_client/multikernelmanager.py", line 306, in _async_shutdown_all
        await asyncio.gather(*futs)
      File "/w/.tox/py311/lib/python3.11/site-packages/jupyter_server/services/kernels/kernelmanager.py", line 418, in _async_shutdown_kernel
        self._check_kernel_id(kernel_id)
      File "/w/.tox/py311/lib/python3.11/site-packages/jupyter_server/services/kernels/kernelmanager.py", line 532, in _check_kernel_id
        raise web.HTTPError(404, "Kernel does not exist: %s" % kernel_id)
    tornado.web.HTTPError: HTTP 404: Not Found (Kernel does not exist: 0546b17a-af0b-4599-8473-cdcaa4011254)

This 0546b17a-af0b-4599-8473-cdcaa4011254 is the kernel_id of a kernel that never ran and never returned its connection_info. So I suspect I'm doing something wrong and need to signal a kernel spawning error differently.

Thanks in advance.

kevin-bates · 2023-07-25T14:08:10Z

> From what I can tell, the only way to let the system know something's gone wrong is to raise a `RuntimeError()` exception. This kind of works, but it also makes a mess, filling JupyterLab's error output (stderr) with tons of scary and confusing text. I'd really prefer just a couple of lines of clean `log.error()` messages in stdout instead, with a copy of them sent to the JupyterLab frontend the user is looking at.

Raising exceptions is the way startup errors are expected to be propagated. How those exceptions are displayed to the user is a different matter (and probably something to deal with in the Lab layer).
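To make the "raise an exception" approach concrete, here is a minimal sketch. It assumes a stand-in class rather than a real `KernelProvisionerBase` subclass, so it runs without `jupyter_client` installed; the class name, the `_submit` helper, the `scheduler_available` flag, and the returned connection fields are all hypothetical. The idea is to log one clean line and then raise a `RuntimeError` that chains the underlying cause.

```python
import asyncio
import logging
from typing import Any, Dict, List

log = logging.getLogger("provisioner_sketch")


class SketchRemoteProvisioner:
    """Stand-in with the same launch_kernel shape as a provisioner.

    A real provisioner would subclass KernelProvisionerBase; that is left
    out here so the sketch stays self-contained.
    """

    def __init__(self, scheduler_available: bool = True) -> None:
        # Hypothetical flag that simulates an infra/scheduling outage.
        self.scheduler_available = scheduler_available

    async def _submit(self, cmd: List[str]) -> Dict[str, Any]:
        # Hypothetical helper: pretend to ask a remote scheduler for a kernel.
        if not self.scheduler_available:
            raise ConnectionError("scheduler unreachable")
        # Illustrative connection info only; real values come from the kernel.
        return {"ip": "127.0.0.1", "transport": "tcp", "shell_port": 50001}

    async def launch_kernel(self, cmd: List[str], **kwargs: Any) -> Dict[str, Any]:
        try:
            return await self._submit(cmd)
        except ConnectionError as exc:
            # One short log line, then propagate the failure as an exception.
            log.error("Kernel launch failed: %s", exc)
            raise RuntimeError(f"Kernel launch failed: {exc}") from exc
```

Calling `asyncio.run(SketchRemoteProvisioner(scheduler_available=False).launch_kernel(["python"]))` raises the `RuntimeError`. How the caller renders that exception is outside the provisioner's control.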

> But sometimes, especially when the kernel needs to run on another computer (a remote host), this is not possible because of infrastructure or scheduling issues. In that case the kernel can't start, and no connection info will be returned from the call.

Starting remote kernels introduces multiple layers in which errors (and delays) can occur. Work was done in Jupyter Server on Pending Kernels that you may want to look at. IIRC, they need to be enabled, but provide some better handling for startup delays specifically for these kinds of issues. However, if your failure is a hard failure (and not just due to things taking longer) raising an exception is what is expected. (This was true even prior to provisioners, albeit the "provisioner" was essentially the Popen call.)
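The distinction between a slow startup and a hard failure can be sketched roughly as follows. This is not how Pending Kernels are implemented; it is only an illustration using plain `asyncio`. The `poll_state` callable, the state strings, and the `KernelLaunchFailed` exception are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable


class KernelLaunchFailed(RuntimeError):
    """Hypothetical exception for a launch that cannot succeed."""


async def wait_for_remote_kernel(
    poll_state: Callable[[], Awaitable[str]],
    timeout: float = 60.0,
    interval: float = 0.01,
) -> str:
    """Wait until the remote kernel reports "running", or raise."""

    async def _poll() -> str:
        while True:
            state = await poll_state()
            if state == "running":
                return state
            if state == "failed":
                # Hard failure: stop waiting and raise immediately.
                raise KernelLaunchFailed("remote scheduler reported failure")
            # Anything else is treated as "still starting": keep waiting.
            await asyncio.sleep(interval)

    try:
        return await asyncio.wait_for(_poll(), timeout=timeout)
    except asyncio.TimeoutError as exc:
        # A startup that never completes is eventually treated as a failure.
        raise KernelLaunchFailed(f"kernel not running after {timeout}s") from exc
```

A provisioner could await something like this before returning connection info, so that delays are tolerated up to `timeout` while definite failures surface immediately.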

You might also take a look at the Gateway Provisioner classes. That package provides a `RemoteProvisionerBase` class, as well as a `ContainerProvisionerBase` (if you're working with containers), both of which can be subclassed and supply most of the necessary infrastructure. If you have questions about these, please open an issue in that repo for further discussion.
