
dask.distributed.Client() keeps spawning new workers with FileNotFoundError #7515

Closed
FlorisCalkoen opened this issue Jan 31, 2023 · 8 comments

Comments

@FlorisCalkoen

Describe the issue:

While installing some JupyterLab extensions into my JupyterLab mamba environment, I found that dask.distributed.Client() breaks in JupyterLab when older versions of jupyter_server are used. For example, installing voila downgrades jupyter_server to 1.23.5. After that downgrade, Client() keeps spawning new workers that fail with FileNotFoundError.

Minimal Complete Verifiable Example:

Create an environment that fails:

mamba create -n testenv jupyterlab dask distributed voila
mamba activate testenv
jupyter lab

Open a new notebook and run:

from dask.distributed import Client
Client()

Traceback

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/Users/calkoen/mambaforge/envs/testenv/lib/python3.10/multiprocessing/spawn.py", line 116, in spawn_main
    exitcode = _main(fd, parent_sentinel)
  File "/Users/calkoen/mambaforge/envs/testenv/lib/python3.10/multiprocessing/spawn.py", line 125, in _main
    prepare(preparation_data)
  File "/Users/calkoen/mambaforge/envs/testenv/lib/python3.10/multiprocessing/spawn.py", line 236, in prepare
    _fixup_main_from_path(data['init_main_from_path'])
  File "/Users/calkoen/mambaforge/envs/testenv/lib/python3.10/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
    main_content = runpy.run_path(main_path,
  File "/Users/calkoen/mambaforge/envs/testenv/lib/python3.10/runpy.py", line 288, in run_path
    code, fname = _get_code_from_file(run_name, path_name)
  File "/Users/calkoen/mambaforge/envs/testenv/lib/python3.10/runpy.py", line 252, in _get_code_from_file
    with io.open_code(decoded_path) as f:
FileNotFoundError: [Errno 2] No such file or directory: '/Users/calkoen/.dotfiles/mamba/3e21c34c-8a88-4a3a-981c-f9dcd89a3d8c'
2023-01-31 15:52:15,076 - distributed.nanny - WARNING - Restarting worker

Environment:

  • Dask version: 2023.1.1
  • Distributed version: 2023.1.1
  • Voila version: 0.4.0
  • jupyter_server: 1.23.5
  • Python version: 3.11
  • Operating System: MacOS
  • Install method: mamba/conda
@jrbourbeau
Member

Thanks for reporting @FlorisCalkoen. I'm able to reproduce the issue. It looks like there's a problem spawning subprocesses when we attempt to create workers, though I'll admit that based on the error message I have no idea what might be causing it. You pointed to jupyter_server as the culprit -- I'm curious how you reached that conclusion.

FWIW you can configure dask to use other process creation methods. fork is working for me locally:

dask.config.set({"distributed.worker.multiprocessing-method": "fork"})
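
Spelled out, a minimal sketch of using that workaround in a notebook cell (the config needs to be set before the Client is created):

import dask
from dask.distributed import Client

# Tell distributed to fork worker processes instead of spawning them
dask.config.set({"distributed.worker.multiprocessing-method": "fork"})

client = Client()  # workers should now start without hitting the FileNotFoundError loop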

@FlorisCalkoen
Author

FlorisCalkoen commented Feb 2, 2023

Thanks for the workaround!

So I think it's related to jupyter_server because I noticed that this issue happened every time I installed a package that triggered a downgrade of jupyter_server. One way to check this is:

mamba create -n testenv jupyterlab dask distributed
mamba activate testenv
jupyter lab 

Now you can start a client without any problem:

from dask.distributed import Client
Client()

However, when you install a package that triggers a downgrade of jupyter_server, such as voila or jupyter-resource-usage, dask.distributed.Client breaks:

mamba install voila

Similarly, creating the environment with the older jupyter_server pinned directly also fails:

mamba create -n testenv jupyterlab dask distributed jupyter_server=1.23.5

@FlorisCalkoen
Author

FlorisCalkoen commented Feb 2, 2023

Microsoft Planetary Computer has both voila and dask available in its JupyterLab environment. These are the versions:

import voila
import jupyter_server
import dask
import distributed
import jupyter_resource_usage

print(f"voila: {voila.__version__}")
print(f"jupyter_server: {jupyter_server.__version__}")
print(f"jupyter_resource_usage: {jupyter_resource_usage.__version__}")
print(f"dask: {dask.__version__}")
print(f"distributed: {distributed.__version__}")

voila: 0.3.6
jupyter_server: 1.18.1
jupyter_resource_usage: 0.6.2
dask: 2022.8.0
distributed: 2022.8.0

@FlorisCalkoen
Author

@jrbourbeau, you're right that it doesn't seem related to having older versions of jupyter_server. I keep getting this spawning error under different circumstances as well, but I can't find a pattern in the packages / Jupyter extensions installed in the environment. Unfortunately, I no longer have access to my previously working environment, so it's difficult to find out what exactly changed.

@FlorisCalkoen FlorisCalkoen changed the title dask.distributed.Client() keeps spawning new workers with FileNotFoundError when using older jupyter_server versions dask.distributed.Client() keeps spawning new workers with FileNotFoundError Feb 3, 2023
@keewis

keewis commented Feb 10, 2023

This seems at least related to jupyter-server/jupyter_server#1198, and I think we can pin ipykernel to <6.21 to avoid that behavior for now (I've been getting the NameError from that issue, but if I rename the notebook without restarting the Jupyter session I get a FileNotFoundError).
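
For anyone who wants to try that pin in the environment from the example above, something along these lines should work (the exact spec here is just an illustration):

mamba install -n testenv "ipykernel<6.21"

or, when creating the environment from scratch:

mamba create -n testenv jupyterlab dask distributed voila "ipykernel<6.21"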

@jrbourbeau
Member

Thanks for tracking that down @keewis. @FlorisCalkoen does pinning ipykernel<6.21 help?

@FlorisCalkoen
Author

Thank you @keewis and @jrbourbeau; the solution by @keewis works!

@jrbourbeau
Member

@FlorisCalkoen just noting there's a new ipykernel release https://pypi.org/project/ipykernel/6.21.2 that should fix the underlying issue jupyter-server/jupyter_server#1198.
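
For completeness, picking up that fix in an existing environment should be something along the lines of:

mamba update -n testenv ipykernel

or, to be explicit about the version:

mamba install -n testenv "ipykernel>=6.21.2"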

Given there's a workaround and a new upstream ipykernel release, I'm going to close this issue. Thanks all!

jbusecke added a commit to jbusecke/pangeo-docker-images that referenced this issue Feb 14, 2023
Pinning ipykernel to avoid issue raised in dask/distributed#7536 dask/distributed#7515
scottyhq pushed a commit to pangeo-data/pangeo-docker-images that referenced this issue Feb 15, 2023
* Pin ipykernel>6.21.2

Pinning ipykernel to avoid issue raised in dask/distributed#7536 dask/distributed#7515

* [condalock-command] autogenerated conda-lock files

---------

Co-authored-by: pangeo-bot <pangeo-bot@users.noreply.github.com>