too many file descriptors in select() on windows #686
Comments
Are those processes client processes or worker processes? If a client is gathering data from many different workers, it may end up having too many active connections at once.
These are worker processes (1 thread each). I decided to partition my data more (because I was having some other issues).
@pitrou likely workers. The client normally only communicates with the scheduler. All data to the client is routed through the scheduler because it is common for workers not to be publicly visible.

@jreback what kind of computation are you doing? My guess is that a few of the workers are either serving or requesting data to/from many other workers. On the requesting side we could try to ensure that we only collect from a few workers at once (sketched below); there are some possible performance benefits to this as well for large shuffles. On the serving side I don't know how to make Tornado refuse connections or to ensure that they are cleaned up well.

@jreback are you able to provide a full traceback? It would be interesting to see where this problem arose.
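A minimal sketch of the requesting-side idea above, in Python with asyncio. The `fetch_from_worker` coroutine and the limit value are hypothetical stand-ins, not the actual distributed API; the point is only that a semaphore caps how many peer workers are contacted at once, so a process never holds one open socket per peer.

```python
import asyncio

MAX_CONCURRENT_PEERS = 8  # assumed limit; small relative to 750 workers

async def fetch_from_worker(address, keys):
    # Hypothetical placeholder: a real implementation would open a connection
    # to the worker at `address` and request the data for `keys`.
    await asyncio.sleep(0)
    return {key: None for key in keys}

async def gather_limited(who_has):
    """who_has maps worker address -> list of keys to fetch from it."""
    sem = asyncio.Semaphore(MAX_CONCURRENT_PEERS)

    async def fetch_one(address, keys):
        async with sem:  # at most MAX_CONCURRENT_PEERS connections are open
            return await fetch_from_worker(address, keys)

    chunks = await asyncio.gather(
        *(fetch_one(address, keys) for address, keys in who_has.items())
    )
    data = {}
    for chunk in chunks:
        data.update(chunk)
    return data

# Usage: 750 peers, but never more than 8 sockets open at once.
# asyncio.run(gather_limited({f"tcp://worker-{i}": [f"x-{i}"] for i in range(750)}))
```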
I've filed ContinuumIO/anaconda-issues#1241 |
This is a simple load from a remote source (kind of like S3, but not exactly), which has worked flawlessly for quite a while. The difference now is that I am using 2x the partitions (was 8000, now 16000), and a few more workers.
So the exact same computation worked perfectly when I had 4000 tasks (and 500 cores) instead.
We have significantly reduced the number of open sockets between workers and the scheduler, but there are still at least one or two per worker process. The scheduler must be able to open at least this many files. On Windows this is tricky, because the limit is hard-coded. We have increased this hard-coded limit in the conda-forge and conda defaults recipes to something like two thousand. For the moment I think that this is all we can do without significant changes. Closing.
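A small sketch of what that means in practice, assuming (as in this report) the scheduler runs on Linux: there the per-process descriptor limit can be inspected and raised at runtime, whereas on Windows the select() cap (FD_SETSIZE, 512 in the stock CPython build) is baked in at compile time, which is why the conda recipes rebuild with a larger value. The 4096 target below is an arbitrary example.

```python
import sys

def ensure_fd_limit(minimum=4096):
    """Raise the soft open-file limit on Unix; no-op on Windows."""
    if sys.platform == "win32":
        # Nothing to do at runtime: the select() limit is fixed when the
        # interpreter is compiled (512 in the stock python.org build).
        return
    import resource  # Unix-only module
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft < minimum:
        resource.setrlimit(resource.RLIMIT_NOFILE, (min(minimum, hard), hard))

ensure_fd_limit()  # call before the scheduler starts accepting connections
```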
On Windows (using 750 workers and 12000 tasks, though the scheduler is on Linux), with a stock Python build:
dask - 0.12.0
distributed - 1.14.3
Some processes can error out with:
too many file descriptors in select()
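For anyone who wants to see the failure outside of distributed, here is a rough reproduction sketch (assumptions: Windows, Python 3.5+ for socket.socketpair, and a build whose FD_SETSIZE is 512): handing select() more sockets than the compiled-in limit raises exactly this error.

```python
import select
import socket

sockets = []
try:
    # Open more sockets than FD_SETSIZE (512 in the stock Windows build).
    for _ in range(600):
        a, b = socket.socketpair()
        sockets.extend([a, b])
    # On Windows this raises: ValueError: too many file descriptors in select()
    select.select(sockets, [], [], 0)
except (ValueError, OSError) as exc:
    print("select failed:", exc)
finally:
    for s in sockets:
        s.close()
```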