[FEATURE] A smarter way to determine default npartitions in Dask #304

Closed
goodwanghan opened this issue Feb 5, 2022 · 1 comment · Fixed by #301
Labels: dask, enhancement

@goodwanghan
Collaborator

Is your feature request related to a problem? Please describe.

Currently, the default number of partitions is hardcoded to 16 and the number of cores is hardcoded to 2.

Describe the solution you'd like

We can use psutil and multiprocessing to try to get the CPU count, and fall back to a hardcoded value if neither works. It should try psutil first and, if the package is not installed, try multiprocessing. psutil is preferred because it can return the number of physical cores. But I am not sure whether we should make it a requirement.
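A minimal sketch of the fallback order described above, for illustration only. The function name `detect_cpu_count` and the `fallback` parameter are hypothetical and not part of the project; the default of 2 mirrors the currently hardcoded core count.

```python
import multiprocessing


def detect_cpu_count(fallback: int = 2) -> int:
    """Return a CPU count, preferring physical cores when psutil is available.

    Hypothetical helper: the name and the fallback default are assumptions.
    """
    try:
        import psutil  # optional dependency, may not be installed

        # logical=False asks for physical cores; psutil returns None
        # when the value cannot be determined.
        physical = psutil.cpu_count(logical=False)
        if physical:
            return physical
    except ImportError:
        pass

    try:
        # Logical CPU count from the standard library.
        return multiprocessing.cpu_count()
    except NotImplementedError:
        # Neither source worked, so use the hardcoded value.
        return fallback
```

Importing psutil inside the function keeps it an optional dependency, which sidesteps the question of whether to make it a requirement.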

goodwanghan added the enhancement and dask labels on Feb 5, 2022
goodwanghan added this to the 0.6.6 milestone on Feb 5, 2022
goodwanghan linked a pull request on Feb 5, 2022 that will close this issue
@goodwanghan
Collaborator Author

Dask already has CPU_COUNT; see https://github.com/dask/dask/blob/007cbd4e6779c3311af91ba4ecdfe731f23ddf58/dask/system.py#L53

So we will use it directly.
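A rough sketch of what deriving the default from Dask's value could look like. `dask.system.CPU_COUNT` is the constant linked above; the helper `default_npartitions` and its `partitions_per_core` multiplier are assumptions made for illustration, not the project's actual formula. The `ImportError` fallback exists only so the snippet runs without Dask installed.

```python
try:
    # The constant referenced in the comment above.
    from dask.system import CPU_COUNT
except ImportError:
    # Assumption: fall back to the standard library when Dask is absent.
    import os

    CPU_COUNT = os.cpu_count() or 1


def default_npartitions(partitions_per_core: int = 2) -> int:
    """Hypothetical helper: scale the partition count with the CPU count.

    The multiplier is an illustrative assumption, not a project setting.
    """
    return max(1, CPU_COUNT * partitions_per_core)
```

For example, `default_npartitions()` on a machine where `CPU_COUNT` is 8 would give 16 under this assumed multiplier.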
