[dask] Use client to persist collections #6722
Hi @trivialfis, here's my proposal for solving #6712. I ran the included test with the current master branch and was able to reproduce the
Thanks for the fix! There is a small error in the test.
#6726 will fix the CI issue.
tests/python/test_with_dask.py
Outdated
        asynchronous=True,
        dashboard_address=0) as cluster:
    async with Client(cluster, asynchronous=True) as client:
        X, y, w = generate_array()
generate_array(with_weights=True).
Thanks for the fix. Should I wait for #6726 to get merged to update this?
Yup. Waiting for review.
Codecov Report
@@            Coverage Diff             @@
##           master    #6722      +/-   ##
==========================================
- Coverage   81.55%   81.53%   -0.03%
==========================================
  Files          13       13
  Lines        3719     3769      +50
==========================================
+ Hits         3033     3073      +40
- Misses        686      696      +10
This attempts to solve #6712 by using the `client` object when persisting collections. This ensures that the futures are computed by the `client` passed to `xgb.dask.DaskDMatrix` and allows for several trainings to happen concurrently on different clusters.
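
To illustrate why routing through an explicit client matters, here is a toy model that uses only the standard library. It is not Dask or XGBoost code: the thread pools stand in for clusters, and `make_pool`, `persist_implicit`, and `persist_explicit` are hypothetical helpers invented for this sketch. The assumption is that an implicit persist picks up whichever client happens to be the current default, while an explicit one runs where the caller asked.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

# Hypothetical stand-in for a "current client" global: whichever
# pool was created most recently becomes the implicit default.
_default_pool = None


def make_pool(name):
    """Create a named single-thread pool and register it as the default."""
    global _default_pool
    pool = ThreadPoolExecutor(max_workers=1, thread_name_prefix=name)
    _default_pool = pool
    return pool


def where_am_i():
    """Report which pool's worker thread is executing this task."""
    # Worker threads are named "<prefix>_<index>", e.g. "clusterA_0".
    return threading.current_thread().name.split("_")[0]


def persist_implicit(task):
    """Toy analogue of persisting without a client: uses the default pool."""
    return _default_pool.submit(task)


def persist_explicit(pool, task):
    """Toy analogue of persisting through a given client: uses that pool."""
    return pool.submit(task)


pool_a = make_pool("clusterA")
pool_b = make_pool("clusterB")  # now the implicit default

# The caller intends both tasks for cluster A.
implicit_result = persist_implicit(where_am_i).result()
explicit_result = persist_explicit(pool_a, where_am_i).result()

pool_a.shutdown()
pool_b.shutdown()

print(implicit_result, explicit_result)
```

In this sketch the implicit call lands on the most recently created pool rather than the intended one, which is the kind of cross-cluster mix-up that passing the client explicitly is meant to avoid.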