Introduced CPU performance gaps discussion #6697
Comments
I'm confused about why thread-safe prediction has any impact on training. I thought we had enabled the prediction cache for both regression and multi-class on CPU?
If the prediction happens during evaluation, I believe we can use thread-local static storage. As for the JSON issue, it occurs at the end of training and is used to release memory.
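To illustrate the thread-local storage idea, here is a minimal Python sketch. It is only an analogy: XGBoost's predictor is written in C++, and `get_scratch` and `predict_row` are hypothetical names invented for this example, not part of any XGBoost API. Each thread lazily allocates its own scratch buffer and reuses it on later calls, so no locking is needed.

```python
import threading

# Hypothetical per-thread scratch space (an assumption for illustration,
# not XGBoost's actual implementation).
_tls = threading.local()


def get_scratch(size):
    """Return this thread's scratch list, allocating only on first use or growth."""
    buf = getattr(_tls, "buf", None)
    if buf is None or len(buf) < size:
        buf = [0.0] * size
        _tls.buf = buf
    return buf


def predict_row(weights, row):
    """Toy linear 'prediction' that writes intermediates into the per-thread buffer."""
    scratch = get_scratch(len(row))
    for i, (w, x) in enumerate(zip(weights, row)):
        scratch[i] = w * x
    return sum(scratch[:len(row)])
```

Because the buffer lives in thread-local storage, concurrent callers never share it, and a thread that predicts repeatedly pays the allocation cost only once.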
Yes, prediction caching is enabled for regression, binary and multiclass classification, but for
Is it possible to have some alternative
Thanks for the reply. I think it's possible to change the subsampling implementation to allow caching. As for the serialization time, I will try to find a way to remove it, either by using BSON or by finding a better way to release memory.
The thread safety was mostly for the dask interface, and also for some other feature requests. In dask, prediction is done on each block of data in parallel.
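The block-parallel pattern described here can be sketched as follows. This is a toy illustration using the standard library rather than dask, and `predict_block` and `predict_in_blocks` are hypothetical helpers, not dask or XGBoost functions. The point is that several threads read the same model at once, so prediction must not mutate shared state.

```python
from concurrent.futures import ThreadPoolExecutor


def predict_block(weights, block):
    """Toy linear predictor: reads the shared weights, writes only its own output list."""
    return [sum(w * x for w, x in zip(weights, row)) for row in block]


def predict_in_blocks(weights, rows, block_size=2, workers=4):
    """Split rows into blocks and predict each block on a worker thread."""
    blocks = [rows[i:i + block_size] for i in range(0, len(rows), block_size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, so block order is preserved.
        results = pool.map(lambda b: predict_block(weights, b), blocks)
        return [p for block_preds in results for p in block_preds]
```

If `predict_block` wrote into a buffer shared by all callers, concurrent blocks could overwrite each other's results, which is the kind of problem thread-safe prediction is meant to rule out.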
So if I understand correctly, we could initialize
Closed by #7545.
Two performance problems were discovered in recently merged PRs:
Would it be possible to have a non-thread-safe `PredictDMatrix` call to reduce overhead in the subsampling case, and to initialize the buffer only once, on the first training iteration? I'd appreciate any ideas.
Previously we could keep the old behavior by setting `enable-experimental-json-serialization` to `False`, but currently there is no such option. Could you share your thoughts on the best way to mitigate this?
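As a rough illustration of the "initialize the buffer only once" request, the sketch below shows a toy boosting loop that allocates its prediction cache lazily on the first iteration and then adds only the newest tree's contribution on each later iteration. `CachedTrainer` and all of its members are hypothetical and written in Python for brevity; this is not how XGBoost's C++ `PredictDMatrix` or its prediction cache is implemented, and it assumes each new tree is evaluated on every row, including rows that subsampling skipped during fitting.

```python
class CachedTrainer:
    """Toy boosting loop with a lazily allocated prediction cache (hypothetical)."""

    def __init__(self, rows):
        self.rows = rows
        self.trees = []        # each "tree" is just a callable: row -> float
        self.cache = None      # allocated on the first training iteration
        self.allocations = 0   # counts how often the buffer was created

    def _ensure_cache(self):
        if self.cache is None:
            self.cache = [0.0] * len(self.rows)
            self.allocations += 1

    def add_tree(self, tree):
        """Append a tree and update cached predictions incrementally."""
        self._ensure_cache()
        self.trees.append(tree)
        # Assumption: the new tree is applied to all rows, so the cache
        # stays valid even if the tree was fit on a subsample.
        for i, row in enumerate(self.rows):
            self.cache[i] += tree(row)

    def predict_from_scratch(self):
        """Reference path: re-evaluate every tree on every row."""
        return [sum(t(row) for t in self.trees) for row in self.rows]
```

The incremental path touches each row once per iteration, while the from-scratch path costs grow with the number of trees, which is where the overhead discussed above comes from.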