
Introduced CPU performance gaps discussion #6697

Closed
ShvetsKS opened this issue Feb 9, 2021 · 8 comments

Comments

@ShvetsKS
Contributor

ShvetsKS commented Feb 9, 2021

Two performance problems were discovered in recently merged PRs:

  1. Thread-safe prediction: Make prediction thread safe. #6648

     Santander training stage, measured before #6648, with the thread-safe #6648, and on master (#6696 PR):

     | stage | before #6648 | thread-safe #6648 | master (#6696) |
     | --- | --- | --- | --- |
     | full training | 136s | 165s | 140s |
     | PredictRaw | 55s | 78s | 59s |

Would it be possible to have a non-thread-safe PredictDMatrix call to reduce overheads in subsampling cases, and to initialize the buffer only once, on the first training iteration? I'd appreciate any ideas.

  2. Dropped binary format: Drop binary format for memory snapshot. #6513 leads to a stable 10% performance regression, observed on the Higgs1m dataset after this commit (16s vs. 14.3s for full training).
     Before that commit we had the option to keep the old behavior by setting enable-experimental-json-serialization to False, but currently there is no such possibility.
     Could you share your thoughts on the best way to mitigate it?
@ShvetsKS ShvetsKS changed the title from "Introduced cpu performance gaps" to "Introduced CPU performance gaps discussion" Feb 9, 2021
@trivialfis
Member

trivialfis commented Feb 9, 2021

I'm confused: why does thread-safe prediction have any impact on training? I thought we had enabled the prediction cache for both regression and multi-class on CPU?

@trivialfis
Member

If the prediction happens during evaluation, we can use thread-local static storage, I believe.

As for the JSON thing, it occurs at the end of training and is used to release memory.
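A minimal sketch of the thread-local static storage idea mentioned above, assuming a hypothetical per-thread scratch buffer (`FVecScratch` and `ThreadLocalScratch` are illustrative names, not XGBoost's actual API):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Per-thread scratch holding a dense copy of one row's features.
struct FVecScratch {
  std::vector<float> feats;
};

// Each thread gets its own scratch buffer via thread_local: it is
// allocated once per thread and reused across prediction calls,
// avoiding both the data race of a shared buffer and the cost of a
// fresh allocation on every call.
inline FVecScratch& ThreadLocalScratch(std::size_t n_features) {
  thread_local FVecScratch scratch;
  if (scratch.feats.size() < n_features) {
    scratch.feats.resize(n_features);  // grow lazily, never shrink
  }
  return scratch;
}
```

This keeps concurrent callers (e.g. Dask-style parallel prediction over data blocks) safe without serializing them, at the cost of one buffer per live thread.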

@ShvetsKS
Contributor Author

ShvetsKS commented Feb 9, 2021

> I'm confused: why does thread-safe prediction have any impact on training? I thought we had enabled the prediction cache for both regression and multi-class on CPU?

Yes, prediction caching is enabled for regression, binary, and multi-class classification, but only for the subsample == 1 case.

@ShvetsKS
Contributor Author

ShvetsKS commented Feb 9, 2021

> As for the JSON thing, it occurs at the end of training and is used to release memory.

Is it possible to have some alternative to enable-experimental-json-serialization to reduce the observed gaps?

@trivialfis
Member

Thanks for the reply. I think it's possible to change the subsampling implementation to allow caching.

As for the serialization time, I will try to find a way to remove it, either with BSON or with a better way to release memory.

@trivialfis
Member

The thread safety was mostly for the Dask interface, along with some other feature requests. In Dask, prediction is done on each block of data in parallel.

@ShvetsKS
Contributor Author

> The thread safety was mostly for the Dask interface, along with some other feature requests. In Dask, prediction is done on each block of data in parallel.

So, if I understand correctly, we could initialize feat_vecs externally in the training-side function PredictRaw, only once on the 0-th iteration, and propagate it down to PredictDMatrix?
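The proposal above could be sketched roughly as follows: allocate one feature-vector buffer per thread on the first training iteration and pass the pool down to the prediction loop, rather than rebuilding it inside every call. `FeatVecPool` and `PredictRawSketch` are hypothetical illustrations, not XGBoost's real PredictRaw/PredictDMatrix signatures:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Pool of per-thread feature-vector buffers, owned by the training loop.
struct FeatVecPool {
  std::vector<std::vector<float>> per_thread;  // one buffer per thread
  void Init(std::size_t n_threads, std::size_t n_features) {
    if (per_thread.empty()) {  // allocate only on first use (iteration 0)
      per_thread.assign(n_threads, std::vector<float>(n_features));
    }
  }
};

// Training-side caller: Init is a no-op after the 0-th iteration, so
// subsequent iterations reuse the same buffers with no reallocation.
void PredictRawSketch(FeatVecPool* pool, std::size_t n_threads,
                      std::size_t n_features) {
  pool->Init(n_threads, n_features);
  // ... hand pool->per_thread[tid] down to the per-row prediction kernel ...
}
```

Because the pool is owned by the (single-threaded) training driver rather than shared global state, the public thread-safe prediction path is left untouched while training avoids the per-call buffer setup cost.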

@trivialfis
Member

Closed by #7545 .
