Improve inference loop on GPU devices #7896
Conversation
Codecov Report
|          | master | #7896  | +/-    |
|----------|--------|--------|--------|
| Coverage | 89.42% | 89.41% | -0.01% |
| Files    | 458    | 459    | +1     |
| Lines    | 26871  | 26918  | +47    |
| Hits     | 24029  | 24069  | +40    |
| Misses   | 2842   | 2849   | +7     |
... and 2 files with indirect coverage changes.
@rusty1s Any new comments here?
Will merge ASAP.
@akihironitta Could you PTAL?
In the layer-wise inference loop, we perform computations as shown in the following pseudocode:

```
for layer in layers:
    for batch in loader:
        do inference per layer
```

In models with more than one layer, we can benefit from caching batches during the first pass over the data. This PR introduces `CachedLoader`, which transfers batches to a given device and caches them. Additionally, an auxiliary function, `make_batches_cacheable`, is provided, which decorates a `BasicGNN` instance with a custom inference loop.

Selected performance results (obtained on Intel PVC):

```
Speedup:
gcn[2L]+Reddit:          1.53x
gcn[3L]+Reddit:          1.69x
sage[2L]+Reddit:         1.55x
sage[3L]+Reddit:         2.02x
gcn[2L]+ogbn-products:   1.72x
gcn[3L]+ogbn-products:   2.11x
sage[2L]+ogbn-products:  1.83x
sage[3L]+ogbn-products:  2.44x
```

The caching mechanism did not have a significant impact on models with a single layer.

Drawbacks:
- Users should be aware that the caching mechanism requires additional device memory. In our experiments, approximately 1GB was needed for the `Reddit` dataset.

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: rusty1s <matthias.fey@tu-dortmund.de>

The follow-up PR pyg-team#7897 adds a `--cached-loader` option that enables `CachedLoader` in the inference loop.
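To illustrate the caching idea described above, here is a minimal, self-contained sketch of a batch-caching loader. The class name mirrors `CachedLoader` from this PR, but the constructor signature, attributes, and the toy usage at the bottom are assumptions for illustration only, not the actual PyTorch Geometric implementation:

```python
import torch


class CachedLoader:
    # Minimal sketch (assumed signature): wraps any iterable of batches,
    # moves each batch to `device` on the first pass, and replays the
    # cached, already-transferred batches on every later pass.
    def __init__(self, loader, device):
        self.loader = loader
        self.device = device
        self._cache = []
        self._filled = False

    def __iter__(self):
        if self._filled:
            yield from self._cache   # Layers > 0: no host-to-device copies.
            return
        self._cache = []             # (Re)build the cache on the first full pass.
        for batch in self.loader:
            batch = batch.to(self.device)
            self._cache.append(batch)
            yield batch
        self._filled = True


if __name__ == '__main__':
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    plain_batches = [torch.randn(4, 8) for _ in range(3)]  # stand-in for a real loader
    loader = CachedLoader(plain_batches, device)

    # Layer-wise loop from the PR description: every layer after the first
    # iterates over batches that already live on the target device.
    for layer in range(3):
        for batch in loader:
            assert batch.device.type == device.type
```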
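For completeness, a hedged sketch of how the `--cached-loader` flag from pyg-team#7897 and `make_batches_cacheable` might be wired into an inference script. The import path of the helper, its exact signature, and the model and loader hyperparameters are assumptions based on the PR description, not a verified API:

```python
import argparse

import torch
from torch_geometric.datasets import Reddit
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import GraphSAGE

parser = argparse.ArgumentParser()
parser.add_argument('--cached-loader', action='store_true',
                    help='Cache device-side batches across layers during inference')
args = parser.parse_args()

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

dataset = Reddit(root='data/Reddit')
data = dataset[0]
# Full-neighborhood sampling over all nodes, as used for layer-wise inference.
loader = NeighborLoader(data, num_neighbors=[-1], batch_size=1024, shuffle=False)

model = GraphSAGE(dataset.num_features, hidden_channels=256, num_layers=3,
                  out_channels=dataset.num_classes).to(device)

if args.cached_loader:
    # Assumed import location and signature of the helper added by this PR;
    # it is described as decorating a `BasicGNN` instance with a custom,
    # batch-caching layer-wise inference loop.
    from torch_geometric.loader import make_batches_cacheable  # assumption
    model = make_batches_cacheable(model)                      # assumption

model.eval()
with torch.no_grad():
    # Layer-wise inference entry point of `BasicGNN` models.
    out = model.inference(loader, device=device)
```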