Improve inference loop on GPU devices #7896
Conversation
Codecov Report
|          | master | #7896  | +/-    |
|----------|--------|--------|--------|
| Coverage | 89.42% | 89.41% | -0.01% |
| Files    | 458    | 459    | +1     |
| Lines    | 26871  | 26918  | +47    |
| Hits     | 24029  | 24069  | +40    |
| Misses   | 2842   | 2849   | +7     |
... and 2 files with indirect coverage changes.
@rusty1s Any new comments here?
Will merge ASAP.
@akihironitta Could you PTAL?
In the layer-wise inference loop, we perform computations as shown in the following pseudocode:

```
for layer in layers:
    for batch in loader:
        do inference per layer
```

In models with more than one layer, we can benefit from caching batches during the first pass over the data. This PR introduces `CachedLoader`, which transfers batches to a given device and caches them. Additionally, an auxiliary function, `make_batches_cacheable`, is provided, which decorates a `BasicGNN` instance with a custom inference loop.

Selected performance results (obtained on Intel PVC):

```
Speedup:
gcn[2L]+Reddit:          1.53x
gcn[3L]+Reddit:          1.69x
sage[2L]+Reddit:         1.55x
sage[3L]+Reddit:         2.02x
gcn[2L]+ogbn-products:   1.72x
gcn[3L]+ogbn-products:   2.11x
sage[2L]+ogbn-products:  1.83x
sage[3L]+ogbn-products:  2.44x
```

The caching mechanism did not have a significant impact on models with a single layer.

Drawbacks:
- Users should be aware that the caching mechanism requires additional device memory. In our experiments, approximately 1GB was needed for the `Reddit` dataset.

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: rusty1s <matthias.fey@tu-dortmund.de>

The follow-up PR pyg-team#7897 adds a `--cached-loader` option that enables `CachedLoader` in the inference loop.
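To illustrate the caching idea described above, here is a minimal, self-contained sketch of a batch-caching loader. The class name mirrors `CachedLoader` from this PR, but the constructor signature, attributes, and the toy usage at the bottom are assumptions for illustration only, not the actual PyTorch Geometric implementation:

```python
import torch


class CachedLoader:
    # Minimal sketch (assumed signature): wraps any iterable of batches,
    # moves each batch to `device` on the first pass, and replays the
    # cached, already-transferred batches on every later pass.
    def __init__(self, loader, device):
        self.loader = loader
        self.device = device
        self._cache = []
        self._filled = False

    def __iter__(self):
        if self._filled:
            yield from self._cache   # Layers > 0: no host-to-device copies.
            return
        self._cache = []             # (Re)build the cache on the first full pass.
        for batch in self.loader:
            batch = batch.to(self.device)
            self._cache.append(batch)
            yield batch
        self._filled = True


if __name__ == '__main__':
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    plain_batches = [torch.randn(4, 8) for _ in range(3)]  # stand-in for a real loader
    loader = CachedLoader(plain_batches, device)

    # Layer-wise loop from the PR description: every layer after the first
    # iterates over batches that already live on the target device.
    for layer in range(3):
        for batch in loader:
            assert batch.device.type == device.type
```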
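For completeness, a hedged sketch of how the `--cached-loader` flag from pyg-team#7897 and `make_batches_cacheable` might be wired into an inference script. The import path of the helper, its exact signature, and the model and loader hyperparameters are assumptions based on the PR description, not a verified API:

```python
import argparse

import torch
from torch_geometric.datasets import Reddit
from torch_geometric.loader import NeighborLoader
from torch_geometric.nn import GraphSAGE

parser = argparse.ArgumentParser()
parser.add_argument('--cached-loader', action='store_true',
                    help='Cache device-side batches across layers during inference')
args = parser.parse_args()

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

dataset = Reddit(root='data/Reddit')
data = dataset[0]
# Full-neighborhood sampling over all nodes, as used for layer-wise inference.
loader = NeighborLoader(data, num_neighbors=[-1], batch_size=1024, shuffle=False)

model = GraphSAGE(dataset.num_features, hidden_channels=256, num_layers=3,
                  out_channels=dataset.num_classes).to(device)

if args.cached_loader:
    # Assumed import location and signature of the helper added by this PR;
    # it is described as decorating a `BasicGNN` instance with a custom,
    # batch-caching layer-wise inference loop.
    from torch_geometric.loader import make_batches_cacheable  # assumption
    model = make_batches_cacheable(model)                      # assumption

model.eval()
with torch.no_grad():
    # Layer-wise inference entry point of `BasicGNN` models.
    out = model.inference(loader, device=device)
```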