[Benchmark] Adding inference benchmark suite #4915
Conversation
Codecov Report
@@            Coverage Diff             @@
##           master    #4915      +/-   ##
==========================================
- Coverage   82.83%   82.82%   -0.02%
==========================================
  Files         331      331
  Lines       18047    18069      +22
==========================================
+ Hits        14949    14965      +16
- Misses       3098     3104       +6
Adding inference folder. Adding an inference benchmark for different hyperparameters and batch sizes. Datasets: reddit, ogbn-products, ogbn-mag. Models: gcn, gat, edge_conv, pna_conv, to_hetero(gat), to_hetero(graphsage), using NeighborLoader. Co-authored-by: kgajdamo, dszwicht
for more information, see https://pre-commit.ci
NeighborLoader takes a long time to initialize, and there is no need to re-create it every time the number of layers changes. So, to speed up the benchmark evaluation, it can be moved to an outer loop.
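A minimal, self-contained sketch of that hoisting (the graph and the numbers here are synthetic placeholders, not the benchmark's actual configuration):

import copy
import torch
from torch_geometric.data import Data
from torch_geometric.loader import NeighborLoader

# Tiny synthetic homogeneous graph, just to make the sketch runnable:
data = Data(x=torch.randn(100, 16),
            edge_index=torch.randint(0, 100, (2, 400)))

# Created once, outside the loop over layer counts:
subgraph_loader = NeighborLoader(
    copy.copy(data),
    num_neighbors=[-1],
    batch_size=32,
    shuffle=False,
)

for num_layers in [2, 3]:  # mirrors args.num_layers in the benchmark
    # Build the model for this depth and run inference here,
    # re-using the same loader instead of re-creating it:
    for batch in subgraph_loader:
        pass  # placeholder for the per-batch inference step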
Thank you! This looks good. I think it would be great to re-use some existing code from torch_geometric.nn.models.basic_gnn (instead of writing new models for every benchmark). Happy to add an "inference-mode" to these models directly within PyG.
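As a rough sketch of that re-use (channel sizes below are arbitrary placeholders standing in for the benchmark's params dict), the existing torch_geometric.nn.models classes can be instantiated directly instead of defining benchmark-specific models:

import torch
from torch_geometric.nn.models import GCN, GAT

# Hypothetical sizes, not taken from the benchmark:
in_channels, hidden_channels, out_channels, num_layers = 128, 64, 47, 3

gcn = GCN(in_channels, hidden_channels, num_layers, out_channels)
gat = GAT(in_channels, hidden_channels, num_layers, out_channels, heads=2)

x = torch.randn(10, in_channels)
edge_index = torch.randint(0, 10, (2, 40))
with torch.no_grad():
    out = gcn(x, edge_index)  # shape: [10, out_channels]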
@@ -169,3 +170,19 @@ def __repr__(self):
        return (f'{self.__class__.__name__}({self.in_channels}, '
                f'{self.out_channels}, towers={self.towers}, '
                f'edge_dim={self.edge_dim})')

    @staticmethod
Can we briefly test this in nn/conv/test_pna_conv?
Done
Thank you very much for the updates! Left some comments, please let us know if you have any questions.
hetero = True if dataset_name == 'ogbn-mag' else False
mask = ('paper', None) if hetero else None
degree = None

data = dataset[0].to(device)
inputs_channels = data.x_dict['paper'].size(
    -1) if hetero else dataset.num_features
Lines 29 and 33-34 seem specific to ogbn-mag (instead of hetero), and will likely break if we add more heterogeneous datasets in the future. Can we condition on if dataset_name == 'ogbn-mag' instead?
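Something along the lines of the following sketch; the attribute names simply mirror the excerpt above, and the surrounding variables (dataset_name, data, dataset) are assumed to be defined as in that excerpt:

if dataset_name == 'ogbn-mag':
    mask = ('paper', None)
    inputs_channels = data.x_dict['paper'].size(-1)
else:
    mask = None
    inputs_channels = dataset.num_features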
Good point. Done
mask = ('paper', None) if hetero else None
degree = None

data = dataset[0].to(device)
Can we return dataset[0] from get_dataset, so this logic can be handled properly for datasets that may override __getitem__?
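For instance, a hedged sketch of what that could look like, assuming a get_dataset() helper similar to the one in benchmark/inference/utils.py (the body shown here is illustrative only, not the PR's actual helper):

from torch_geometric.datasets import Reddit

def get_dataset(name, root):
    if name == 'reddit':
        dataset = Reddit(root + '/Reddit')
    else:
        raise ValueError(f'Dataset {name} not handled in this sketch')
    # Resolve indexing inside the helper, so callers also work for datasets
    # that override __getitem__:
    return dataset, dataset[0]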
Done
subgraph_loader = NeighborLoader(
    copy.copy(data),
    num_neighbors=[-1],
    input_nodes=mask,
    batch_size=batch_size,
    shuffle=False,
    num_workers=args.num_workers,
)
subgraph_loader.data.n_id = torch.arange(data.num_nodes)

for layers in args.num_layers:
    if hetero:
        subgraph_loader = NeighborLoader(
            copy.copy(data),
            num_neighbors=[args.hetero_num_neighbors] * layers,
            input_nodes=mask,
            batch_size=batch_size,
            shuffle=False,
            num_workers=args.num_workers,
        )
Not sure I understand the differences for NeighborLoader between homogeneous and heterogeneous graphs, as well as the interplay between num_neighbors and num_layers. For the purposes of this benchmark, it should be the case that len(num_neighbors) == len(num_layers) (for both homogeneous and heterogeneous graphs). If we want to be able to specify the number of neighbors for each layer, we can consolidate this logic and just have num_neighbors = args.num_neighbors. Does that make sense?
@mananshah99 Maybe I will point out a few things to make the situation clearer:
- num_layers is a benchmark parameter: a list of layer counts to benchmark. Each value is passed to the model to define how many layers to create in basic_gnn and hetero_*, so in the loop we benchmark models with different numbers of layers separately.
- Inference for basic_gnn is performed "layer-wise", while inference for the hetero models is performed "batch-wise". In the layer-wise approach we always work only on the nearest neighborhood (the NeighborLoader's num_neighbors list always has length 1). In the batch-wise approach we need to define num_neighbors as deep as the number of layers.
- The hetero_num_neighbors argument was added only because batch-wise inference for the hetero models currently takes very long; it gives the user/CI an easy way to reduce num_neighbors and run a smaller benchmark.
- Creating loaders takes time, and the loop runs many benchmarks for different setups. To save time for homogeneous workloads, where the loader does not depend on the number of layers, we create the loader before the num_layers loop; for hetero models we need to know the number of layers at the moment the NeighborLoader is created, so we create it inside the num_layers loop.
Let me know if that explains the situation and approaches.
For homogeneous graphs, we do inference via layer-wise computation. As such, I think the code is correct. We should nonetheless document this properly here to avoid confusion.
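To make the layer-wise vs. batch-wise distinction concrete, here is a hedged sketch of the two loader setups; hom_data and hetero_data are placeholder names for the homogeneous Data object (e.g. Reddit) and the ogbn-mag HeteroData object, and the numeric values are not the benchmark's defaults:

import copy
from torch_geometric.loader import NeighborLoader

num_layers = 3     # placeholder, one of args.num_layers
batch_size = 1024  # placeholder, one of args.eval_batch_sizes

# Layer-wise inference (homogeneous models): every layer is evaluated over the
# full graph one 1-hop neighborhood at a time, so a single fan-out entry of -1
# is enough regardless of model depth.
layer_wise_loader = NeighborLoader(
    copy.copy(hom_data),
    num_neighbors=[-1],
    batch_size=batch_size,
    shuffle=False,
)

# Batch-wise inference (hetero models): each mini-batch must be sampled as
# deep as the model, so num_neighbors needs one fan-out value per layer.
batch_wise_loader = NeighborLoader(
    copy.copy(hetero_data),
    num_neighbors=[args.hetero_num_neighbors] * num_layers,
    input_nodes=('paper', None),
    batch_size=batch_size,
    shuffle=False,
)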
default=['edge_conv', 'gat', 'gcn', 'pna_conv', 'rgat',
         'rgcn'], type=str)
Can we automatically obtain this from supported_sets.values()?
The idea of this default parameter is to give the user an easy way to change which benchmark suites are run, either from the CLI or by editing the script directly, which is convenient while checking things. I would personally prefer to keep it written out explicitly as a list; pulling it from supported_sets would not give that flexibility.
WDYT?
I think we would need to get the union of all values. +1
Probably okay to leave it as it is for now :)
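For reference, a hedged sketch of what deriving both defaults from supported_sets could look like; the dictionary contents below are assumed from the datasets and models listed in this PR, not copied from the actual script:

supported_sets = {
    'ogbn-mag': ['rgat', 'rgcn'],
    'ogbn-products': ['edge_conv', 'gat', 'gcn', 'pna_conv'],
    'reddit': ['edge_conv', 'gat', 'gcn', 'pna_conv'],
}

# Union of all model names across datasets:
default_models = sorted({m for models in supported_sets.values() for m in models})
# All dataset names:
default_datasets = sorted(supported_sets)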
argparser.add_argument('--datasets', nargs='+',
                       default=['ogbn-mag', 'ogbn-products',
                                'reddit'], type=str)
Can we automatically obtain this from supported_sets.keys()?
Please look into the comment above.
+1.
argparser.add_argument(
    '--hetero-num-neighbors', default=-1, type=int,
    help='number of neighbors to sample per layer for hetero workloads')
argparser.add_argument('--num-workers', default=2, type=int)
Is this a reasonable default (instead of cpu_count // 2, for example)?
In short: for now it should be good enough for NeighborLoader; we can always change it once we analyze the workloads in more detail and improve the software stack.
In more detail: data loading is tightly coupled with the model inference/training loop, so the "producing" efficiency of the dataloader interacts with the "consuming" efficiency of the model. Empirical experience shows that a given num_workers value can be optimal across a range of CPU core counts; the exact values vary and depend on things like hardware (e.g. single vs. multi socket, memory) and OS/environment settings.
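For completeness, a small sketch of the cpu_count-based alternative mentioned above; the argument name matches the excerpts in this PR, but the heuristic itself is only illustrative:

import os

# Fall back gracefully if os.cpu_count() returns None:
default_workers = max(1, (os.cpu_count() or 4) // 2)
argparser.add_argument('--num-workers', default=default_workers, type=int)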
    '--models', nargs='+',
    default=['edge_conv', 'gat', 'gcn', 'pna_conv', 'rgat',
             'rgcn'], type=str)
argparser.add_argument('--root', default='../../data', type=str)
Can we document this parameter?
Done
argparser.add_argument('--root', default='../../data', type=str)
argparser.add_argument('--eval-batch-sizes', nargs='+',
                       default=[512, 1024, 2048, 4096, 8192], type=int)
argparser.add_argument('--num-layers', nargs='+', default=[2, 3], type=int)
Same as comments above, let's unify this (along with hetero num neighbors) for both homogeneous and heterogeneous graphs.
Please look into the comment above.
I think this looks mostly great. Left a few minor comments.
benchmark/inference/utils.py
Outdated
print(f'Model {name} not supported!')

if name == 'rgat':
    model = model_type(metadata, params['hidden_channels'],
-    model = model_type(metadata, params['hidden_channels'],
+    return model_type(metadata, params['hidden_channels'],
It is recommended to return immediately. The following elif can be converted to if.
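A generic illustration of that early-return pattern (not the benchmark's actual code; the function and strings are made up):

def describe(name):
    if name == 'rgat':
        return 'relational GAT'
    if name == 'rgcn':  # a plain `if` suffices once each branch returns
        return 'relational GCN'
    return f'unknown model: {name}'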
Done
benchmark/inference/utils.py
Outdated

def get_model(name, params, metadata=None):
    try:
        model_type = models_dict[name]
-        model_type = models_dict[name]
+        model_type = models_dict.get(name, None)
to avoid the try/catch? Personal preference though, I guess.
Done
benchmark/inference/utils.py
Outdated

def get_model(name, params, metadata=None):
    try:
        model_type = models_dict[name]
-        model_type = models_dict[name]
+        Model = models_dict[name]
Can we make this uppercase so that it is clear that it returns a class?
Good point - done.
benchmark/inference/utils.py
Outdated
        kwargs['heads'] = params['num_heads']
    model = model_type(params['inputs_channels'],
                       params['hidden_channels'], params['num_layers'],
                       params['output_channels'], **kwargs)
-                       params['output_channels'], **kwargs)
+                       params['output_channels'], heads=params['num_heads'],)
I think this is cleaner than first constructing a dictionary.
Done
    shuffle=False,
    num_workers=args.num_workers,
)
subgraph_loader.data.n_id = torch.arange(data.num_nodes)
Can be dropped IMO.
Done
        shuffle=False,
        num_workers=args.num_workers,
    )
    subgraph_loader.data.n_id = torch.arange(
Drop as well?
Done
Please also add your PR to the CHANGELOG.md :)
Adding an inference benchmark for different num_layers, hidden_channels, and batch_sizes.
Datasets: reddit, ogbn-products, ogbn-mag
Models: gcn, gat, edge_conv, pna_conv, rgat as to_hetero(gat), rgcn as to_hetero(graphsage)
Loader: NeighborLoader
Modes: measure time for graph sampling + inference; pure GNN mode - prepare the batches in advance and measure only model inference time
Co-authored-by: @kgajdamo, @dszwicht