
compression benchmark #2742

Merged
merged 9 commits into master from benchmark
Aug 11, 2020

Conversation

suiguoxin
Member

suiguoxin commented Jul 27, 2020

Filter pruning experiments with SimulatedAnnealing, NetAdapt, AutoCompress, L1Filter, L2Filter, and FPGMPruner on CIFAR-10 with ResNet-18, ResNet-50, and VGG-16.

This PR includes the following contents:

  • experiment result presentation & analysis
  • source code and instructions for re-implementation
  • experiment results in JSON format

For ActivationAPoZRankFilterPruner, ActivationMeanRankFilterPruner and AGPPruner, I plan to add them to this benchmark after refactoring.
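For orientation, a single benchmark run boils down to one pass of NNI's pruning API. Below is a minimal sketch with `L1FilterPruner`, assuming the module paths of the NNI version this PR targets; the sparsity value and file names are illustrative, not the benchmark's exact configuration:

```python
from torchvision.models import vgg16
from nni.compression.torch import L1FilterPruner

model = vgg16(num_classes=10)  # stand-in for the CIFAR-10 models used here

# One config entry applies the same target sparsity to every Conv2d layer.
config_list = [{'sparsity': 0.5, 'op_types': ['Conv2d']}]

pruner = L1FilterPruner(model, config_list)
model = pruner.compress()  # compute and apply the weight masks

# ... fine-tune the masked model, then persist the weights and masks:
pruner.export_model(model_path='pruned_vgg16.pth', mask_path='mask_vgg16.pth')
```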

@colorjam
Contributor

colorjam commented Jul 28, 2020

Are the .json files necessary? I suggest merging them into one file.

@@ -0,0 +1,88 @@
To provide an initial insight into the performance of various channel pruning algorithms,
Contributor

I suggest using the terms channel pruning and filter pruning to distinguish the two pruning methods:
channel pruning: prunes input channels
filter pruning: prunes output channels
Our current pruners are implemented as filter pruning; we may support channel pruning in the future.

Member Author

I suggest using the terms channel pruning and filter pruning to distinguish the two pruning methods:
channel pruning: prunes input channels
filter pruning: prunes output channels
Our current pruners are implemented as filter pruning; we may support channel pruning in the future.

Thx, fixed.

print('Speed up model saved to %s' % args.experiment_data_dir)

with open(os.path.join(args.experiment_data_dir, 'performance.json'), 'w+') as f:
with open(os.path.join(args.experiment_data_dir, 'result.json'), 'w+') as f:
Contributor

Can we save all results into one JSON? Maybe use a key to identify different experiments?

Member Author

Can we save all results into one JSON? Maybe use a key to identify different experiments?

Merged into 3 files, one file per dataset/model combination.
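As a sketch of what that merge looks like, each per-pruner result can be keyed inside one file for a dataset/model pair; the helper below is illustrative, not the code in this PR:

```python
import json
import os

def merge_results(result_dir, out_file):
    """Collect individual result JSONs into one file keyed by experiment name."""
    merged = {}
    for name in os.listdir(result_dir):
        if name.endswith('.json'):
            with open(os.path.join(result_dir, name)) as f:
                # e.g. key 'cifar10_vgg16_l1filter' -> that run's metrics
                merged[os.path.splitext(name)[0]] = json.load(f)
    with open(out_file, 'w') as f:
        json.dump(merged, f, indent=4)
```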


CIFAR-10, ResNet50:

![](../../../examples/model_compress/experiment_result/img/performance_comparison_resnet50.png)
Contributor

Would it be easier to read if the X axis were changed to sparsity/FLOPs ratio?

Member Author

I don't see a big difference...

        train(args, model, device, train_loader,
              criterion, optimizer, epoch)
        scheduler.step()
    if args.load_pretrained_model:
Contributor

Do we have the LeNet benchmark result?

Member Author

No

def get_input_size(dataset):
    if dataset == 'mnist':
        input_size = (1, 1, 28, 28)
    elif dataset in ['cifar10', 'imagenet']:
Contributor

Why is the input size for imagenet 32x32? Do we resize the images for imagenet?

Member Author

There was an error here; changed to 256x256. The imagenet experiment is not performed in this PR.
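The corrected helper would then look along these lines (a sketch: the 256x256 shape follows the reply above, the rest mirrors the diff):

```python
def get_input_size(dataset):
    if dataset == 'mnist':
        input_size = (1, 1, 28, 28)
    elif dataset == 'cifar10':
        input_size = (1, 3, 32, 32)
    elif dataset == 'imagenet':
        input_size = (1, 3, 256, 256)
    return input_size
```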

@@ -0,0 +1,88 @@
To provide an initial insight into the performance of various channel pruning algorithms,
Contributor

Better to add a title for this doc, for example "Comparison of Pruning Algorithms".

Member Author

Better to add a title for this doc, for example "Comparison of Pruning Algorithms".

The title 'Comparison of Filter Pruning Algorithms' has been added.

- One-shot pruners: L1Filter, L2Filter, FPGMPruner
- Only **channel pruning** performances are compared here.

For the auto-pruners, `L1FilterPruner` is used as the base algorithm. That is to say, after the sparsity distribution among the layers is decided by the scheduling algorithm, `L1FilterPruner` is used to perform the actual pruning.
Contributor

It is not clear what the auto-pruners are. Pruners with scheduling?

Member Author

It is not clear what the auto-pruners are. Pruners with scheduling?

Yes, fixed
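For concreteness, here is a rough sketch of how a pruner with scheduling wraps the base one-shot algorithm in the NNI API of that time; the evaluator stub and the hyper-parameter values are illustrative assumptions:

```python
import torchvision
from nni.compression.torch import SimulatedAnnealingPruner

model = torchvision.models.resnet18(num_classes=10)

def evaluator(model):
    # Placeholder: the benchmark would return validation accuracy here.
    return 0.0

pruner = SimulatedAnnealingPruner(
    model,
    [{'sparsity': 0.5, 'op_types': ['Conv2d']}],  # overall target sparsity
    evaluator=evaluator,
    base_algo='l1',  # once the per-layer sparsities are scheduled,
)                    # L1FilterPruner performs the actual pruning
model = pruner.compress()
```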

* Pruners:
- These pruners are included:
- Pruners with scheduling : SimulatedAnnealing, NetAdapt, AutoCompress
- One-shot pruners: L1Filter, L2Filter, FPGMPruner
Contributor

Better to mention how each layer's sparsity is set for these one-shot pruners.

Member Author

Better to mention how each layer's sparsity is set for these one-shot pruners.

Added here.
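For reference, a sketch of the two ways a config list can set sparsities for the one-shot pruners (the values and op names are illustrative):

```python
# Uniform: one entry matched to every Conv2d layer, as in this benchmark.
config_list = [{'sparsity': 0.5, 'op_types': ['Conv2d']}]

# Per-layer: name specific operations explicitly instead.
config_list = [
    {'sparsity': 0.2, 'op_names': ['features.0']},
    {'sparsity': 0.6, 'op_names': ['features.14']},
]
```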


CIFAR-10, VGG16:

![](../../../examples/model_compress/experiment_result/img/performance_comparison_vgg16.png)
Contributor

Better to use different marker types (not just different colors) for each pruner.

Member Author

Better to use different marker types (not just different colors) for each pruner.

Fixed

From the experiment results, we draw the following conclusions:

* Given a constraint on the number of parameters, the pruners with scheduling (`AutoCompress`, `SimulatedAnnealing`) perform better than the others when the constraint is strict. However, they have no such advantage in the FLOPs/performance comparison, since only the parameter-count constraint is considered in the optimization process;
* The basic algorithms `L1FilterPruner`, `L2FilterPruner` and `FPGMPruner` perform very similarly in these experiments;
Contributor

Good summary. For this one, can I say that for some model/dataset combinations the one-shot pruners perform similarly to the pruners with scheduling, even though they set the same sparsity for every layer?

Member Author

Good summary. For this one, can I say that for some model/dataset combinations the one-shot pruners perform similarly to the pruners with scheduling, even though they set the same sparsity for every layer?

We cannot say in general that the pruners with scheduling and the basic pruners are similar, because their relative performance differs under different evaluation metrics.


Aren't FLOPs related to just the filter number and channel number (which are actually the same thing)? And the filter number also determines the parameters. How does it not improve FLOPs? I can imagine that some architectures like Xception won't improve, because a filter-number reduction may not affect the channel number. But that doesn't make sense for VGG16. Is it the same situation? Please correct me if I am wrong 😄

Member Author

Aren't FLOPs related to just the filter number and channel number (which are actually the same thing)? And the filter number also determines the parameters. How does it not improve FLOPs? I can imagine that some architectures like Xception won't improve, because a filter-number reduction may not affect the channel number. But that doesn't make sense for VGG16. Is it the same situation? Please correct me if I am wrong 😄

Thanks for your reply. FLOPs also depend on the image resolution, and the input sizes of different layers are different.
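To spell that out: a conv layer's FLOPs scale with its output feature-map area while its parameter count does not, so two pruned models with equal parameter counts can differ widely in FLOPs depending on which layers lost filters. A back-of-the-envelope sketch:

```python
def conv_flops(c_in, c_out, k, h_out, w_out):
    # Multiply-accumulates of a k x k convolution producing an
    # h_out x w_out output feature map (bias terms ignored).
    return c_in * c_out * k * k * h_out * w_out

def conv_params(c_in, c_out, k):
    return c_in * c_out * k * k

# Two layers of a VGG-like net on 32x32 CIFAR-10 inputs:
early = conv_flops(64, 64, 3, 32, 32) / conv_params(64, 64, 3)    # 1024 FLOPs/param
late  = conv_flops(512, 512, 3, 2, 2) / conv_params(512, 512, 3)  # 4 FLOPs/param
# A parameter in the early layer costs 256x the FLOPs of one in the late
# layer, which is why parameter- and FLOPs-constrained rankings diverge.
```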


Oh, thanks for reminding me. Aren't the input/feature-map sizes almost the same no matter which pruner is used? If so, the FLOPs should correspond to the params.


### Implementation Details

* The experiment results are all collected with the default configuration of the pruners in nni.
Contributor

What do you mean by 'default configuration'?

Member Author

What do you mean by 'default configuration'?

Explained.

@@ -22,5 +22,6 @@ For details, please refer to the following tutorials:
Automatic Model Compression <Compressor/AutoCompression>
Model Speedup <Compressor/ModelSpeedup>
Compression Utilities <Compressor/CompressionUtils>
Compression Benchmark <Compressor/Benchmark>
Contributor

This is not the benchmark, it is the benchmark result. So it is better to move it to Use Cases and Solutions, under Performance measurement, comparison and analysis.

Member Author

This is not the benchmark, it is the benchmark result. So it is better to move it to Use Cases and Solutions, under Performance measurement, comparison and analysis.

Exactly, thanks

@@ -0,0 +1,93 @@
import argparse
Contributor

The folder name experiment_result could be changed to comparison_of_pruners.

Member Author

The folder name experiment_result could be changed to comparison_of_pruners.

ok, thx

@suiguoxin
Member Author

Are the .json files necessary? I suggest merging them into one file.

Merged into 3 files, one file per dataset/model combination.

ultmaster merged commit accb40f into microsoft:master Aug 11, 2020
LovPe pushed a commit to LovPe/nni that referenced this pull request Aug 17, 2020
suiguoxin deleted the benchmark branch August 20, 2020 08:45
@lianxintao

lianxintao commented Sep 29, 2020

Hello. To use auto_pruners_torch.py in NNI, many settings need to be passed through argparse.ArgumentParser. Can you provide the lists of settings for pruners such as AutoCompressPruner, L1FilterPruner, etc., so that the experimental results can be reproduced with NNI? Thank you.

@suiguoxin
Member Author

Hello. To use auto_pruners_torch.py in NNI, many settings need to be passed through argparse.ArgumentParser. Can you provide the lists of settings for pruners such as AutoCompressPruner, L1FilterPruner, etc., so that the experimental results can be reproduced with NNI? Thank you.

Thanks for your question. To reproduce the results that we present in nni, just use the default config for each pruner. Please refer to Implementation Details. Note that the performance of your original models (models without pruning) may vary slightly from ours.
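For what it's worth, the script is driven by argparse flags. Judging from the arguments visible in this diff (args.experiment_data_dir, args.load_pretrained_model, a dataset argument), an invocation would look roughly like the following; the flag names here are assumptions to be checked against the argparse setup in auto_pruners_torch.py itself:

```bash
# Hypothetical invocation; verify flag names against the script itself.
python auto_pruners_torch.py \
    --dataset cifar10 \
    --model vgg16 \
    --pruner SimulatedAnnealingPruner \
    --sparsity 0.5 \
    --experiment-data-dir ./experiment_data
```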

@lianxintao

Thank you for your reply, but the results of this experiment are disappointing. I wonder if you have conducted similar experiments on more complex tasks such as object detection or semantic segmentation. Is it possible that the experimental results would be different for more complex tasks?

@QuanluZhang
Contributor

@lianxintao, we agree that the benchmarking results can be further enriched and improved. We will keep improving them. We highly encourage external contributors to contribute more benchmarking results and good-performing compression algorithms.

@lianxintao

@lianxintao, we agree that the benchmarking results can be further enriched and improved. We will keep improving them. We highly encourage external contributors to contribute more benchmarking results and good-performing compression algorithms.
Oh, I see, thanks for your reply.
