This repository has been archived by the owner on Sep 18, 2024. It is now read-only.
Constraint-aware one-shot pruners #2657
Merged
Changes from 109 commits
Commits
111 commits
3515b29
constraint-aware pruner
1c84925
Constrained Structure pruner.
3ed53cb
Constrained pruner.
a415206
Constrained one-shot pruner.
5fa19fc
Constraint aware pruner.
bed63fe
Constrained one-shot pruner.
12c24e5
Constrained one shot pruner.
45e62c2
Constrained-aware one-shot pruner.
aeb8aaf
Update the doc.
70dac7c
reformat the unit test.
5362328
Add test case for constrained-aware pruners.
bd375a4
Remove the unnecessary log function.
9d0fb79
fix pylint errors.
211a047
Add the docs for the constrained pruners.
a11cf48
empty commit
899b6f9
Merge branch 'master' of https://github.com/microsoft/nni into constr…
439426f
Add an accuracy comparsion benchmark for Constrained Pruner.
a08dac6
update
39eadcc
Merge branch 'master' of https://github.com/microsoft/nni into constr…
fab3315
update the benchmark
4bc3c48
Update constrained pruner benchmark.
6c32a7c
update
9923ea1
update
5a35c8e
update
00eb006
fix a bug.
65676ad
update.
50d3468
update
d358cb2
update
174233c
tmp branch
b36589c
update
482e500
update
e5b262e
support imagenet for auto_pruners_torch
2591abc
update
0ee36b8
and a switch for the constrained pruner
19df319
update
b2f03b7
update
e263349
update
fca51bf
update
0fabf61
update
1ac7050
update
e55fe10
bug in the sm pruner
c1f9a45
update
9aa1558
add one more mile stone
e9b39fb
fix a bug caused by the expand and clone
d7cc452
add a constrained switch for the auto compress pruner
9183b93
add support for imagenet
e3226ee
unfinish
eca8577
attention pruner unfinished
ad58382
update
90c1c47
merge from master
12c289e
update
4029b1c
update
f3098fb
update
1fd32f5
update
755ce8b
update
1639867
updata
fe78c59
update
5c54faf
update
f5d4060
update
e5bcd6a
add no dependency
f625f81
use softmax in the attention pruner
e5f3e01
update
fcc984c
update
13d4f38
update
b7bac26
update
b3d1ac9
update
90d1e45
add the unit test.
cf7c936
update
85fc79f
update
b593ba3
update
6682cb3
update
25beb8f
update doc string
649ecfd
update the documentation
fb09b3f
Remove the attention pruner.
da5525a
remove the mobilenet_v2 for cifar10
0920efe
reset the auto_pruners_torch.py
74f4ec4
update the example to the new interface.
45966c9
fix pylint errors
e02fb90
update the example
cf626f8
fix a bug when counting flops
f444ebc
add several new one-shot pruners
a0d1e97
support more one_shot prunersw
59f0fe1
test
68e4563
fix a bug in the original apoz pruner
bddb70f
update
646324a
update
c7ba084
update
7a54cd6
update
c42de2c
update
9b9bb09
update
82f4fdb
update the unit test
8438ff5
update the examples
f9028f5
rm the test_dependency_aware
8afa53a
update
5c6d60e
update
78f3fc6
update the doc
2125e98
update rst
ae62671
Merge branch 'master' of https://github.com/microsoft/nni into constr…
8963a01
update
3691f23
Merge branch 'master' of https://github.com/microsoft/nni into constr…
29029ee
update doc
a8f3f74
update the doc
9bf7667
update doc
b7b7150
update the doc
e68cec0
update
c9c5329
update
ef51a10
update
4acabaa
update
80aec67
add some evaluation results
3a73e30
update
d5bbe48
update the doc
@@ -0,0 +1,107 @@
# Dependency-aware Mode for Filter Pruning

Currently, we have several filter pruning algorithms for convolutional layers: FPGM Pruner, L1Filter Pruner, L2Filter Pruner, Activation APoZ Rank Filter Pruner, Activation Mean Rank Filter Pruner, and Taylor FO On Weight Pruner. These algorithms prune each convolutional layer separately. While pruning a convolutional layer, the algorithm quantifies the importance of each filter based on a specific rule (such as the L1 norm) and prunes the less important filters.

As the [dependency analysis utils](./CompressionUtils.md) show, if the output channels of two convolutional layers (conv1, conv2) are added together, then these two layers have a channel dependency on each other (for more details, please see [Compression Utils](./CompressionUtils.md)). Take the following figure as an example.

![](../../img/mask_conflict.jpg)

Suppose we prune the first 50% of output channels (filters) of conv1 and the last 50% of output channels of conv2. Although both layers have pruned 50% of their filters, the speedup module still needs to pad zeros to align the output channels, so we cannot harvest any speed benefit from the pruning.
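To make the alignment problem concrete, here is a small illustrative sketch in plain Python. The channel counts and pruned indices are made up for illustration; this is not NNI code:

```python
# Two conv layers with 8 output channels whose outputs are added together.
# Each prunes 50% of its channels, but at different indices.

def kept_channels(num_channels, pruned):
    """Indices of the output channels that survive pruning."""
    return set(range(num_channels)) - set(pruned)

num_channels = 8
kept1 = kept_channels(num_channels, [0, 1, 2, 3])  # conv1: prune the first 50%
kept2 = kept_channels(num_channels, [4, 5, 6, 7])  # conv2: prune the last 50%

# To compute conv1(x) + conv2(x), the speedup module must keep every channel
# that survives in EITHER layer and pad zeros for the other layer.
needed = kept1 | kept2
print(len(kept1), len(kept2), len(needed))  # 4 4 8: no channel is actually removed
```

If both layers had pruned the same indices instead, `needed` would shrink to 4 channels and the addition could run on genuinely smaller tensors.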

To better gain the speed benefit of model pruning, we add a dependency-aware mode to the filter pruners. In the dependency-aware mode, the pruner prunes the model based not only on the L1 norm of each filter, but also on the topology of the whole network architecture.

In the dependency-aware mode (`dependency_aware` is set to `True`), the pruner will try to prune the same output channels for the layers that have channel dependencies on each other, as shown in the following figure.

![](../../img/dependency-aware.jpg)

Take the dependency-aware mode of the L1Filter Pruner as an example. For each channel, the pruner calculates the sum of the L1 norms of that channel across all the layers in the dependency set. The number of channels that can actually be pruned from the whole dependency set is determined by the minimum configured sparsity among the layers in the set (denoted by `min_sparsity`). According to the summed L1 norm of each channel, the pruner prunes the same `min_sparsity` fraction of channels for all the layers. Next, the pruner additionally prunes `sparsity` - `min_sparsity` channels for each convolutional layer based on that layer's own per-channel L1 norms. For example, suppose the output channels of `conv1` and `conv2` are added together, and the configured sparsities of `conv1` and `conv2` are 0.3 and 0.2 respectively. In this case, the dependency-aware pruner will:

- First, prune the same 20% of channels for `conv1` and `conv2` according to the summed L1 norms of `conv1` and `conv2`.
- Second, additionally prune 10% of the channels of `conv1` according to the L1 norm of each channel of `conv1`.
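The two steps above can be sketched as follows. This is a simplified illustration under our own assumptions; the helper name and masking details are hypothetical and do not reflect the actual NNI implementation:

```python
import numpy as np

def dependency_aware_l1_masks(weights, sparsities):
    """Illustrative sketch of dependency-aware L1 channel selection.
    `weights`: arrays shaped (out_channels, ...) for the layers in one
    dependency set (all share the same out_channels).
    `sparsities`: the configured sparsity of each layer.
    Returns a boolean keep-mask per layer."""
    num_ch = weights[0].shape[0]
    # per-layer L1 norm of each output channel
    norms = [np.abs(w).reshape(w.shape[0], -1).sum(axis=1) for w in weights]
    min_sparsity = min(sparsities)
    n_common = int(num_ch * min_sparsity)
    # step 1: jointly prune the channels with the smallest summed L1 norm
    common_pruned = np.argsort(np.sum(norms, axis=0))[:n_common]
    masks = []
    for norm, sparsity in zip(norms, sparsities):
        mask = np.ones(num_ch, dtype=bool)
        mask[common_pruned] = False
        # step 2: prune the remaining (sparsity - min_sparsity) share of
        # channels per layer, ranked by that layer's own channel norms
        extra = int(num_ch * sparsity) - n_common
        if extra > 0:
            remaining = np.where(mask)[0]
            mask[remaining[np.argsort(norm[remaining])[:extra]]] = False
        masks.append(mask)
    return masks

# conv1 sparsity 0.3, conv2 sparsity 0.2, 10 shared output channels
conv1_w = np.random.randn(10, 16, 3, 3)
conv2_w = np.random.randn(10, 16, 3, 3)
mask1, mask2 = dependency_aware_l1_masks([conv1_w, conv2_w], [0.3, 0.2])
print(int(mask1.sum()), int(mask2.sum()))  # 7 8 -> 30% / 20% of channels pruned
```

Note that every channel pruned from `conv2` is also pruned from `conv1`, so the 20% that both layers drop is perfectly aligned.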

In addition, for convolutional layers that have more than one filter group, the dependency-aware pruner will also try to prune the same number of channels in each filter group. Overall, the pruner prunes the model according to the L1 norm of each filter while trying to meet the topological constraints (channel dependency, etc.) to improve the final speed gain after the speedup process.
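The equal-per-group behaviour for grouped convolutions can be sketched like this. The helper is a hypothetical illustration, not NNI code:

```python
import numpy as np

def per_group_pruned(norms, groups, sparsity):
    """Prune the same number of lowest-norm output channels inside every
    filter group of a grouped convolution.
    `norms`: the L1 norm of each output channel."""
    group_size = len(norms) // groups
    n_prune = int(group_size * sparsity)
    pruned = []
    for g in range(groups):
        idx = np.arange(g * group_size, (g + 1) * group_size)
        # within this group, drop the channels with the smallest norms
        pruned.extend(idx[np.argsort(norms[idx])[:n_prune]].tolist())
    return sorted(pruned)

# 8 channels in 2 groups at 50% sparsity: 2 channels pruned from each group
print(per_group_pruned(np.arange(8.0), groups=2, sparsity=0.5))  # [0, 1, 4, 5]
```

Pruning the same count per group keeps the grouped convolution well-formed after speedup, since every group must retain an equal number of channels.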

With the dependency-aware mode, the pruner can deliver a better speed gain from model pruning.

## Usage
In this section, we show how to enable the dependency-aware mode for a filter pruner. Currently, only the one-shot pruners, such as FPGM Pruner, L1Filter Pruner, L2Filter Pruner, Activation APoZ Rank Filter Pruner, Activation Mean Rank Filter Pruner, and Taylor FO On Weight Pruner, support the dependency-aware mode.

To enable the dependency-aware mode for `L1FilterPruner`:
```python
import torch
from nni.compression.torch import L1FilterPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['Conv2d'] }]
# dummy_input is necessary for the dependency_aware mode
dummy_input = torch.ones(1, 3, 224, 224).cuda()
pruner = L1FilterPruner(model, config_list, dependency_aware=True, dummy_input=dummy_input)
pruner.compress()
```

To enable the dependency-aware mode for `L2FilterPruner`:
```python
import torch
from nni.compression.torch import L2FilterPruner
config_list = [{ 'sparsity': 0.8, 'op_types': ['Conv2d'] }]
# dummy_input is necessary for the dependency_aware mode
dummy_input = torch.ones(1, 3, 224, 224).cuda()
pruner = L2FilterPruner(model, config_list, dependency_aware=True, dummy_input=dummy_input)
pruner.compress()
```

To enable the dependency-aware mode for `FPGMPruner`:
```python
import torch
from nni.compression.torch import FPGMPruner
config_list = [{
    'sparsity': 0.5,
    'op_types': ['Conv2d']
}]
# dummy_input is necessary for the dependency_aware mode
dummy_input = torch.ones(1, 3, 224, 224).cuda()
pruner = FPGMPruner(model, config_list, dependency_aware=True, dummy_input=dummy_input)
pruner.compress()
```

To enable the dependency-aware mode for `ActivationAPoZRankFilterPruner`:
```python
import torch
from nni.compression.torch import ActivationAPoZRankFilterPruner
config_list = [{
    'sparsity': 0.5,
    'op_types': ['Conv2d']
}]
# dummy_input is necessary for the dependency_aware mode
dummy_input = torch.ones(1, 3, 224, 224).cuda()
pruner = ActivationAPoZRankFilterPruner(model, config_list, statistics_batch_num=1, dependency_aware=True, dummy_input=dummy_input)
pruner.compress()
```

To enable the dependency-aware mode for `ActivationMeanRankFilterPruner`:
```python
import torch
from nni.compression.torch import ActivationMeanRankFilterPruner
config_list = [{
    'sparsity': 0.5,
    'op_types': ['Conv2d']
}]
# dummy_input is necessary for the dependency_aware mode, and
# dummy_input should be on the same device as the model
dummy_input = torch.ones(1, 3, 224, 224).cuda()
pruner = ActivationMeanRankFilterPruner(model, config_list, statistics_batch_num=1, dependency_aware=True, dummy_input=dummy_input)
pruner.compress()
```

To enable the dependency-aware mode for `TaylorFOWeightFilterPruner`:
```python
import torch
from nni.compression.torch import TaylorFOWeightFilterPruner
config_list = [{
    'sparsity': 0.5,
    'op_types': ['Conv2d']
}]
# dummy_input is necessary for the dependency_aware mode
dummy_input = torch.ones(1, 3, 224, 224).cuda()
pruner = TaylorFOWeightFilterPruner(model, config_list, statistics_batch_num=1, dependency_aware=True, dummy_input=dummy_input)
pruner.compress()
```

## Evaluation
In order to compare the performance of the pruner with and without the dependency-aware mode, we use L1FilterPruner to prune Mobilenet_v2 with the dependency-aware mode turned on and off. To simplify the experiment, we use uniform pruning, which means we allocate the same sparsity to all convolutional layers in the model.
We trained a Mobilenet_v2 model on the CIFAR-10 dataset and pruned the model from this pretrained checkpoint. The following figure shows the accuracy and FLOPs of the model pruned by the different pruners.

![](../../img/mobilev2_l1_cifar.jpg)

In the figure, `Dependency-aware` represents the L1FilterPruner with the dependency-aware mode enabled, `L1 Filter` is the normal `L1FilterPruner` without the dependency-aware mode, and `No-Dependency` means the pruner only prunes the layers that have no channel dependency on other layers. As the figure shows, with the dependency-aware mode enabled, the pruner achieves higher accuracy at the same FLOPs.
@@ -0,0 +1,17 @@
#################
Pruning
#################

NNI provides several pruning algorithms that support fine-grained weight pruning and structural filter pruning.
It supports TensorFlow and PyTorch with a unified interface.
To prune their models, users only need to add several lines to their code.
For structural filter pruning, NNI also provides a dependency-aware mode. In the dependency-aware mode, the
filter pruner gets a better speed gain after the speedup.

For details, please refer to the following tutorials:

.. toctree::
   :maxdepth: 2

   Pruners <Compressor/Pruner>
   Dependency Aware Mode <Compressor/DependencyAware>
Review comments:

> we can simplify this example code

> Sure, how about we just keep the example of `L1FilterPruner` and remove the others?

> good point, and above this line
> `pruner = L1FilterPruner(model, config_list, dependency_aware=True, dummy_input=dummy_input)`
> you can use comments to show the usage of other pruners, one comment line for each pruner