This repository has been archived by the owner on Sep 18, 2024. It is now read-only.
fix pruner bugs and add model compression README #1624
Merged
+336 −235
14 commits
- a116a81 fix builtin pruners bug
- fb73513 use type_as
- 6805816 fix pruner bugs and add model compression README
- 0036534 Merge branch 'v1.1' into v1.1
- ee7ab1d Merge branch 'v1.1' into v1.1 (tanglang96)
- 98a24fb fix example bugs
- 1574421 Merge branch 'v1.1' of https://github.com/tanglang96/nni into v1.1
- 3b14db4 add AutoCompression.md and remove sensitive pruner
- 4158105 fix tf pruner bugs (tanglang96)
- 3f0d034 update overview (tanglang96)
- 17f992e Merge branch 'v1.1' into v1.1 (tanglang96)
- 5fdccee Pruner.md (tanglang96)
- 536aefe Merge branch 'v1.1' of https://github.com/tanglang96/nni into v1.1 (tanglang96)
- ed553a2 Merge branch 'v1.1' into v1.1 (tanglang96)
@@ -1,3 +1,118 @@
# Automatic Model Compression on NNI

It is convenient to implement automatic model compression by combining NNI's compression algorithms with NNI tuners.

## First, model compression with NNI

You can easily compress a model with NNI compression. Take pruning as an example: you can prune a pretrained model with LevelPruner like this:
```python
from nni.compression.torch import LevelPruner
config_list = [{ 'sparsity': 0.8, 'op_types': 'default' }]
pruner = LevelPruner(config_list)
pruner(model)
```

```{ 'sparsity': 0.8, 'op_types': 'default' }``` means that **all layers with weights will be compressed with the same 0.8 sparsity**. When ```pruner(model)``` is called, the model is instrumented with masks; after that you can fine-tune the model as usual, and the **pruned (masked) weights will not be updated**.
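To make this concrete, here is a minimal sketch of what a 0.8-sparsity mask means (an illustration with plain Python lists, not NNI's actual implementation):

```python
# Illustrative sketch (not NNI source): a level pruner derives a binary mask
# by zeroing the smallest-magnitude weights until the target sparsity is met.
def level_mask(weights, sparsity):
    """Return a 0/1 mask that prunes the `sparsity` fraction of weights."""
    k = int(len(weights) * sparsity)          # number of weights to prune
    if k == 0:
        return [1.0] * len(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [1.0 if abs(w) > threshold else 0.0 for w in weights]

weights = [0.5, -0.1, 0.05, 2.0, -0.9, 0.01, 0.3, -0.02, 0.7, 0.04]
mask = level_mask(weights, 0.8)
pruned = [w * m for w, m in zip(weights, mask)]  # masked weights stay at zero
print(sum(mask) / len(mask))  # → 0.2 (only 20% of the weights survive)
```

With a multiplicative mask like this, fine-tuning effectively updates only the surviving weights, since gradients reaching the zeroed entries are multiplied by the mask as well.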
## Then, make this automatic

The previous example manually chose LevelPruner and pruned all layers with the same sparsity. This is clearly sub-optimal, because different layers have different amounts of redundancy. Per-layer sparsity should be tuned carefully to minimize the degradation of model performance, and NNI tuners can do exactly that.

The first thing we need to do is design a search space. Here we use a nested search space that both chooses the pruning algorithm and optimizes per-layer sparsity:
```json | ||
{ | ||
"prune_method": { | ||
"_type": "choice", | ||
"_value": [ | ||
{ | ||
"_name": "agp", | ||
"conv0_sparsity": { | ||
"_type": "uniform", | ||
"_value": [ | ||
0.1, | ||
0.9 | ||
] | ||
}, | ||
"conv1_sparsity": { | ||
"_type": "uniform", | ||
"_value": [ | ||
0.1, | ||
0.9 | ||
] | ||
}, | ||
}, | ||
{ | ||
"_name": "level", | ||
"conv0_sparsity": { | ||
"_type": "uniform", | ||
"_value": [ | ||
0.1, | ||
0.9 | ||
] | ||
}, | ||
"conv1_sparsity": { | ||
"_type": "uniform", | ||
"_value": [ | ||
0.01, | ||
0.9 | ||
] | ||
}, | ||
} | ||
] | ||
} | ||
} | ||
``` | ||
|
||
Then we need to modify our code by a few lines:
```python
import nni
from nni.compression.torch import *

params = nni.get_parameters()
conv0_sparsity = params['prune_method']['conv0_sparsity']
conv1_sparsity = params['prune_method']['conv1_sparsity']
# these raw sparsities should be scaled if you need the total sparsity constrained
config_list_level = [{ 'sparsity': conv0_sparsity, 'op_name': 'conv0' },
                     { 'sparsity': conv1_sparsity, 'op_name': 'conv1' }]
config_list_agp = [{ 'initial_sparsity': 0, 'final_sparsity': conv0_sparsity,
                     'start_epoch': 0, 'end_epoch': 3,
                     'frequency': 1, 'op_name': 'conv0' },
                   { 'initial_sparsity': 0, 'final_sparsity': conv1_sparsity,
                     'start_epoch': 0, 'end_epoch': 3,
                     'frequency': 1, 'op_name': 'conv1' }]
PRUNERS = { 'level': LevelPruner(config_list_level), 'agp': AGP_Pruner(config_list_agp) }
pruner = PRUNERS[params['prune_method']['_name']]  # look up the chosen pruner
pruner(model)
...  # fine-tuning
acc = evaluate(model)  # evaluation
nni.report_final_result(acc)
```
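The comment about scaling raw sparsities can be made concrete. Here is one possible scheme (a hedged sketch; the helper name and the linear rescaling are assumptions, not NNI code) that rescales the tuner's per-layer sparsities so their weighted average hits a global target:

```python
# Sketch (assumption, not NNI code): rescale per-layer sparsities proposed by
# the tuner so the overall fraction of pruned weights equals `target`.
def scale_sparsities(raw_sparsities, weight_counts, target):
    """Rescale sparsities so the weighted average equals `target`."""
    total = sum(weight_counts)
    pruned = sum(s * n for s, n in zip(raw_sparsities, weight_counts))
    factor = target * total / pruned
    # clip at 1.0: a layer cannot be more than fully pruned
    return [min(s * factor, 1.0) for s in raw_sparsities]

# conv0 has 1000 weights, conv1 has 3000; the tuner proposed 0.5 and 0.9
scaled = scale_sparsities([0.5, 0.9], [1000, 3000], target=0.6)
print(scaled)  # → approximately [0.375, 0.675]; weighted average is 0.6
```

The relative proportions chosen by the tuner are preserved while the global budget is enforced, as long as no layer clips at 1.0.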
Finally, we define our experiment so that NNI automatically tunes the pruning method together with the per-layer sparsities:
```yaml
authorName: default
experimentName: Auto_Compression
trialConcurrency: 2
maxExecDuration: 100h
maxTrialNum: 500
#choice: local, remote, pai
trainingServicePlatform: local
#choice: true, false
useAnnotation: false
searchSpacePath: search_space.json
tuner:
  #choice: TPE, Random, Anneal...
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
trial:
  command: bash run_prune.sh
  codeDir: .
  gpuNum: 1
```
@@ -48,7 +48,7 @@ from nni.compression.tensorflow import AGP_Pruner

```diff
 config_list = [{
     'initial_sparsity': 0,
     'final_sparsity': 0.8,
-    'start_epoch': 1,
+    'start_epoch': 0,
     'end_epoch': 10,
     'frequency': 1,
     'op_types': 'default'
```

> **QuanluZhang:** what is the meaning of start_epoch=0, end_epoch=10?
>
> **Author reply:** @QuanluZhang start_epoch=0, end_epoch=10 means pruning starts from epoch 0 and ends at epoch 10. The previous default in the algorithm was start_epoch=1, but we usually start from 0, so I modified them all.
@@ -62,7 +62,7 @@ from nni.compression.torch import AGP_Pruner

```diff
 config_list = [{
     'initial_sparsity': 0,
     'final_sparsity': 0.8,
-    'start_epoch': 1,
+    'start_epoch': 0,
     'end_epoch': 10,
     'frequency': 1,
     'op_types': 'default'
```
@@ -92,41 +92,3 @@ You can view example for more information

The Sensitivity Pruner section below is removed by this PR:

***

## Sensitivity Pruner

In [Learning both Weights and Connections for Efficient Neural Networks](https://arxiv.org/abs/1506.02626), author Song Han et al. provide an algorithm that finds the sensitivity of each layer and sets a per-layer pruning threshold.

> We used the sensitivity results to find each layer's threshold: for example, the smallest threshold was applied to the most sensitive layer, which is the first convolutional layer... The pruning threshold is chosen as a quality parameter multiplied by the standard deviation of a layer's weights

### Usage

You can prune weights step by step and reach a target sparsity with the Sensitivity Pruner using the code below.

Tensorflow code
```python
from nni.compression.tensorflow import SensitivityPruner
config_list = [{ 'sparsity': 0.8, 'op_types': 'default' }]
pruner = SensitivityPruner(config_list)
pruner(tf.get_default_graph())
```
PyTorch code
```python
from nni.compression.torch import SensitivityPruner
config_list = [{ 'sparsity': 0.8, 'op_types': 'default' }]
pruner = SensitivityPruner(config_list)
pruner(model)
```
Like AGP Pruner, you should update mask information every epoch by adding the code below.

Tensorflow code
```python
pruner.update_epoch(epoch, sess)
```
PyTorch code
```python
pruner.update_epoch(epoch)
```
You can view the example for more information.

#### User configuration for Sensitivity Pruner
* **sparsity:** the sparsity that the specified operations will be compressed to

***
@@ -0,0 +1,48 @@
# Run model compression examples

You can run these examples easily. Take torch pruning for example:

```bash
python main_torch_pruner.py
```

This example uses AGP Pruner. Instantiating a pruner requires a user-provided configuration, which can be supplied in two ways:

- By reading ```configure_example.yaml```; this keeps the code clean when your configuration is complicated
- By configuring it directly in your code

In our example, we simply configure model compression in code like this:
```python
configure_list = [{
    'initial_sparsity': 0,
    'final_sparsity': 0.8,
    'start_epoch': 0,
    'end_epoch': 10,
    'frequency': 1,
    'op_types': 'default'
}]
pruner = AGP_Pruner(configure_list)
```
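For intuition, here is an illustrative sketch of the gradual sparsity schedule that the `start_epoch`/`end_epoch`/`final_sparsity` fields describe, following the cubic ramp of Zhu & Gupta's automated gradual pruning; this is an assumption about the schedule, not the NNI source:

```python
# Illustrative reimplementation (an assumption, not NNI code) of the AGP
# sparsity schedule: a cubic ramp from initial to final sparsity.
def agp_sparsity(epoch, initial_sparsity, final_sparsity, start_epoch, end_epoch):
    """Sparsity target for a given epoch under a cubic gradual-pruning ramp."""
    if epoch < start_epoch:
        return initial_sparsity
    if epoch >= end_epoch:
        return final_sparsity
    progress = (epoch - start_epoch) / (end_epoch - start_epoch)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1 - progress) ** 3

# With the config above (0 -> 0.8 over epochs 0..10), sparsity ramps up fast
# early on and flattens near the end:
print(agp_sparsity(0, 0, 0.8, 0, 10))   # → 0.0
print(agp_sparsity(10, 0, 0.8, 0, 10))  # → 0.8
```

The `frequency` field (how often masks are recomputed within the window) is omitted here for simplicity.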
When ```pruner(model)``` is called, your model is injected with masks as embedded operations. For example, where a layer takes a weight as input, we insert an operation between the weight and the layer; this operation takes the weight as input and outputs a new weight with the mask applied. Thus, the masks are applied whenever the computation goes through those operations, and you can fine-tune your model **without** any modifications.
```python
for epoch in range(10):
    # update_epoch makes the pruner aware of epochs, so that it can adjust masks during training
    pruner.update_epoch(epoch)
    print('# Epoch {} #'.format(epoch))
    train(model, device, train_loader, optimizer)
    test(model, device, test_loader)
```
When fine-tuning has finished, the pruned weights are all masked, and you can retrieve the masks like this:

```python
masks = pruner.mask_list
layer_name = xxx  # fill in the name of the layer whose mask you want
mask = masks[layer_name]
```
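Once you have a mask, you can check the sparsity it actually achieves. A small sketch (assuming the mask flattens to 0/1 values; in NNI the masks are torch tensors, so a list stands in here):

```python
# Sketch: the achieved sparsity of a layer is the fraction of zeros in its mask.
def achieved_sparsity(mask):
    """Fraction of weights that the mask zeroes out."""
    flat = list(mask)
    return 1.0 - sum(1 for m in flat if m != 0) / len(flat)

mask = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
print(achieved_sparsity(mask))  # → 0.8
```

Comparing this against the configured ```final_sparsity``` is a quick sanity check that pruning ran over the full schedule.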
@@ -1,7 +1,7 @@
```diff
 AGPruner:
   config:
     -
-      start_epoch: 1
+      start_epoch: 0
       end_epoch: 10
       frequency: 1
       initial_sparsity: 0.05
```
> **Review comment:** the line "We have provided two naive compression algorithms and four popular ones for users, including three pruning algorithms and three quantization algorithms:" should also be updated.