This repository has been archived by the owner on Sep 18, 2024. It is now read-only.
fix pruner bugs and add model compression README (#1624)
* fix builtin pruners bug
* use type_as
* fix pruner bugs and add model compression README
* fix example bugs
* add AutoCompression.md and remove sensitive pruner
* fix tf pruner bugs
* update overview
* Pruner.md
1 parent 8f778aa, commit 9d468d2

Showing 11 changed files with 336 additions and 235 deletions.
@@ -1,3 +1,118 @@
# Automatic Model Compression on NNI

It is convenient to implement automatic model compression by combining NNI compression with NNI tuners.

## First, model compression with NNI

You can easily compress a model with NNI compression. Taking pruning as an example, you can prune a pretrained model with LevelPruner like this:

```python
from nni.compression.torch import LevelPruner
config_list = [{ 'sparsity': 0.8, 'op_types': 'default' }]
pruner = LevelPruner(config_list)
pruner(model)
```

```{ 'sparsity': 0.8, 'op_types': 'default' }``` means that **all layers with weights will be compressed with the same 0.8 sparsity**. When ```pruner(model)``` is called, the model is compressed with masks; after that you can fine-tune the model as usual, and the **masked (pruned) weights will not be updated**.
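
As a rough end-to-end sketch of that fine-tuning step (the toy model, random data, and hyperparameters are placeholders, not part of NNI; the pruner calls follow the convention shown above):

```python
import torch
import torch.nn as nn
import torch.optim as optim
from nni.compression.torch import LevelPruner

# Prune a toy model, then fine-tune it with an ordinary training loop.
# The masks keep the pruned weights at zero throughout fine-tuning.
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
pruner = LevelPruner([{ 'sparsity': 0.8, 'op_types': 'default' }])
pruner(model)

optimizer = optim.SGD(model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()
for step in range(100):
    x, y = torch.randn(8, 16), torch.randint(0, 2, (8,))   # placeholder data
    loss = criterion(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```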

## Then, make this automatic

The previous example manually chose LevelPruner and pruned all layers with the same sparsity. This is obviously sub-optimal, because different layers may have different amounts of redundancy. Per-layer sparsity should be carefully tuned to minimize the loss in model performance, and this can be done with NNI tuners.

The first thing we need to do is design a search space. Here we use a nested search space that covers both the choice of pruning algorithm and the per-layer sparsity to optimize.

```json
{
    "prune_method": {
        "_type": "choice",
        "_value": [
            {
                "_name": "agp",
                "conv0_sparsity": {
                    "_type": "uniform",
                    "_value": [0.1, 0.9]
                },
                "conv1_sparsity": {
                    "_type": "uniform",
                    "_value": [0.1, 0.9]
                }
            },
            {
                "_name": "level",
                "conv0_sparsity": {
                    "_type": "uniform",
                    "_value": [0.1, 0.9]
                },
                "conv1_sparsity": {
                    "_type": "uniform",
                    "_value": [0.01, 0.9]
                }
            }
        ]
    }
}
```

Then we need to modify a few lines of our code:

```python
import nni
from nni.compression.torch import LevelPruner, AGP_Pruner

params = nni.get_parameters()
conv0_sparsity = params['prune_method']['conv0_sparsity']
conv1_sparsity = params['prune_method']['conv1_sparsity']
# these raw sparsities should be scaled if you need the total sparsity constrained
config_list_level = [{ 'sparsity': conv0_sparsity, 'op_name': 'conv0' },
                     { 'sparsity': conv1_sparsity, 'op_name': 'conv1' }]
config_list_agp = [{ 'initial_sparsity': 0, 'final_sparsity': conv0_sparsity,
                     'start_epoch': 0, 'end_epoch': 3,
                     'frequency': 1, 'op_name': 'conv0' },
                   { 'initial_sparsity': 0, 'final_sparsity': conv1_sparsity,
                     'start_epoch': 0, 'end_epoch': 3,
                     'frequency': 1, 'op_name': 'conv1' }]
PRUNERS = {'level': LevelPruner(config_list_level), 'agp': AGP_Pruner(config_list_agp)}
pruner = PRUNERS[params['prune_method']['_name']]  # look up the chosen pruner
pruner(model)
... # fine tuning
acc = evaluate(model) # evaluation
nni.report_final_result(acc)
```
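
The scaling mentioned in the comment above could be done roughly like this, continuing from the snippet (a sketch only; the layer parameter counts are hypothetical, and in practice you would take them from the actual model):

```python
# Rescale the raw per-layer sparsities so that the overall (parameter-weighted)
# sparsity matches a desired total sparsity.
target_total_sparsity = 0.5
param_counts = {'conv0': 5000, 'conv1': 25000}   # hypothetical layer sizes
raw = {'conv0': conv0_sparsity, 'conv1': conv1_sparsity}

total_params = sum(param_counts.values())
current = sum(raw[name] * param_counts[name] for name in raw) / total_params
scale = target_total_sparsity / current
scaled = {name: min(raw[name] * scale, 0.95) for name in raw}   # cap below 1.0
```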

Finally, define the experiment so that NNI automatically tunes the pruning method together with the per-layer sparsities:

```yaml
authorName: default
experimentName: Auto_Compression
trialConcurrency: 2
maxExecDuration: 100h
maxTrialNum: 500
#choice: local, remote, pai
trainingServicePlatform: local
#choice: true, false
useAnnotation: False
searchSpacePath: search_space.json
tuner:
  #choice: TPE, Random, Anneal...
  builtinTunerName: TPE
  classArgs:
    #choice: maximize, minimize
    optimize_mode: maximize
trial:
  command: bash run_prune.sh
  codeDir: .
  gpuNum: 1
```
@@ -0,0 +1,48 @@
# Run model compression examples

You can run these examples easily. Take torch pruning for example:

```bash
python main_torch_pruner.py
```

This example uses AGP Pruner. Initiating a pruner requires a user-provided configuration, which can be supplied in two ways:

- By reading ```configure_example.yaml```; this keeps the code clean when your configuration is complicated
- By configuring directly in your code

In our example, we simply configure model compression in our code like this:

```python
from nni.compression.torch import AGP_Pruner

configure_list = [{
    'initial_sparsity': 0,
    'final_sparsity': 0.8,
    'start_epoch': 0,
    'end_epoch': 10,
    'frequency': 1,
    'op_type': 'default'
}]
pruner = AGP_Pruner(configure_list)
```
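
If you prefer the YAML route instead, a minimal sketch could look like the following (this is plain PyYAML loading, not an NNI API; it assumes ```configure_example.yaml``` keeps the configuration under the ```AGPruner.config``` key, as in the example file shown at the bottom of this commit):

```python
import yaml
from nni.compression.torch import AGP_Pruner

# Hypothetical loader: read the pruning configuration from the example YAML file
# instead of hard-coding it, then pass it to the pruner exactly as before.
with open('configure_example.yaml') as f:
    configure_list = yaml.safe_load(f)['AGPruner']['config']
pruner = AGP_Pruner(configure_list)
```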

When ```pruner(model)``` is called, your model is injected with masks as embedded operations. For example, where a layer takes a weight as input, we insert an operation between the weight and the layer; this operation takes the weight as input and outputs a new weight with the mask applied. Thus, the masks are applied whenever the computation goes through those operations, and you can fine-tune your model **without** any modifications.
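
Conceptually, the effect is similar to the following sketch (this is not NNI's actual implementation, just an illustration of masking a layer's weight before every forward pass; the mask here is random and purely hypothetical):

```python
import torch
import torch.nn as nn

layer = nn.Linear(8, 4)
mask = (torch.rand_like(layer.weight) > 0.8).float()   # hypothetical ~80% sparsity mask

def apply_mask(module, inputs):
    # Zero out the pruned weights right before the layer computes its output,
    # so they stay at zero even while the remaining weights are fine-tuned.
    module.weight.data.mul_(mask)

layer.register_forward_pre_hook(apply_mask)
out = layer(torch.randn(2, 8))   # forward pass runs with the masked weight
```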

```python
for epoch in range(10):
    # update_epoch is for pruner to be aware of epochs, so that it could adjust masks during training.
    pruner.update_epoch(epoch)
    print('# Epoch {} #'.format(epoch))
    train(model, device, train_loader, optimizer)
    test(model, device, test_loader)
```

When fine-tuning is finished, the pruned weights are all masked, and you can retrieve the masks like this:

```python
masks = pruner.mask_list
layer_name = xxx
mask = masks[layer_name]
```
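
If the masks are binary tensors with the same shape as the corresponding weights (as the description above suggests), you can check the sparsity actually achieved for a layer like this (a small sketch, not an official NNI utility):

```python
# Fraction of weights in this layer that were pruned (i.e., masked to zero)
sparsity = float((mask == 0).sum()) / mask.numel()
print('{}: {:.1%} of weights pruned'.format(layer_name, sparsity))
```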
@@ -1,7 +1,7 @@
 AGPruner:
   config:
     -
-      start_epoch: 1
+      start_epoch: 0
       end_epoch: 10
       frequency: 1
       initial_sparsity: 0.05