[Model Compression] auto compression #3631

@@ -0,0 +1,119 @@

Auto Compression with NNI Experiment
====================================

If you want to compress your model but don't know which compression algorithm to choose, don't know what sparsity suits your model, or simply want to try more possibilities, auto compression may help you.
Users choose one or more compression algorithms and define their search space; auto compression then launches an NNI experiment and automatically tries the chosen algorithms with different sparsity values.
Of course, in addition to the sparsity rate, users can also introduce other related parameters into the search space.
If you don't know what a search space is or how to write one, `this <./Tutorial/SearchSpaceSpec.rst>`__ is for your reference.
Using auto compression is similar to launching an NNI experiment from Python.
The main differences are as follows:

* Use a generator to help generate the search space object.
* You need to provide the model to be compressed, and the model should already be pre-trained.
* There is no need to set ``trial_command``; instead, you additionally need to pass an ``auto_compress_module`` as the ``AutoCompressionExperiment`` input.

Generate search space
---------------------

Because nested search spaces are used extensively, we recommend using a generator to configure the search space.
The following is an example: use ``add_config()`` to add a sub-config, then ``dumps()`` the search space dict.

.. code-block:: python

    from nni.algorithms.compression.pytorch.auto_compress import AutoCompressionSearchSpaceGenerator

    generator = AutoCompressionSearchSpaceGenerator()
    generator.add_config('level', [
        {
            "sparsity": {
                "_type": "uniform",
                "_value": [0.01, 0.99]
            },
            'op_types': ['default']
        }
    ])
    generator.add_config('qat', [
        {
            'quant_types': ['weight', 'output'],
            'quant_bits': {
                'weight': 8,
                'output': 8
            },
            'op_types': ['Conv2d', 'Linear']
        }])

    search_space = generator.dumps()

Now we support the following pruners and quantizers:

.. code-block:: python

    PRUNER_DICT = {
        'level': LevelPruner,
        'slim': SlimPruner,
        'l1': L1FilterPruner,
        'l2': L2FilterPruner,
        'fpgm': FPGMPruner,
        'taylorfo': TaylorFOWeightFilterPruner,
        'apoz': ActivationAPoZRankFilterPruner,
        'mean_activation': ActivationMeanRankFilterPruner
    }

    QUANTIZER_DICT = {
        'naive': NaiveQuantizer,
        'qat': QAT_Quantizer,
        'dorefa': DoReFaQuantizer,
        'bnn': BNNQuantizer
    }
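
The dict keys above are the names used in ``add_config()``. For example, to also search over the L1 filter pruner's sparsity on ``Conv2d`` layers, add a sub-config under the ``l1`` key; the example launch script later in this PR does exactly this:

.. code-block:: python

    generator.add_config('l1', [
        {
            "sparsity": {
                "_type": "uniform",
                "_value": [0.01, 0.99]
            },
            'op_types': ['Conv2d']
        }
    ])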

Provide user model for compression
----------------------------------

Users need to inherit ``AbstractAutoCompressionModule`` and override the abstract class functions.

.. code-block:: python

    from nni.algorithms.compression.pytorch.auto_compress import AbstractAutoCompressionModule

    class AutoCompressionModule(AbstractAutoCompressionModule):
        @classmethod
        def model(cls) -> nn.Module:
            ...
            return _model

        @classmethod
        def evaluator(cls) -> Callable[[nn.Module], float]:
            ...
            return _evaluator

Users need to implement at least ``model()`` and ``evaluator()``.
If you use an iterative pruner, you additionally need to implement ``optimizer_factory()``, ``criterion()`` and ``sparsifying_trainer()``.
If you want to finetune the model after compression, you need to implement ``optimizer_factory()``, ``criterion()``, ``post_compress_finetuning_trainer()`` and ``post_compress_finetuning_epochs()``.
``optimizer_factory()`` should return a factory function whose input is an iterable of parameters, i.e. your ``model.parameters()``, and whose output is an optimizer instance.
The two kinds of ``trainer()`` should return a trainer with the signature ``model, optimizer, criterion, current_epoch``.
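
A minimal sketch of these hooks, adapted from the example module in this PR (the SGD learning rate and the choice of ``_train`` are just the example's choices, not requirements):

.. code-block:: python

    @classmethod
    def optimizer_factory(cls) -> Optional[Callable[[Iterable], optim.Optimizer]]:
        def _optimizer_factory(params: Iterable):
            # The factory takes model.parameters() and returns a fresh optimizer.
            return torch.optim.SGD(params, lr=0.01)
        return _optimizer_factory

    @classmethod
    def criterion(cls) -> Optional[Callable]:
        return F.nll_loss

    @classmethod
    def sparsifying_trainer(cls, compress_algorithm_name: str) -> Optional[Callable[[nn.Module, optim.Optimizer, Callable, int], None]]:
        # The returned trainer is called as trainer(model, optimizer, criterion, current_epoch).
        return _train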

The full abstract interface is defined in :githublink:`interface.py <nni/algorithms/compression/pytorch/auto_compress/interface.py>`.
For an example implementation of ``AutoCompressionModule``, see :githublink:`auto_compress_module.py <examples/model_compress/auto_compress/torch/auto_compress_module.py>`.

Launch NNI experiment
---------------------

Launching is similar to launching a regular NNI experiment from Python; the differences are that you do not set ``trial_command``, and that you pass the user-provided ``AutoCompressionModule`` as the ``AutoCompressionExperiment`` input.

.. code-block:: python

    from pathlib import Path
    from nni.algorithms.compression.pytorch.auto_compress import AutoCompressionExperiment

    from auto_compress_module import AutoCompressionModule

    experiment = AutoCompressionExperiment(AutoCompressionModule, 'local')
    experiment.config.experiment_name = 'auto compression torch example'
    experiment.config.trial_concurrency = 1   # number of trials running at the same time
    experiment.config.max_trial_number = 10   # stop after 10 trials
    experiment.config.search_space = search_space
    experiment.config.trial_code_directory = Path(__file__).parent
    experiment.config.tuner.name = 'TPE'
    experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
    experiment.config.training_service.use_active_gpu = True

    experiment.run(8088)

``experiment.run(8088)`` starts the experiment and serves the NNI web UI on port 8088.

Review thread on the experiment config above:

    If I am not mistaken, this feature is for users who want to try different model compression algorithms without much effort. I think some of them would be confused by the experiment config settings if they are not familiar with NNI. Maybe we should tell users what these experiment parameters are, or refer to the related NNI doc which introduces the parameters in detail.

    Good suggestion, trying to refactor and use the original config for less effort.

    Refactored, and now we can use

This file was deleted.

@@ -0,0 +1,129 @@

# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

from typing import Callable, Optional, Iterable

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torchvision import datasets, transforms

from nni.algorithms.compression.pytorch.auto_compress import AbstractAutoCompressionModule

torch.manual_seed(1)

class LeNet(nn.Module):
    def __init__(self):
        super(LeNet, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        x = self.conv2(x)
        x = F.relu(x)
        x = F.max_pool2d(x, 2)
        x = self.dropout1(x)
        x = torch.flatten(x, 1)
        x = self.fc1(x)
        x = F.relu(x)
        x = self.dropout2(x)
        x = self.fc2(x)
        output = F.log_softmax(x, dim=1)
        return output

_use_cuda = torch.cuda.is_available()

_train_kwargs = {'batch_size': 64}
_test_kwargs = {'batch_size': 1000}
if _use_cuda:
    _cuda_kwargs = {'num_workers': 1,
                    'pin_memory': True,
                    'shuffle': True}
    _train_kwargs.update(_cuda_kwargs)
    _test_kwargs.update(_cuda_kwargs)

_transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

_device = torch.device("cuda" if _use_cuda else "cpu")

# Data loaders are created lazily so the MNIST dataset is only loaded inside the trial.
_train_loader = None
_test_loader = None

def _train(model, optimizer, criterion, epoch):
    global _train_loader
    if _train_loader is None:
        dataset = datasets.MNIST('./data', train=True, download=True, transform=_transform)
        _train_loader = torch.utils.data.DataLoader(dataset, **_train_kwargs)
    model.train()
    for data, target in _train_loader:
        data, target = data.to(_device), target.to(_device)
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()

def _test(model):
    global _test_loader
    if _test_loader is None:
        dataset = datasets.MNIST('./data', train=False, transform=_transform)
        _test_loader = torch.utils.data.DataLoader(dataset, **_test_kwargs)
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in _test_loader:
            data, target = data.to(_device), target.to(_device)
            output = model(data)
            test_loss += F.nll_loss(output, target, reduction='sum').item()
            pred = output.argmax(dim=1, keepdim=True)
            correct += pred.eq(target.view_as(pred)).sum().item()
    test_loss /= len(_test_loader.dataset)
    acc = 100 * correct / len(_test_loader.dataset)
    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(_test_loader.dataset), acc))
    return acc

_model = LeNet().to(_device)
_model.load_state_dict(torch.load('mnist_pretrain_lenet.pth'))

class AutoCompressionModule(AbstractAutoCompressionModule):
    @classmethod
    def model(cls) -> nn.Module:
        return _model

    @classmethod
    def evaluator(cls) -> Callable[[nn.Module], float]:
        # The accuracy returned by _test is the metric reported to the tuner.
        return _test

    @classmethod
    def optimizer_factory(cls) -> Optional[Callable[[Iterable], optim.Optimizer]]:
        def _optimizer_factory(params: Iterable):
            return torch.optim.SGD(params, lr=0.01)
        return _optimizer_factory

    @classmethod
    def criterion(cls) -> Optional[Callable]:
        return F.nll_loss

    @classmethod
    def sparsifying_trainer(cls, compress_algorithm_name: str) -> Optional[Callable[[nn.Module, optim.Optimizer, Callable, int], None]]:
        return _train

    @classmethod
    def post_compress_finetuning_trainer(cls, compress_algorithm_name: str) -> Optional[Callable[[nn.Module, optim.Optimizer, Callable, int], None]]:
        return _train

    @classmethod
    def post_compress_finetuning_epochs(cls, compress_algorithm_name: str) -> int:
        return 2
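
The module above loads ``mnist_pretrain_lenet.pth``, but no script in this PR produces that checkpoint. A hypothetical pre-training script could look roughly like the sketch below; the model and training loop mirror the module above (importing from it would fail, since it loads the checkpoint at import time), and the optimizer settings and epoch count are assumptions, not part of the PR.

# Hypothetical pre-training sketch -- not part of this PR.
import torch
import torch.nn.functional as F
from torch import nn, optim
from torchvision import datasets, transforms

class LeNet(nn.Module):  # same architecture as in the module above
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, 3, 1)
        self.conv2 = nn.Conv2d(32, 64, 3, 1)
        self.dropout1 = nn.Dropout2d(0.25)
        self.dropout2 = nn.Dropout2d(0.5)
        self.fc1 = nn.Linear(9216, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = torch.flatten(self.dropout1(x), 1)
        x = self.dropout2(F.relu(self.fc1(x)))
        return F.log_softmax(self.fc2(x), dim=1)

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.1307,), (0.3081,))])
train_loader = torch.utils.data.DataLoader(
    datasets.MNIST('./data', train=True, download=True, transform=transform),
    batch_size=64, shuffle=True)

model = LeNet().to(device)
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)  # assumed settings
model.train()
for epoch in range(5):  # epoch count is an arbitrary choice
    for data, target in train_loader:
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        loss = F.nll_loss(model(data), target)
        loss.backward()
        optimizer.step()

torch.save(model.state_dict(), 'mnist_pretrain_lenet.pth')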

@@ -0,0 +1,50 @@

# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

from pathlib import Path

from nni.algorithms.compression.pytorch.auto_compress import AutoCompressionExperiment, AutoCompressionSearchSpaceGenerator

from auto_compress_module import AutoCompressionModule

generator = AutoCompressionSearchSpaceGenerator()
generator.add_config('level', [
    {
        "sparsity": {
            "_type": "uniform",
            "_value": [0.01, 0.99]
        },
        'op_types': ['default']
    }
])
generator.add_config('l1', [
    {
        "sparsity": {
            "_type": "uniform",
            "_value": [0.01, 0.99]
        },
        'op_types': ['Conv2d']
    }
])
generator.add_config('qat', [
    {
        'quant_types': ['weight', 'output'],
        'quant_bits': {
            'weight': 8,
            'output': 8
        },
        'op_types': ['Conv2d', 'Linear']
    }])
search_space = generator.dumps()

experiment = AutoCompressionExperiment(AutoCompressionModule, 'local')
experiment.config.experiment_name = 'auto compression torch example'
experiment.config.trial_concurrency = 1
experiment.config.max_trial_number = 10
experiment.config.search_space = search_space
experiment.config.trial_code_directory = Path(__file__).parent
experiment.config.tuner.name = 'TPE'
experiment.config.tuner.class_args['optimize_mode'] = 'maximize'
experiment.config.training_service.use_active_gpu = True

experiment.run(8088)

@@ -0,0 +1,6 @@

# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

from .experiment import AutoCompressionExperiment
from .interface import AbstractAutoCompressionModule
from .utils import AutoCompressionSearchSpaceGenerator

Review comment: recommend using a?