Merge pull request #233 from microsoft/master
Browse files Browse the repository at this point in the history
merge master
SparkSnail authored Feb 21, 2020
2 parents 3fe117f + 24fa461 commit aa31674
Showing 288 changed files with 23,179 additions and 14,200 deletions.
12 changes: 6 additions & 6 deletions README.md
@@ -25,7 +25,7 @@ The tool manages automated machine learning (AutoML) experiments, **dispatches a
* Researchers and data scientists who want to easily **implement and experiment with new AutoML algorithms**, be it a hyperparameter tuning algorithm, a neural architecture search algorithm, or a model compression algorithm.
* ML Platform owners who want to **support AutoML in their platform**.

- ### **NNI v1.3 has been released! &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>**
+ ### **NNI v1.4 has been released! &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>**

## **NNI capabilities in a glance**
NNI provides a CommandLine Tool as well as a user-friendly WebUI to manage training experiments. With the extensible API, you can customize your own AutoML algorithms and training services. To make it easy for new users, NNI also provides a set of built-in state-of-the-art AutoML algorithms and out-of-the-box support for popular training platforms.
@@ -177,9 +177,9 @@ Within the following table, we summarized the current NNI capabilities, we are g
</td>
<td style="border-top:#FF0000 solid 0px;">
<ul>
<li><a href="docs/en_US/sdk_reference.rst">Python API</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/autotune_ref.html#trial">Python API</a></li>
<li><a href="docs/en_US/Tutorial/AnnotationSpec.md">NNI Annotation</a></li>
<li><a href="docs/en_US/Tutorial/Installation.md">Supported OS</a></li>
<li><a href="https://nni.readthedocs.io/en/latest/installation.html">Supported OS</a></li>
</ul>
</td>
<td style="border-top:#FF0000 solid 0px;">
@@ -216,9 +216,9 @@ Windows
python -m pip install --upgrade nni
```

- If you want to try latest code, please [install NNI](docs/en_US/Tutorial/Installation.md) from source code.
+ If you want to try the latest code, please [install NNI](https://nni.readthedocs.io/en/latest/installation.html) from source code.

- For detail system requirements of NNI, please refer to [here](docs/en_US/Tutorial/Installation.md#system-requirements).
+ For detailed system requirements of NNI, please refer to [here](https://nni.readthedocs.io/en/latest/Tutorial/InstallationLinux.html#system-requirements) for Linux & macOS, and [here](https://nni.readthedocs.io/en/latest/Tutorial/InstallationWin.html#system-requirements) for Windows.

Note:

@@ -233,7 +233,7 @@ The following example is built on TensorFlow 1.x. Make sure **TensorFlow 1.x is
* Download the examples by cloning the source code.

```bash
- git clone -b v1.3 https://github.com/Microsoft/nni.git
+ git clone -b v1.4 https://github.com/Microsoft/nni.git
```

* Run the MNIST example.
8 changes: 4 additions & 4 deletions README_zh_CN.md
@@ -172,9 +172,9 @@ NNI provides a command-line tool and a friendly WebUI to manage training experiments.
</td>
<td style="border-top:#FF0000 solid 0px;">
<ul>
<li><a href="docs/zh_CN/sdk_reference.rst">Python API</a></li>
<li><a href="https://nni.readthedocs.io/zh/latest/autotune_ref.html#trial">Python API</a></li>
<li><a href="docs/zh_CN/Tutorial/AnnotationSpec.md">NNI Annotation</a></li>
<li><a href="docs/zh_CN/Tutorial/Installation.md">支持的操作系统</a></li>
<li><a href="https://nni.readthedocs.io/zh/latest/installation.html">支持的操作系统</a></li>
</ul>
</td>
<td style="border-top:#FF0000 solid 0px;">
@@ -211,9 +211,9 @@ Windows
python -m pip install --upgrade nni
```

- If you want to try the latest code, you can [install NNI](docs/zh_CN/Tutorial/Installation.md) from source code.
+ If you want to try the latest code, please refer to [install NNI](https://nni.readthedocs.io/zh/latest/installation.html) from source code.

- For detailed system requirements of NNI, refer to [here](docs/zh_CN/Tutorial/Installation.md#system-requirements).
+ For NNI system requirements on Linux and macOS, [refer to here](https://nni.readthedocs.io/zh/latest/Tutorial/InstallationLinux.html#system-requirements); for Windows, [refer to here](https://nni.readthedocs.io/zh/latest/Tutorial/InstallationWin.html#system-requirements).

Note:

10 changes: 5 additions & 5 deletions azure-pipelines.yml
@@ -26,8 +26,8 @@ jobs:
yarn eslint
displayName: 'Run eslint'
- script: |
-       python3 -m pip install torch==0.4.1 --user
-       python3 -m pip install torchvision==0.2.1 --user
+       python3 -m pip install torch==1.2.0 --user
+       python3 -m pip install torchvision==0.4.0 --user
python3 -m pip install tensorflow==1.13.1 --user
python3 -m pip install keras==2.1.6 --user
python3 -m pip install gym onnx --user
@@ -91,8 +91,8 @@ jobs:
echo "##vso[task.setvariable variable=PATH]${HOME}/Library/Python/3.7/bin:${PATH}"
displayName: 'Install nni toolkit via source code'
- script: |
-       python3 -m pip install torch==0.4.1 --user
-       python3 -m pip install torchvision==0.2.1 --user
+       python3 -m pip install torch==1.2.0 --user
+       python3 -m pip install torchvision==0.4.0 --user
python3 -m pip install tensorflow==1.13.1 --user
ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)" < /dev/null 2> /dev/null
brew install swig@3
@@ -131,7 +131,7 @@ jobs:
- script: |
python -m pip install scikit-learn==0.20.0 --user
python -m pip install keras==2.1.6 --user
-     python -m pip install https://download.pytorch.org/whl/cu90/torch-0.4.1-cp36-cp36m-win_amd64.whl --user
+     python -m pip install torch===1.2.0 torchvision===0.4.1 -f https://download.pytorch.org/whl/torch_stable.html --user
python -m pip install torchvision --user
python -m pip install tensorflow==1.13.1 --user
displayName: 'Install dependencies'
2 changes: 1 addition & 1 deletion deployment/docker/Dockerfile
@@ -52,7 +52,7 @@ RUN python3 -m pip --no-cache-dir install Keras==2.1.6
# PyTorch
#
RUN python3 -m pip --no-cache-dir install torch==1.2.0
- RUN python3 -m pip install torchvision==0.4.0
+ RUN python3 -m pip install torchvision==0.5.0

#
# sklearn 0.20.0
105 changes: 105 additions & 0 deletions docs/en_US/Compressor/ModelSpeedup.md
@@ -0,0 +1,105 @@
# Speed up Masked Model

*This feature is still in Alpha version.*

## Introduction

Pruning algorithms usually use weight masks to simulate real pruning. Masks can be used
to check the model performance under a specific pruning rate (or sparsity), but they provide
no real speedup. Since speedup is the ultimate goal of model pruning, we provide a tool
that converts a model into a smaller one based on user-provided masks (the masks come from
the pruning algorithms).

There are two types of pruning. Fine-grained pruning does not change the shape of weights or input/output tensors, so a sparse kernel is required to speed up a fine-grained pruned layer. Coarse-grained pruning (e.g., channel pruning) usually changes the shape of weights and input/output tensors; to speed it up, no sparse kernel is needed, as the pruned layer can simply be replaced with a smaller one. Since community support for sparse kernels is limited, we currently only support speedup for coarse-grained pruning and leave fine-grained pruning for future work.
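
As a toy illustration (plain PyTorch, not the NNI API), the two mask granularities look like this for a small convolution weight:

```python
import torch

# Weight of a Conv2d with 4 output channels, 2 input channels, 3x3 kernels.
weight = torch.randn(4, 2, 3, 3)

# Fine-grained mask: same shape as the weight, zeros scattered anywhere.
# The weight shape is unchanged, so a sparse kernel would be needed for speedup.
fine_mask = (torch.rand_like(weight) > 0.5).float()

# Coarse-grained (channel) mask: entire output channels are zeroed.
# Channels 1 and 3 are pruned, so the layer can be replaced by a Conv2d
# with only 2 output channels, yielding a real speedup.
channel_mask = torch.tensor([1., 0., 1., 0.]).view(4, 1, 1, 1).expand_as(weight)
```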

## Design and Implementation

To speed up a model, the pruned layers should be replaced: with a smaller layer for a coarse-grained mask, or with a sparse kernel for a fine-grained mask. A coarse-grained mask usually changes the shape of weights or input/output tensors, so we must do shape inference to check whether other, unpruned layers should also be replaced because of the shape changes. Therefore, our design has two main steps: first, do shape inference to find all the modules that should be replaced; second, replace the modules. The first step requires the topology (i.e., connections) of the model; we use `jit.trace` to obtain the model graph for PyTorch.
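
For reference, obtaining the traced graph looks roughly like this (a sketch; `model` and `dummy_input` are as in the usage example below):

```python
import torch

# Trace the model with a dummy input to record the executed graph.
# The traced graph exposes the topology (node connections) that the
# shape inference step walks through.
traced = torch.jit.trace(model, dummy_input)
print(traced.graph)  # inspect the nodes and their input/output edges
```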

For each module, we should prepare four functions: three for shape inference and one for module replacement. The three shape inference functions are: given the weight shape, infer the input/output shapes; given the input shape, infer the weight/output shapes; and given the output shape, infer the weight/input shapes. The module replacement function returns a newly created module that is smaller.
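
A minimal sketch of what such functions could look like for `Conv2d` (hypothetical names and signatures, not the actual NNI interface):

```python
import torch.nn as nn

def conv2d_infer_from_weight(weight_shape):
    """Shape inference: given a pruned weight shape
    (out_channels, in_channels, kH, kW), infer the channel counts
    of the input and output tensors."""
    out_channels, in_channels = weight_shape[0], weight_shape[1]
    return in_channels, out_channels

def conv2d_replace(conv, in_channels, out_channels):
    """Module replacement: build a smaller Conv2d once shape
    inference has decided how many channels survive pruning."""
    return nn.Conv2d(in_channels, out_channels,
                     kernel_size=conv.kernel_size,
                     stride=conv.stride,
                     padding=conv.padding,
                     bias=conv.bias is not None)
```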

## Usage

```python
import time
from nni.compression.speedup.torch import ModelSpeedup

# model: the model you want to speed up
# dummy_input: dummy input of the model, given to `jit.trace`
# masks_file: the mask file created by pruning algorithms
# device: the device the model and input live on
m_speedup = ModelSpeedup(model, dummy_input.to(device), masks_file)
m_speedup.speedup_model()
dummy_input = dummy_input.to(device)
start = time.time()
out = model(dummy_input)
print('elapsed time: ', time.time() - start)
```
For complete examples, please refer to [the code](https://github.com/microsoft/nni/tree/master/examples/model_compress/model_speedup.py).

NOTE: The current implementation only works with torch 1.3.1 and torchvision 0.4.2.

## Limitations

Since every module requires four functions for shape inference and module replacement, implementing them all is a large amount of work, so we have only implemented the ones required by the examples. If you want to speed up your own model and it is not supported by the current implementation, you are welcome to contribute.

For PyTorch, we can only replace modules; if functions in `forward` need to be replaced, our current implementation does not work. One workaround is to make the function a PyTorch module.
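
For example, the workaround could look like this (a sketch using a bare `F.relu` call as the function to be wrapped):

```python
import torch.nn as nn
import torch.nn.functional as F

class Before(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3)

    def forward(self, x):
        # F.relu is a bare function call: the speedup tool cannot replace it.
        return F.relu(self.conv(x))

class After(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 16, 3)
        self.relu = nn.ReLU()  # wrapped as a module, visible to the speedup tool

    def forward(self, x):
        return self.relu(self.conv(x))
```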

## Speedup Results of Examples

The code of these experiments can be found [here](https://github.com/microsoft/nni/tree/master/examples/model_compress/model_speedup.py).

### slim pruner example

On one V100 GPU, with input tensor `torch.randn(64, 3, 32, 32)`:

| Times | Mask Latency (s) | Speedup Latency (s) |
|---|---|---|
| 1 | 0.01197 | 0.005107 |
| 2 | 0.02019 | 0.008769 |
| 4 | 0.02733 | 0.014809 |
| 8 | 0.04310 | 0.027441 |
| 16 | 0.07731 | 0.05008 |
| 32 | 0.14464 | 0.10027 |

### fpgm pruner example

On CPU, with input tensor `torch.randn(64, 1, 28, 28)`. Note that the variance across runs is large:

| Times | Mask Latency (s) | Speedup Latency (s) |
|---|---|---|
| 1 | 0.01383 | 0.01839 |
| 2 | 0.01167 | 0.003558 |
| 4 | 0.01636 | 0.01088 |
| 40 | 0.14412 | 0.08268 |
| 40 | 1.29385 | 0.14408 |
| 40 | 0.41035 | 0.46162 |
| 400 | 6.29020 | 5.82143 |

### l1filter pruner example

On one V100 GPU, with input tensor `torch.randn(64, 3, 32, 32)`:

| Times | Mask Latency (s) | Speedup Latency (s) |
|---|---|---|
| 1 | 0.01026 | 0.003677 |
| 2 | 0.01657 | 0.008161 |
| 4 | 0.02458 | 0.020018 |
| 8 | 0.03498 | 0.025504 |
| 16 | 0.06757 | 0.047523 |
| 32 | 0.10487 | 0.086442 |

### APoZ pruner example

On one V100 GPU, with input tensor `torch.randn(64, 3, 32, 32)`:

| Times | Mask Latency (s) | Speedup Latency (s) |
|---|---|---|
| 1 | 0.01389 | 0.004208 |
| 2 | 0.01628 | 0.008310 |
| 4 | 0.02521 | 0.014008 |
| 8 | 0.03386 | 0.023923 |
| 16 | 0.06042 | 0.046183 |
| 32 | 0.12421 | 0.087113 |
10 changes: 5 additions & 5 deletions docs/en_US/Compressor/Overview.md
@@ -1,7 +1,7 @@
# Model Compression with NNI
As larger neural networks with more layers and nodes are considered, reducing their storage and computational cost becomes critical, especially for some real-time applications. Model compression can be used to address this problem.

- We are glad to announce the alpha release for model compression toolkit on top of NNI, it's still in the experiment phase which might evolve based on usage feedback. We'd like to invite you to use, feedback and even contribute.
+ We are glad to introduce the model compression toolkit on top of NNI; it is still in the experimental phase and may evolve based on usage feedback. We'd like to invite you to use it, give feedback, and even contribute.

NNI provides an easy-to-use toolkit to help users design and use compression algorithms. It currently supports PyTorch with a unified interface. To compress a model, users only need to add several lines to their code. Some popular model compression algorithms are built into NNI. Users can further use NNI's auto-tuning power to find the best compressed model, which is detailed in [Auto Model Compression](./AutoCompression.md). On the other hand, users can easily customize their own new compression algorithms using NNI's interface; refer to the tutorial [here](#customize-new-compression-algorithms).

@@ -335,9 +335,9 @@ class YourQuantizer(Quantizer):
If you do not customize `QuantGrad`, the default backward is the Straight-Through Estimator.
_Coming Soon_ ...
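
For intuition, here is a conceptual sketch of the Straight-Through Estimator (not NNI's actual `QuantGrad` code): the forward pass quantizes, while the backward pass treats quantization as the identity.

```python
import torch

class RoundSTE(torch.autograd.Function):
    """Round in the forward pass; pass the gradient straight through
    in the backward pass, as if rounding were the identity function."""

    @staticmethod
    def forward(ctx, x):
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_output):
        return grad_output  # straight-through: treat d(round(x))/dx as 1
```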

- ## **Reference and Feedback**
+ ## Reference and Feedback
* To [report a bug](https://github.com/microsoft/nni/issues/new?template=bug-report.md) for this feature in GitHub;
* To [file a feature or improvement request](https://github.com/microsoft/nni/issues/new?template=enhancement.md) for this feature in GitHub;
- * To know more about [Feature Engineering with NNI](https://github.com/microsoft/nni/blob/master/docs/en_US/FeatureEngineering/Overview.md);
- * To know more about [NAS with NNI](https://github.com/microsoft/nni/blob/master/docs/en_US/NAS/Overview.md);
- * To know more about [Hyperparameter Tuning with NNI](https://github.com/microsoft/nni/blob/master/docs/en_US/Tuner/BuiltinTuner.md);
+ * To know more about [Feature Engineering with NNI](../FeatureEngineering/Overview.md);
+ * To know more about [NAS with NNI](../NAS/Overview.md);
+ * To know more about [Hyperparameter Tuning with NNI](../Tuner/BuiltinTuner.md);
46 changes: 46 additions & 0 deletions docs/en_US/Compressor/QuickStart.md
@@ -0,0 +1,46 @@
# Quick Start to Compress a Model

NNI provides very simple APIs for compressing a model. Compression includes pruning algorithms and quantization algorithms, and their usage is the same, so here we use the slim pruner as an example to show the usage. The complete code of this example can be found [here](https://github.com/microsoft/nni/blob/master/examples/model_compress/slim_torch_cifar10.py).

## Write configuration

Write a configuration to specify the layers that you want to prune. The following configuration prunes all `BatchNorm2d` layers to sparsity 0.7 while keeping other layers unpruned.

```python
configure_list = [{
'sparsity': 0.7,
'op_types': ['BatchNorm2d'],
}]
```

The specification of the configuration can be found [here](Overview.md#user-configuration-for-a-compression-algorithm). Note that different pruners may have their own defined fields in the configuration, for example `start_epoch` in the AGP pruner. Please refer to each pruner's [usage](Overview.md#supported-algorithms) for details, and adjust the configuration accordingly.
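
For instance, an AGP pruner configuration could look like the sketch below (field values are illustrative; consult the AGP pruner documentation for the exact schema):

```python
configure_list = [{
    'initial_sparsity': 0.0,   # sparsity at the start of pruning
    'final_sparsity': 0.8,     # target sparsity to reach
    'start_epoch': 0,          # epoch at which pruning begins
    'end_epoch': 10,           # epoch by which final sparsity is reached
    'frequency': 1,            # update the masks every epoch
    'op_types': ['default'],
}]
```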

## Choose a compression algorithm

Choose a pruner to prune your model. First instantiate the chosen pruner with your model and configuration as arguments, then invoke `compress()` to compress your model.

```python
pruner = SlimPruner(model, configure_list)
model = pruner.compress()
```

Then, you can train your model using a traditional training approach (e.g., SGD); pruning is applied transparently during the training. Some pruners prune once at the beginning, and the subsequent training can be seen as fine-tuning. Other pruners prune your model iteratively, adjusting the masks epoch by epoch during training, as sketched below.
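
A sketch of such a training loop (assuming a `train` function and `optimizer` you already have; `update_epoch` notifies iterative pruners that the epoch advanced):

```python
for epoch in range(10):
    # Let iterative pruners (e.g., AGP) adjust their masks
    # according to the pruning schedule.
    pruner.update_epoch(epoch)
    train(model, optimizer)  # your usual one-epoch training step
```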

## Export compression result

After training, you get the accuracy of the pruned model. You can export the model weights to a file, and the generated masks to a file as well. Exporting an ONNX model is also supported.

```python
pruner.export_model(model_path='pruned_vgg19_cifar10.pth', mask_path='mask_vgg19_cifar10.pth')
```

## Speed up the model

Masks do not provide a real speedup of your model. The model should be sped up based on the exported masks; thus, we provide an API to speed up your model, as shown below. After invoking `apply_compression_results` on your model, your model becomes smaller, with shorter inference latency.

```python
from nni.compression.torch import apply_compression_results
apply_compression_results(model, 'mask_vgg19_cifar10.pth')
```

Please refer to [here](ModelSpeedup.md) for a detailed description.
18 changes: 12 additions & 6 deletions docs/en_US/FeatureEngineering/Overview.md
@@ -6,19 +6,25 @@ For now, we support the following feature selectors:
- [GradientFeatureSelector](./GradientFeatureSelector.md)
- [GBDTSelector](./GBDTSelector.md)

These selectors are suitable for tabular data (i.e., they do not work with image, speech, or text data).

- # How to use?
+ In addition, these selectors are only for feature selection. If you want to:
+ 1) generate high-order combined features on NNI while doing feature selection;
+ 2) leverage your distributed resources;
+ you could try this [example](https://github.com/microsoft/nni/tree/master/examples/feature_engineering/auto-feature-engineering).
+
+ ## How to use?

```python
- from nni.feature_engineering.gradient_selector import GradientFeatureSelector
+ from nni.feature_engineering.gradient_selector import FeatureGradientSelector
# from nni.feature_engineering.gbdt_selector import GBDTSelector

# load data
...
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

# initialize a selector
- fgs = GradientFeatureSelector(...)
+ fgs = FeatureGradientSelector(...)
# fit data
fgs.fit(X_train, y_train)
# get important features
@@ -30,7 +36,7 @@ print(fgs.get_selected_features(...))

When using a built-in selector, you first need to `import` the feature selector and `initialize` it. You can call the function `fit` in the selector to pass the data to it. After that, you can use `get_selected_features` to get the important features. The function parameters may differ between selectors, so you need to check the docs before using them.

- # How to customize?
+ ## How to customize?

NNI provides _state-of-the-art_ feature selector algorithms among its built-in selectors. NNI also supports building a feature selector by yourself.

@@ -239,7 +245,7 @@ print("Pipeline Score: ", pipeline.score(X_train, y_train))

```

- # Benchmark
+ ## Benchmark

`Baseline` means no feature selection: we directly pass the data to LogisticRegression. For this benchmark, we use only 10% of the training data as test data. For the GradientFeatureSelector, we take only the top 20 features. The metric is the mean accuracy on the given test data and labels.

Expand All @@ -257,7 +263,7 @@ The dataset of benchmark could be download in [here](https://www.csie.ntu.edu.tw

The code can be found at `/examples/feature_engineering/gradient_feature_selector/benchmark_test.py`.

- ## **Reference and Feedback**
+ ## Reference and Feedback
* To [report a bug](https://github.com/microsoft/nni/issues/new?template=bug-report.md) for this feature in GitHub;
* To [file a feature or improvement request](https://github.com/microsoft/nni/issues/new?template=enhancement.md) for this feature in GitHub;
* To know more about [Neural Architecture Search with NNI](https://github.com/microsoft/nni/blob/master/docs/en_US/NAS/Overview.md);