Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

merge master #208

Merged
merged 12 commits into from
Oct 24, 2019
10 changes: 5 additions & 5 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -18,7 +18,7 @@ NNI (Neural Network Intelligence) is a toolkit to help users run automated machi
The tool dispatches and runs trial jobs generated by tuning algorithms to search the best neural architecture and/or hyper-parameters in different environments like local machine, remote servers and cloud.


### **NNI [v1.0](https://github.com/Microsoft/nni/blob/master/docs/en_US/Release_v1.0.md) has been released! &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>**
### **NNI v1.1 has been released! &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>**

<p align="center">
<a href="#nni-has-been-released"><img src="docs/img/overview.svg" /></a>
Expand Down Expand Up @@ -211,7 +211,7 @@ Linux and MacOS
* Run the following commands in an environment that has `python >= 3.5`, `git` and `wget`.

```bash
git clone -b v1.0 https://github.com/Microsoft/nni.git
git clone -b v1.1 https://github.com/Microsoft/nni.git
cd nni
source install.sh
```
Expand All @@ -221,7 +221,7 @@ Windows
* Run the following commands in an environment that has `python >=3.5`, `git` and `PowerShell`

```bash
git clone -b v1.0 https://github.com/Microsoft/nni.git
git clone -b v1.1 https://github.com/Microsoft/nni.git
cd nni
powershell -ExecutionPolicy Bypass -file install.ps1
```
Expand All @@ -232,12 +232,12 @@ For NNI on Windows, please refer to [NNI on Windows](docs/en_US/Tutorial/NniOnWi

**Verify install**

The following example is an experiment built on TensorFlow. Make sure you have **TensorFlow installed** before running it.
The following example is an experiment built on TensorFlow. Make sure you have **TensorFlow 1.x installed** before running it. Note that **currently Tensorflow 2.0 is NOT supported**.

* Download the examples via clone the source code.

```bash
git clone -b v1.0 https://github.com/Microsoft/nni.git
git clone -b v1.1 https://github.com/Microsoft/nni.git
```

Linux and MacOS
Expand Down
2 changes: 1 addition & 1 deletion deployment/pypi/setup.py
Original file line number Diff line number Diff line change
Expand Up @@ -73,7 +73,7 @@
'requests',
'astor',
'PythonWebHDFS',
'hyperopt',
'hyperopt==0.1.2',
'json_tricks',
'numpy',
'scipy',
Expand Down
117 changes: 116 additions & 1 deletion docs/en_US/Compressor/AutoCompression.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,118 @@
# Automatic Model Compression on NNI

TBD.
It's convenient to implement auto model compression with NNI compression and NNI tuners

## First, model compression with NNI

You can easily compress a model with NNI compression. Take pruning for example, you can prune a pretrained model with LevelPruner like this

```python
from nni.compression.torch import LevelPruner
config_list = [{ 'sparsity': 0.8, 'op_types': 'default' }]
pruner = LevelPruner(config_list)
pruner(model)
```

```{ 'sparsity': 0.8, 'op_types': 'default' }```means that **all layers with weight will be compressed with the same 0.8 sparsity**. When ```pruner(model)``` called, the model is compressed with masks and after that you can normally fine tune this model and **pruned weights won't be updated** which have been masked.

## Then, make this automatic

The previous example manually choosed LevelPruner and pruned all layers with the same sparsity, this is obviously sub-optimal because different layers may have different redundancy. Layer sparsity should be carefully tuned to achieve least model performance degradation and this can be done with NNI tuners.

The first thing we need to do is to design a search space, here we use a nested search space which contains choosing pruning algorithm and optimizing layer sparsity.

```json
{
"prune_method": {
"_type": "choice",
"_value": [
{
"_name": "agp",
"conv0_sparsity": {
"_type": "uniform",
"_value": [
0.1,
0.9
]
},
"conv1_sparsity": {
"_type": "uniform",
"_value": [
0.1,
0.9
]
},
},
{
"_name": "level",
"conv0_sparsity": {
"_type": "uniform",
"_value": [
0.1,
0.9
]
},
"conv1_sparsity": {
"_type": "uniform",
"_value": [
0.01,
0.9
]
},
}
]
}
}
```

Then we need to modify our codes for few lines

```python
import nni
from nni.compression.torch import *
params = nni.get_parameters()
conv0_sparsity = params['prune_method']['conv0_sparsity']
conv1_sparsity = params['prune_method']['conv1_sparsity']
# these raw sparsity should be scaled if you need total sparsity constrained
config_list_level = [{ 'sparsity': conv0_sparsity, 'op_name': 'conv0' },
{ 'sparsity': conv1_sparsity, 'op_name': 'conv1' }]
config_list_agp = [{'initial_sparsity': 0, 'final_sparsity': conv0_sparsity,
'start_epoch': 0, 'end_epoch': 3,
'frequency': 1,'op_name': 'conv0' },
{'initial_sparsity': 0, 'final_sparsity': conv1_sparsity,
'start_epoch': 0, 'end_epoch': 3,
'frequency': 1,'op_name': 'conv1' },]
PRUNERS = {'level':LevelPruner(config_list_level),'agp':AGP_Pruner(config_list_agp)}
pruner = PRUNERS(params['prune_method']['_name'])
pruner(model)
... # fine tuning
acc = evaluate(model) # evaluation
nni.report_final_results(acc)
```

Last, define our task and automatically tuning pruning methods with layers sparsity

```yaml
authorName: default
experimentName: Auto_Compression
trialConcurrency: 2
maxExecDuration: 100h
maxTrialNum: 500
#choice: local, remote, pai
trainingServicePlatform: local
#choice: true, false
useAnnotation: False
searchSpacePath: search_space.json
tuner:
#choice: TPE, Random, Anneal...
builtinTunerName: TPE
classArgs:
#choice: maximize, minimize
optimize_mode: maximize
trial:
command: bash run_prune.sh
codeDir: .
gpuNum: 1

```

6 changes: 4 additions & 2 deletions docs/en_US/Compressor/Overview.md
Original file line number Diff line number Diff line change
@@ -1,14 +1,16 @@
# Compressor

We are glad to announce the alpha release for model compression toolkit on top of NNI, it's still in the experiment phase which might evolve based on usage feedback. We'd like to invite you to use, feedback and even contribute.

NNI provides an easy-to-use toolkit to help user design and use compression algorithms. It supports Tensorflow and PyTorch with unified interface. For users to compress their models, they only need to add several lines in their code. There are some popular model compression algorithms built-in in NNI. Users could further use NNI's auto tuning power to find the best compressed model, which is detailed in [Auto Model Compression](./AutoCompression.md). On the other hand, users could easily customize their new compression algorithms using NNI's interface, refer to the tutorial [here](#customize-new-compression-algorithms).

## Supported algorithms
We have provided two naive compression algorithms and four popular ones for users, including three pruning algorithms and three quantization algorithms:
We have provided two naive compression algorithms and three popular ones for users, including two pruning algorithms and three quantization algorithms:

|Name|Brief Introduction of Algorithm|
|---|---|
| [Level Pruner](./Pruner.md#level-pruner) | Pruning the specified ratio on each weight based on absolute values of weights |
| [AGP Pruner](./Pruner.md#agp-pruner) | Automated gradual pruning (To prune, or not to prune: exploring the efficacy of pruning for model compression) [Reference Paper](https://arxiv.org/abs/1710.01878)|
| [Sensitivity Pruner](./Pruner.md#sensitivity-pruner) | Learning both Weights and Connections for Efficient Neural Networks. [Reference Paper](https://arxiv.org/abs/1506.02626)|
| [Naive Quantizer](./Quantizer.md#naive-quantizer) | Quantize weights to default 8 bits |
| [QAT Quantizer](./Quantizer.md#qat-quantizer) | Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference. [Reference Paper](http://openaccess.thecvf.com/content_cvpr_2018/papers/Jacob_Quantization_and_Training_CVPR_2018_paper.pdf)|
| [DoReFa Quantizer](./Quantizer.md#dorefa-quantizer) | DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients. [Reference Paper](https://arxiv.org/abs/1606.06160)|
Expand Down
46 changes: 4 additions & 42 deletions docs/en_US/Compressor/Pruner.md
Original file line number Diff line number Diff line change
Expand Up @@ -48,7 +48,7 @@ from nni.compression.tensorflow import AGP_Pruner
config_list = [{
'initial_sparsity': 0,
'final_sparsity': 0.8,
'start_epoch': 1,
'start_epoch': 0,
'end_epoch': 10,
'frequency': 1,
'op_types': 'default'
Expand All @@ -62,7 +62,7 @@ from nni.compression.torch import AGP_Pruner
config_list = [{
'initial_sparsity': 0,
'final_sparsity': 0.8,
'start_epoch': 1,
'start_epoch': 0,
'end_epoch': 10,
'frequency': 1,
'op_types': 'default'
Expand All @@ -86,47 +86,9 @@ You can view example for more information
#### User configuration for AGP Pruner
* **initial_sparsity:** This is to specify the sparsity when compressor starts to compress
* **final_sparsity:** This is to specify the sparsity when compressor finishes to compress
* **start_epoch:** This is to specify the epoch number when compressor starts to compress
* **start_epoch:** This is to specify the epoch number when compressor starts to compress, default start from epoch 0
* **end_epoch:** This is to specify the epoch number when compressor finishes to compress
* **frequency:** This is to specify every *frequency* number epochs compressor compress once
* **frequency:** This is to specify every *frequency* number epochs compressor compress once, default frequency=1

***

## Sensitivity Pruner
In [Learning both Weights and Connections for Efficient Neural Networks](https://arxiv.org/abs/1506.02626), author Song Han and provide an algorithm to find the sensitivity of each layer and set the pruning threshold to each layer.

>We used the sensitivity results to find each layer’s threshold: for example, the smallest threshold was applied to the most sensitive layer, which is the first convolutional layer... The pruning threshold is chosen as a quality parameter multiplied by the standard deviation of a layer’s weights
### Usage
You can prune weight step by step and reach one target sparsity by Sensitivity Pruner with the code below.

Tensorflow code
```python
from nni.compression.tensorflow import SensitivityPruner
config_list = [{ 'sparsity':0.8, 'op_types': 'default' }]
pruner = SensitivityPruner(config_list)
pruner(tf.get_default_graph())
```
PyTorch code
```python
from nni.compression.torch import SensitivityPruner
config_list = [{ 'sparsity':0.8, 'op_types': 'default' }]
pruner = SensitivityPruner(config_list)
pruner(model)
```
Like AGP Pruner, you should update mask information every epoch by adding code below

Tensorflow code
```python
pruner.update_epoch(epoch, sess)
```
PyTorch code
```python
pruner.update_epoch(epoch)
```
You can view example for more information

#### User configuration for Sensitivity Pruner
* **sparsity:** This is to specify the sparsity operations to be compressed to

***
97 changes: 97 additions & 0 deletions docs/en_US/TrialExample/RocksdbExamples.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,97 @@
# Tuning RocksDB on NNI

## Overview

[RocksDB](https://github.com/facebook/rocksdb) is a popular high performance embedded key-value database used in production systems at various web-scale enterprises including Facebook, Yahoo!, and LinkedIn.. It is a fork of [LevelDB](https://github.com/google/leveldb) by Facebook optimized to exploit many central processing unit (CPU) cores, and make efficient use of fast storage, such as solid-state drives (SSD), for input/output (I/O) bound workloads.

The performance of RocksDB is highly contingent on its tuning. However, because of the complexity of its underlying technology and a large number of configurable parameters, a good configuration is sometimes hard to obtain. NNI can help to address this issue. NNI supports many kinds of tuning algorithms to search the best configuration of RocksDB, and support many kinds of environments like local machine, remote servers and cloud.

This example illustrates how to use NNI to search the best configuration of RocksDB for a `fillrandom` benchmark supported by a benchmark tool `db_bench`, which is an official benchmark tool provided by RocksDB itself. Therefore, before running this example, please make sure NNI is installed and [`db_bench`](https://github.com/facebook/rocksdb/wiki/Benchmarking-tools) is in your `PATH`. Please refer to [here](../Tutorial/QuickStart.md) for detailed information about installation and preparing of NNI environment, and [here](https://github.com/facebook/rocksdb/blob/master/INSTALL.md) for compiling RocksDB as well as `db_bench`.

We also provide a simple script [`db_bench_installation.sh`](../../../examples/trials/systems/rocksdb-fillrandom/db_bench_installation.sh) helping to compile and install `db_bench` as well as its dependencies on Ubuntu. Installing RocksDB on other systems can follow the same procedure.

*code directory: [`example/trials/systems/rocksdb-fillrandom`](../../../examples/trials/systems/rocksdb-fillrandom)*

## Experiment setup

There are mainly three steps to setup an experiment of tuning systems on NNI. Define search space with a `json` file, write a benchmark code, and start NNI experiment by passing a config file to NNI manager.

### Search Space

For simplicity, this example tunes three parameters, `write_buffer_size`, `min_write_buffer_num` and `level0_file_num_compaction_trigger`, for writing 16M keys with 20 Bytes of key size and 100 Bytes of value size randomly, based on writing operations per second (OPS). `write_buffer_size` sets the size of a single memtable. Once memtable exceeds this size, it is marked immutable and a new one is created. `min_write_buffer_num` is the minimum number of memtables to be merged before flushing to storage. Once the number of files in level 0 reaches `level0_file_num_compaction_trigger`, level 0 to level 1 compaction is triggered.

In this example, the search space is specified by a `search_space.json` file as shown below. Detailed explanation of search space could be found [here](../Tutorial/SearchSpaceSpec.md).

```json
{
"write_buffer_size": {
"_type": "quniform",
"_value": [2097152, 16777216, 1048576]
},
"min_write_buffer_number_to_merge": {
"_type": "quniform",
"_value": [2, 16, 1]
},
"level0_file_num_compaction_trigger": {
"_type": "quniform",
"_value": [2, 16, 1]
}
}
```

*code directory: [`example/trials/systems/rocksdb-fillrandom/search_space.json`](../../../examples/trials/systems/rocksdb-fillrandom/search_space.json)*

### Benchmark code

Benchmark code should receive a configuration from NNI manager, and report the corresponding benchmark result back. Following NNI APIs are designed for this purpose. In this example, writing operations per second (OPS) is used as a performance metric. Please refer to [here](Trials.md) for detailed information.

* Use `nni.get_next_parameter()` to get next system configuration.
* Use `nni.report_final_result(metric)` to report the benchmark result.

*code directory: [`example/trials/systems/rocksdb-fillrandom/main.py`](../../../examples/trials/systems/rocksdb-fillrandom/main.py)*

### Config file

One could start a NNI experiment with a config file. A config file for NNI is a `yaml` file usually including experiment settings (`trialConcurrency`, `maxExecDuration`, `maxTrialNum`, `trial gpuNum`, etc.), platform settings (`trainingServicePlatform`, etc.), path settings (`searchSpacePath`, `trial codeDir`, etc.) and tuner settings (`tuner`, `tuner optimize_mode`, etc.). Please refer to [here](../Tutorial/QuickStart.md) for more information.

Here is an example of tuning RocksDB with SMAC algorithm:

*code directory: [`example/trials/systems/rocksdb-fillrandom/config_smac.yml`](../../../examples/trials/systems/rocksdb-fillrandom/config_smac.yml)*

Here is an example of tuning RocksDB with TPE algorithm:

*code directory: [`example/trials/systems/rocksdb-fillrandom/config_tpe.yml`](../../../examples/trials/systems/rocksdb-fillrandom/config_tpe.yml)*

Other tuners can be easily adopted in the same way. Please refer to [here](../Tuner/BuiltinTuner.md) for more information.

Finally, we could enter the example folder and start the experiment using following commands:

```bash
# tuning RocksDB with SMAC tuner
nnictl create --config ./config_smac.yml
# tuning RocksDB with TPE tuner
nnictl create --config ./config_tpe.yml
```

## Experiment results

We ran these two examples on the same machine with following details:

* 16 * Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
* 465 GB of rotational hard drive with ext4 file system
* 128 GB of RAM
* Kernel version: 4.15.0-58-generic
* NNI version: v1.0-37-g1bd24577
* RocksDB version: 6.4
* RocksDB DEBUG_LEVEL: 0

The detailed experiment results are shown in the below figure. Horizontal axis is sequential order of trials. Vertical axis is the metric, write OPS in this example. Blue dots represent trials for tuning RocksDB with SMAC tuner, and orange dots stand for trials for tuning RocksDB with TPE tuner.

![image](../../../examples/trials/systems/rocksdb-fillrandom/plot.png)

Following table lists the best trials and corresponding parameters and metric obtained by the two tuners. Unsurprisingly, both of them found the same optimal configuration for `fillrandom` benchmark.

| Tuner | Best trial | Best OPS | write_buffer_size | min_write_buffer_number_to_merge | level0_file_num_compaction_trigger |
| :---: | :--------: | :------: | :---------------: | :------------------------------: | :--------------------------------: |
| SMAC | 255 | 779289 | 2097152 | 7.0 | 7.0 |
| TPE | 169 | 761456 | 2097152 | 7.0 | 7.0 |
1 change: 1 addition & 0 deletions docs/en_US/TrialExample/Trials.md
Original file line number Diff line number Diff line change
Expand Up @@ -163,3 +163,4 @@ For more information, please refer to [HowToDebug](../Tutorial/HowToDebug.md)
* [How to tune Scikit-learn on NNI](SklearnExamples.md)
* [Automatic Model Architecture Search for Reading Comprehension.](SquadEvolutionExamples.md)
* [Tuning GBDT on NNI](GbdtExample.md)
* [Tuning RocksDB on NNI](RocksdbExamples.md)
2 changes: 1 addition & 1 deletion docs/en_US/conf.py
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@
# The short X.Y version
version = ''
# The full version, including alpha/beta/rc tags
release = 'v1.0'
release = 'v1.1'

# -- General configuration ---------------------------------------------------

Expand Down
Loading