This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

Request for Integrating the new NAS algorithm: Cream #2705

Merged
merged 74 commits into from
Nov 27, 2020
Changes from all commits
Commits
74 commits
a426e1a
integrate CREAM NAS algorithm
penghouwen Jul 20, 2020
57a2c40
Update README.md
penghouwen Jul 20, 2020
c15832f
Update Cream.md
penghouwen Jul 31, 2020
806937c
Update Cream.md
penghouwen Jul 31, 2020
b13fed0
Update Cream.md
penghouwen Jul 31, 2020
d7c3217
Update requirements.txt
penghouwen Jul 31, 2020
1951db0
Update Cream.md
penghouwen Jul 31, 2020
bce9cf2
Update requirements.txt
penghouwen Jul 31, 2020
cda252b
Update Cream.md
penghouwen Jul 31, 2020
29b6d5d
Update Cream.md
penghouwen Jul 31, 2020
e548cbc
Update Cream.md
penghouwen Jul 31, 2020
d5c95c6
Update Cream.md
penghouwen Jul 31, 2020
c73b95c
Update trainer.py
penghouwen Aug 1, 2020
0adaf7c
Update mutator.py
penghouwen Aug 1, 2020
85f01e9
Update Cream.md
penghouwen Aug 3, 2020
047fd86
Update Cream.md
penghouwen Aug 3, 2020
b22f92c
Update Cream.md
penghouwen Aug 3, 2020
3ee7591
Update Cream.md
penghouwen Aug 3, 2020
be81d53
Update Cream.md
penghouwen Aug 3, 2020
9535466
Fix pipeline for merging into NNI
ultmaster Aug 4, 2020
0892e66
Fix typo
ultmaster Aug 4, 2020
999d18c
Merge pull request #1 from ultmaster/fix-cream-before-merge
penghouwen Aug 4, 2020
2e13a23
Fix pipeline
ultmaster Aug 5, 2020
22a3f46
Add files via upload
penghouwen Aug 7, 2020
b1777f0
Update Cream.md
penghouwen Aug 7, 2020
4fcbaa9
Update CDARTS.md
penghouwen Aug 7, 2020
2205433
Update Cream.md
penghouwen Aug 7, 2020
b277b96
Update CDARTS.md
penghouwen Aug 7, 2020
8d413bf
Update CDARTS.md
penghouwen Aug 7, 2020
cc9f336
Update distributed_train.sh
penghouwen Sep 3, 2020
9494493
Update distributed_test.sh
penghouwen Sep 3, 2020
6a332ff
Update Cream.md
penghouwen Sep 3, 2020
b35ccac
init
Sep 27, 2020
c872739
Merge pull request #2 from mapleam/master
penghouwen Sep 27, 2020
9289614
Update supernet.py
Z7zuqer Sep 27, 2020
ab9d398
1)remove timm
Sep 27, 2020
82eee8d
Merge pull request #3 from mapleam/master
penghouwen Sep 27, 2020
2559697
Delete cream.jpg
penghouwen Oct 22, 2020
e48d293
Add files via upload
penghouwen Oct 22, 2020
a71563b
Update Cream.md
penghouwen Oct 22, 2020
c00c58e
version 1.0
Z7zuqer Nov 18, 2020
4d72a70
version 2.0
Z7zuqer Nov 21, 2020
60e5197
Merge pull request #4 from mapleam/master
penghouwen Nov 23, 2020
37518fa
Update Cream.md
penghouwen Nov 23, 2020
e04200c
Update Cream.md
penghouwen Nov 23, 2020
47dce8c
Update Cream.md
penghouwen Nov 23, 2020
5231d3b
Update Cream.md
penghouwen Nov 23, 2020
ce698c3
Update Cream.md
penghouwen Nov 23, 2020
0d63ceb
Update Cream.md
penghouwen Nov 23, 2020
274fb23
version 3.0
Z7zuqer Nov 23, 2020
931c47b
Merge branch 'master' into master
Z7zuqer Nov 23, 2020
59b1339
Merge pull request #5 from mapleam/master
penghouwen Nov 23, 2020
c162f39
Update Cream.md
penghouwen Nov 23, 2020
de8c261
Update Cream.md
penghouwen Nov 23, 2020
ae45787
Update Cream.md
Z7zuqer Nov 23, 2020
43101c1
Update retrain.py
Z7zuqer Nov 23, 2020
36ddeaf
Update test.py
Z7zuqer Nov 23, 2020
97451af
Update retrain.py
Z7zuqer Nov 23, 2020
96cfb17
Merge branch 'master' into master
Z7zuqer Nov 23, 2020
d735a25
Merge pull request #6 from mapleam/master
penghouwen Nov 23, 2020
8d24833
version 4.0
Z7zuqer Nov 23, 2020
a53cc5f
Merge remote-tracking branch 'origin/master'
Z7zuqer Nov 23, 2020
cce57e5
version 4.0
Z7zuqer Nov 23, 2020
d9cfd2f
Merge pull request #7 from mapleam/master
penghouwen Nov 24, 2020
0f8f8bf
Update Cream.md
penghouwen Nov 24, 2020
879bfeb
Update Cream.md
penghouwen Nov 24, 2020
85b17b4
Merge branch 'master' into master
penghouwen Nov 24, 2020
0cf817b
Move code dir
ultmaster Nov 24, 2020
fdeb0b9
Fix trainer and retrain optimizer
ultmaster Nov 25, 2020
d11e4cf
Update Cream.md
penghouwen Nov 25, 2020
06af2cb
Fix syntax warning
ultmaster Nov 26, 2020
6cb3b97
Fix syntax warning (again)
ultmaster Nov 26, 2020
9996098
Fix docs build warnings
ultmaster Nov 26, 2020
02b8e72
Merge branch 'master' of github.com:penghouwen/nni into cream-master
ultmaster Nov 26, 2020
1 change: 1 addition & 0 deletions README.md
@@ -135,6 +135,7 @@ Within the following table, we summarized the current NNI capabilities, we are g
<li><a href="docs/en_US/NAS/Proxylessnas.md">ProxylessNAS</a></li>
<li><a href="docs/en_US/Tuner/BuiltinTuner.md#NetworkMorphism">Network Morphism</a></li>
<li><a href="docs/en_US/NAS/TextNAS.md">TextNAS</a></li>
<li><a href="docs/en_US/NAS/Cream.md">Cream</a></li>
</ul>
</ul>
<a href="docs/en_US/Compression/Overview.md">Model Compression</a>
10 changes: 6 additions & 4 deletions docs/en_US/NAS/CDARTS.md
@@ -1,16 +1,17 @@

# CDARTS

## Introduction

CDARTS builds a cyclic feedback mechanism between the search and evaluation networks. First, the search network generates an initial topology for evaluation, so that the weights of the evaluation network can be optimized. Second, the architecture topology in the search network is further optimized by the label supervision in classification, as well as the regularization from the evaluation network through feature distillation. Repeating the above cycle results in a joint optimization of the search and evaluation networks, and thus enables the evolution of the topology to fit the final evaluation network.
[CDARTS](https://arxiv.org/pdf/2006.10724.pdf) builds a cyclic feedback mechanism between the search and evaluation networks. First, the search network generates an initial topology for evaluation, so that the weights of the evaluation network can be optimized. Second, the architecture topology in the search network is further optimized by the label supervision in classification, as well as the regularization from the evaluation network through feature distillation. Repeating the above cycle results in a joint optimization of the search and evaluation networks, and thus enables the evolution of the topology to fit the final evaluation network.

In implementation of `CdartsTrainer`, it first instantiates two models and two mutators (one for each). The first model is the so-called "search network", which is mutated with a `RegularizedDartsMutator` -- a mutator with subtle differences with `DartsMutator`. The second model is the "evaluation network", which is mutated with a discrete mutator that leverages the previous search network mutator, to sample a single path each time. Trainers train models and mutators alternatively. Users can refer to [references](#reference) if they are interested in more details on these trainers and mutators.
In implementation of `CdartsTrainer`, it first instantiates two models and two mutators (one for each). The first model is the so-called "search network", which is mutated with a `RegularizedDartsMutator` -- a mutator with subtle differences with `DartsMutator`. The second model is the "evaluation network", which is mutated with a discrete mutator that leverages the previous search network mutator, to sample a single path each time. Trainers train models and mutators alternatively. Users can refer to [paper](https://arxiv.org/pdf/2006.10724.pdf) if they are interested in more details on these trainers and mutators.

## Reproduction Results

This is CDARTS based on the NNI platform, which currently supports CIFAR10 search and retrain. ImageNet search and retrain should also be supported, and we provide corresponding interfaces. Our reproduced results on NNI are slightly lower than the paper, but much higher than the original DARTS. Here we show the results of three independent experiments on CIFAR10.

| Runs | Paper | NNI |
| Runs | Paper | NNI |
| ---- |:-------------:| :-----:|
| 1 | 97.52 | 97.44 |
| 2 | 97.53 | 97.48 |
@@ -19,7 +20,7 @@ This is CDARTS based on the NNI platform, which currently supports CIFAR10 searc

## Examples

[Example code](https://github.com/microsoft/nni/tree/v1.9/examples/nas/cdarts)
[Example code](https://github.com/microsoft/nni/tree/master/examples/nas/cdarts)

```bash
# In case NNI code is not cloned. If the code is cloned already, ignore this line and enter code folder.
@@ -55,3 +56,4 @@ bash run_retrain_cifar.sh
.. autoclass:: nni.algorithms.nas.pytorch.cdarts.RegularizedMutatorParallel
:members:
```

127 changes: 127 additions & 0 deletions docs/en_US/NAS/Cream.md
@@ -0,0 +1,127 @@
# Cream of the Crop: Distilling Prioritized Paths For One-Shot Neural Architecture Search

**[[Paper]](https://papers.nips.cc/paper/2020/file/d072677d210ac4c03ba046120f0802ec-Paper.pdf) [[Models-Google Drive]](https://drive.google.com/drive/folders/1NLGAbBF9bA1IUAxKlk2VjgRXhr6RHvRW?usp=sharing) [[Models-Baidu Disk (PWD: wqw6)]](https://pan.baidu.com/s/1TqQNm2s14oEdyNPimw3T9g) [[BibTex]](https://scholar.googleusercontent.com/scholar.bib?q=info:ICWVXc_SsKAJ:scholar.google.com/&output=citation&scisdr=CgUmooXfEMfTi0cV5aU:AAGBfm0AAAAAX7sQ_aXoamdKRaBI12tAVN8REq1VKNwM&scisig=AAGBfm0AAAAAX7sQ_RdYtp6BSro3zgbXVJU2MCgsG730&scisf=4&ct=citation&cd=-1&hl=ja)** <br/>

In this work, we present a simple yet effective architecture distillation method. The central idea is that subnetworks can learn collaboratively and teach each other throughout the training process, aiming to boost the convergence of individual models. We introduce the concept of prioritized path, which refers to the architecture candidates exhibiting superior performance during training. Distilling knowledge from the prioritized paths is able to boost the training of subnetworks. Since the prioritized paths are changed on the fly depending on their performance and complexity, the final obtained paths are the cream of the crop. The discovered architectures achieve superior performance compared to the recent [MobileNetV3](https://arxiv.org/abs/1905.02244) and [EfficientNet](https://arxiv.org/abs/1905.11946) families under aligned settings.

<div >
<img src="https://github.com/microsoft/Cream/blob/main/demo/intro.jpg" width="800"/>
</div>
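
To make the prioritized-path idea above concrete, here is a minimal, self-contained PyTorch sketch of the mechanism: a weight-sharing supernet samples paths, keeps a small pool of the best-performing paths seen so far, and distills from that pool while training newly sampled paths. Everything here (`ToySupernet`, `update_pool`, the batch-accuracy scoring rule, the pool size) is an illustrative assumption for exposition, not the NNI implementation.

```python
# Toy sketch of "prioritized paths": keep a pool of the best sub-architectures
# seen so far and let them teach newly sampled paths via distillation.
# Illustrative only -- not the NNI Cream implementation.
import random

import torch
import torch.nn as nn
import torch.nn.functional as F

POOL_SIZE = 10
prioritized_pool = []  # (score, arch) pairs: the best-performing paths so far


def update_pool(arch, score):
    """Insert a path and keep only the POOL_SIZE highest-scoring ones."""
    prioritized_pool.append((score, arch))
    prioritized_pool.sort(key=lambda p: p[0], reverse=True)
    del prioritized_pool[POOL_SIZE:]


def distillation_loss(student_logits, teacher_logits, T=1.0):
    """Soft-label KD loss: a prioritized path teaches the sampled path."""
    return F.kl_div(F.log_softmax(student_logits / T, dim=1),
                    F.softmax(teacher_logits / T, dim=1),
                    reduction="batchmean") * T * T


class ToySupernet(nn.Module):
    """Tiny weight-sharing supernet: `arch` picks one of three branches."""
    def __init__(self):
        super().__init__()
        self.branches = nn.ModuleList([nn.Linear(8, 4) for _ in range(3)])
        self.head = nn.Linear(4, 2)

    def forward(self, x, arch):
        return self.head(torch.relu(self.branches[arch](x)))


net = ToySupernet()
opt = torch.optim.SGD(net.parameters(), lr=0.1)
x, y = torch.randn(16, 8), torch.randint(0, 2, (16,))

for step in range(20):
    arch = random.randrange(3)             # sample a path from the supernet
    logits = net(x, arch)
    loss = F.cross_entropy(logits, y)
    if prioritized_pool:                   # distill from the best stored path
        teacher_arch = prioritized_pool[0][1]
        with torch.no_grad():
            teacher_logits = net(x, teacher_arch)
        loss = loss + distillation_loss(logits, teacher_logits)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():                  # score the path (here: batch accuracy)
        score = (net(x, arch).argmax(dim=1) == y).float().mean().item()
    update_pool(arch, score)
```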


## Reproduced Results
Top-1 accuracy on ImageNet. The top-1 accuracy of the Cream search algorithm surpasses MobileNetV3 and EfficientNet-B0/B1 on ImageNet.
Training with 16 GPUs gives slightly better results than training with 8 GPUs, as shown below.

| Model (M FLOPs) | 8 GPUs | 16 GPUs |
| ---- |:-------------:| :-----:|
| 14M | 53.7 | 53.8 |
| 43M | 65.8 | 66.5 |
| 114M | 72.1 | 72.8 |
| 287M | 76.7 | 77.6 |
| 481M | 78.9 | 79.2 |
| 604M | 79.4 | 80.0 |

<table style="border: none">
<th><img src="./../../img/cream_flops100.jpg" alt="drawing" width="400"/></th>
<th><img src="./../../img/cream_flops600.jpg" alt="drawing" width="400"/></th>
</table>

## Examples

[Example code](https://github.com/microsoft/nni/tree/master/examples/nas/cream)

Please run the following scripts in the example folder.

## Data Preparation

You first need to download [ImageNet-2012](http://www.image-net.org/) to the folder `./data/imagenet` and move the validation set to the subfolder `./data/imagenet/val`. To move the validation set, you could use the following script: <https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh>

Put the ImageNet data in `./data`. It should look like the following:

```
./data/imagenet/train
./data/imagenet/val
...
```
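
A hedged sketch of how `valprep.sh` can be used to arrange the validation images into class subfolders; it assumes the validation JPEGs have already been extracted flat into `./data/imagenet/val`:

```bash
# Assumption: the validation JPEGs are already extracted flat into ./data/imagenet/val
cd ./data/imagenet/val
wget https://raw.githubusercontent.com/soumith/imagenetloader.torch/master/valprep.sh
bash valprep.sh   # moves each image into its ILSVRC-2012 class subfolder
cd -
```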

## Quick Start

### I. Search

First, set up the environment for searching.

```
pip install -r ./requirements.txt

git clone https://github.com/NVIDIA/apex.git
cd apex
python setup.py install --cpp_ext --cuda_ext
```

To search for an architecture, you need to configure the parameters `FLOPS_MINIMUM` and `FLOPS_MAXIMUM` to specify the desired FLOPs range of the model, such as [0, 600] MFLOPs. You can specify the FLOPs interval by changing these two parameters in `./configs/train.yaml`:

```
FLOPS_MINIMUM: 0 # Minimum FLOPs of the architecture
FLOPS_MAXIMUM: 600 # Maximum FLOPs of the architecture
```

For example, if you expect to search for an architecture with model FLOPs <= 200M, set `FLOPS_MINIMUM` and `FLOPS_MAXIMUM` to `0` and `200`, respectively.
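
In `./configs/train.yaml`, that corresponds to the following excerpt:

```
FLOPS_MINIMUM: 0   # Minimum FLOPs of the architecture
FLOPS_MAXIMUM: 200 # Search only architectures with at most 200 MFLOPs
```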

After specifying the FLOPs range of the architectures you would like to search, you can start the search by running:

```
python -m torch.distributed.launch --nproc_per_node=8 ./train.py --cfg ./configs/train.yaml
```

The searched architectures need to be retrained to obtain the final model, which is saved in `.pth.tar` format. Retraining is described in the next section.

### II. Retrain

To retrain the searched architectures, you need to configure the parameter `MODEL_SELECTION` to specify the model FLOPs. To specify which model to train, add `MODEL_SELECTION` in `./configs/retrain.yaml`. You can select one from [14, 43, 114, 287, 481, 604], which correspond to the model FLOPs in millions.

```
MODEL_SELECTION: 43 # Retrain 43m model
MODEL_SELECTION: 481 # Retrain 481m model
......
```

To train random architectures, you need to set `MODEL_SELECTION` to `-1` and configure the parameter `INPUT_ARCH`:

```
MODEL_SELECTION: -1 # Train random architectures
INPUT_ARCH: [[0], [3], [3, 3], [3, 1, 3], [3, 3, 3, 3], [3, 3, 3], [0]] # Random Architectures
......
```

After adding `MODEL_SELECTION` in `./configs/retrain.yaml`, run the following command to train the model.

```
python -m torch.distributed.launch --nproc_per_node=8 ./retrain.py --cfg ./configs/retrain.yaml
```

### III. Test

To test the trained models, you need to set `MODEL_SELECTION` in `./configs/test.yaml` to specify which model to test.

```
MODEL_SELECTION: 43 # test 43m model
MODEL_SELECTION: 481 # test 481m model
......
```

After specifying the FLOPs of the model, you need to set the path to the resume checkpoint (`RESUME_PATH`) in `./configs/test.yaml`.

```
RESUME_PATH: './43.pth.tar'
RESUME_PATH: './481.pth.tar'
......
```

We provide 14M/43M/114M/287M/481M/604M pretrained models on [Google Drive](https://drive.google.com/drive/folders/1CQjyBryZ4F20Rutj7coF8HWFcedApUn2) or [Baidu Disk (password: wqw6)](https://pan.baidu.com/s/1TqQNm2s14oEdyNPimw3T9g).
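
Putting the two settings together, a minimal excerpt of `./configs/test.yaml` for the 43M model might look like the following; the checkpoint path is an assumption and depends on where you saved the downloaded file:

```
MODEL_SELECTION: 43           # test 43m model
RESUME_PATH: './43.pth.tar'   # path to the downloaded checkpoint
```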

After downloading the pretrained models and adding `MODEL_SELECTION` and `RESUME_PATH` in `./configs/test.yaml`, run the following command to test the model.

```
python -m torch.distributed.launch --nproc_per_node=8 ./test.py --cfg ./configs/test.yaml
```
3 changes: 2 additions & 1 deletion docs/en_US/NAS/one_shot_nas.rst
@@ -14,4 +14,5 @@ One-shot NAS algorithms leverage weight sharing among models in neural architect
SPOS <SPOS>
CDARTS <CDARTS>
ProxylessNAS <Proxylessnas>
TextNAS <TextNAS>
TextNAS <TextNAS>
Cream <Cream>
Binary file added docs/img/cream.png
Binary file added docs/img/cream_flops100.jpg
Binary file added docs/img/cream_flops600.jpg
Empty file added examples/__init__.py
Empty file.
Empty file added examples/nas/__init__.py
Empty file.
1 change: 1 addition & 0 deletions examples/nas/cream/Cream.md
@@ -0,0 +1 @@
[Documentation](https://nni.readthedocs.io/en/latest/NAS/Cream.html)
Empty file added examples/nas/cream/__init__.py
Empty file.
52 changes: 52 additions & 0 deletions examples/nas/cream/configs/retrain.yaml
@@ -0,0 +1,52 @@
AUTO_RESUME: False
DATA_DIR: './data/imagenet'
MODEL: '604m_retrain'
RESUME_PATH: './experiments/workspace/retrain/resume.pth.tar'
SAVE_PATH: './'
SEED: 42
LOG_INTERVAL: 50
RECOVERY_INTERVAL: 0
WORKERS: 4
NUM_GPU: 2
SAVE_IMAGES: False
AMP: False
OUTPUT: 'None'
EVAL_METRICS: 'prec1'
TTA: 0
LOCAL_RANK: 0

DATASET:
NUM_CLASSES: 1000
IMAGE_SIZE: 224 # image patch size
INTERPOLATION: 'random' # Image resize interpolation type
BATCH_SIZE: 32 # batch size
NO_PREFECHTER: False

NET:
GP: 'avg'
DROPOUT_RATE: 0.0
SELECTION: 42

EMA:
USE: True
FORCE_CPU: False # force model ema to be tracked on CPU
DECAY: 0.9998

OPT: 'sgd'
OPT_EPS: 1e-2
MOMENTUM: 0.9
DECAY_RATE: 0.1

SCHED: 'sgd'
LR_NOISE: None
LR_NOISE_PCT: 0.67
LR_NOISE_STD: 1.0
WARMUP_LR: 1e-4
MIN_LR: 1e-5
EPOCHS: 200
START_EPOCH: None
DECAY_EPOCHS: 30.0
WARMUP_EPOCHS: 3
COOLDOWN_EPOCHS: 10
PATIENCE_EPOCHS: 10
LR: 1e-2
37 changes: 37 additions & 0 deletions examples/nas/cream/configs/test.yaml
@@ -0,0 +1,37 @@
AUTO_RESUME: True
DATA_DIR: './data/imagenet'
MODEL: 'Childnet_Testing'
RESUME_PATH: './experiments/workspace/ckps/42.pth.tar'
SAVE_PATH: './'
SEED: 42
LOG_INTERVAL: 50
RECOVERY_INTERVAL: 0
WORKERS: 4
NUM_GPU: 2
SAVE_IMAGES: False
AMP: False
OUTPUT: 'None'
EVAL_METRICS: 'prec1'
TTA: 0
LOCAL_RANK: 0

DATASET:
NUM_CLASSES: 1000
IMAGE_SIZE: 224 # image patch size
INTERPOLATION: 'bilinear' # Image resize interpolation type
BATCH_SIZE: 32 # batch size
NO_PREFECHTER: False

NET:
GP: 'avg'
DROPOUT_RATE: 0.0
SELECTION: 42

EMA:
USE: True
FORCE_CPU: False # force model ema to be tracked on CPU
DECAY: 0.9998

OPTIMIZER:
MOMENTUM: 0.9
WEIGHT_DECAY: 1e-3
53 changes: 53 additions & 0 deletions examples/nas/cream/configs/train.yaml
@@ -0,0 +1,53 @@
AUTO_RESUME: False
DATA_DIR: './data/imagenet'
MODEL: 'Supernet_Training'
RESUME_PATH: './experiments/workspace/train/resume.pth.tar'
SAVE_PATH: './'
SEED: 42
LOG_INTERVAL: 50
RECOVERY_INTERVAL: 0
WORKERS: 8
NUM_GPU: 8
SAVE_IMAGES: False
AMP: False
OUTPUT: 'None'
EVAL_METRICS: 'prec1'
TTA: 0
LOCAL_RANK: 0

DATASET:
NUM_CLASSES: 1000
IMAGE_SIZE: 224 # image patch size
INTERPOLATION: 'bilinear' # Image resize interpolation type
BATCH_SIZE: 128 # batch size

NET:
GP: 'avg'
DROPOUT_RATE: 0.0

EMA:
USE: True
FORCE_CPU: False # force model ema to be tracked on CPU
DECAY: 0.9998

OPT: 'sgd'
LR: 1.0
EPOCHS: 120
META_LR: 1e-4

BATCHNORM:
SYNC_BN: False

SUPERNET:
UPDATE_ITER: 200
SLICE: 4
POOL_SIZE: 10
RESUNIT: False
DIL_CONV: False
UPDATE_2ND: True
FLOPS_MINIMUM: 0
FLOPS_MAXIMUM: 600
PICK_METHOD: 'meta'
META_STA_EPOCH: 20
HOW_TO_PROB: 'pre_prob'
PRE_PROB: (0.05,0.2,0.05,0.5,0.05,0.15)