This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

refactor of nas examples #3513

Merged
merged 17 commits into from
Apr 9, 2021
4 changes: 2 additions & 2 deletions docs/en_US/NAS/CDARTS.rst
@@ -34,7 +34,7 @@ This is CDARTS based on the NNI platform, which currently supports CIFAR10 searc
Examples
--------

`Example code <https://github.com/microsoft/nni/tree/master/examples/nas/cdarts>`__
`Example code <https://github.com/microsoft/nni/tree/master/examples/nas/legacy/cdarts>`__

.. code-block:: bash

@@ -47,7 +47,7 @@ Examples
python setup.py install --cpp_ext --cuda_ext

# search the best architecture
cd examples/nas/cdarts
cd examples/nas/legacy/cdarts
bash run_search_cifar.sh

# train the best architecture.
2 changes: 1 addition & 1 deletion docs/en_US/NAS/ClassicNas.rst
@@ -32,7 +32,7 @@ A file named ``nni_auto_gen_search_space.json`` is generated by this command. Th

Currently, we only support :githublink:`PPO Tuner <examples/tuners/random_nas_tuner>` for classic NAS. More classic NAS algorithms will be supported soon.

The complete examples can be found :githublink:`here <examples/nas/classic_nas>` for PyTorch and :githublink:`here <examples/nas/classic_nas-tf>` for TensorFlow.
The complete examples can be found :githublink:`here <examples/nas/legacy/classic_nas>` for PyTorch and :githublink:`here <examples/nas/legacy/classic_nas-tf>` for TensorFlow.

Standalone mode for easy debugging
----------------------------------
2 changes: 1 addition & 1 deletion docs/en_US/NAS/Cream.rst
@@ -53,7 +53,7 @@ The training with 16 Gpus is a little bit superior than 8 Gpus, as below.
Examples
--------

`Example code <https://github.com/microsoft/nni/tree/master/examples/nas/cream>`__
`Example code <https://github.com/microsoft/nni/tree/master/examples/nas/legacy/cream>`__

Please run the following scripts in the example folder.

4 changes: 2 additions & 2 deletions docs/en_US/NAS/DARTS.rst
@@ -36,15 +36,15 @@ Examples
CNN Search Space
^^^^^^^^^^^^^^^^

:githublink:`Example code <examples/nas/darts>`
:githublink:`Example code <examples/nas/oneshot/darts>`

.. code-block:: bash

# In case NNI code is not cloned. If the code is cloned already, ignore this line and enter code folder.
git clone https://github.com/Microsoft/nni.git

# search the best architecture
cd examples/nas/darts
cd examples/nas/oneshot/darts
python3 search.py

# train the best architecture
4 changes: 2 additions & 2 deletions docs/en_US/NAS/ENAS.rst
@@ -14,15 +14,15 @@ Examples
CIFAR10 Macro/Micro Search Space
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:githublink:`Example code <examples/nas/enas>`
:githublink:`Example code <examples/nas/oneshot/enas>`

.. code-block:: bash

# In case NNI code is not cloned. If the code is cloned already, ignore this line and enter code folder.
git clone https://github.com/Microsoft/nni.git

# search the best architecture
cd examples/nas/enas
cd examples/nas/oneshot/enas

# search in macro search space
python3 search.py --search-for macro
4 changes: 2 additions & 2 deletions docs/en_US/NAS/PDARTS.rst
@@ -4,15 +4,15 @@ P-DARTS
Examples
--------

:githublink:`Example code <examples/nas/pdarts>`
:githublink:`Example code <examples/nas/legacy/pdarts>`

.. code-block:: bash

# In case NNI code is not cloned. If the code is cloned already, ignore this line and enter code folder.
git clone https://github.com/Microsoft/nni.git

# search the best architecture
cd examples/nas/pdarts
cd examples/nas/legacy/pdarts
python3 search.py

# train the best architecture, it's the same process as darts.
4 changes: 2 additions & 2 deletions docs/en_US/NAS/Proxylessnas.rst
@@ -24,7 +24,7 @@ To use ProxylessNAS training/searching approach, users need to specify search sp
trainer.train()
trainer.export(args.arch_path)

The complete example code can be found :githublink:`here <examples/nas/proxylessnas>`.
The complete example code can be found :githublink:`here <examples/nas/oneshot/proxylessnas>`.

**Input arguments of ProxylessNasTrainer**

@@ -56,7 +56,7 @@ Implementation

The implementation on NNI is based on the `official implementation <https://github.com/mit-han-lab/ProxylessNAS>`__. The official implementation supports two training approaches, gradient descent and RL based, and supports different target hardware, including 'mobile', 'cpu', 'gpu8', and 'flops'. Our current implementation on NNI supports the gradient descent training approach but does not yet support different target hardware. Complete support is in progress.

Below we will describe implementation details. Like other one-shot NAS algorithms on NNI, ProxylessNAS is composed of two parts: *search space* and *training approach*. For users to flexibly define their own search space and use built-in ProxylessNAS training approach, we put the specified search space in :githublink:`example code <examples/nas/proxylessnas>` using :githublink:`NNI NAS interface <nni/algorithms/nas/pytorch/proxylessnas>`.
Below we will describe implementation details. Like other one-shot NAS algorithms on NNI, ProxylessNAS is composed of two parts: *search space* and *training approach*. For users to flexibly define their own search space and use built-in ProxylessNAS training approach, we put the specified search space in :githublink:`example code <examples/nas/oneshot/proxylessnas>` using :githublink:`NNI NAS interface <nni/algorithms/nas/pytorch/proxylessnas>`.

.. image:: ../../img/proxylessnas.png
:target: ../../img/proxylessnas.png
2 changes: 1 addition & 1 deletion docs/en_US/NAS/SPOS.rst
@@ -13,7 +13,7 @@ Examples

Here is a use case that uses the search space from the paper and shows how to apply a FLOPs limit when performing uniform sampling.

:githublink:`Example code <examples/nas/spos>`
:githublink:`Example code <examples/nas/oneshot/spos>`

Requirements
^^^^^^^^^^^^
4 changes: 2 additions & 2 deletions docs/en_US/NAS/SearchSpaceZoo.rst
@@ -8,7 +8,7 @@ Search Space Zoo
DartsCell
---------

DartsCell is extracted from :githublink:`CNN model <examples/nas/darts>`. A DartsCell is a directed acyclic graph containing an ordered sequence of N nodes, where each node stands for a latent representation (e.g. feature map in a convolutional network). Directed edges from Node 1 to Node 2 are associated with some operations that transform Node 1, and the result is stored on Node 2. The `Candidate operators <#predefined-operations-darts>`__ between nodes are predefined and unchangeable. One edge represents an operation that is chosen from the predefined ones to be applied to the starting node of the edge. One cell contains two input nodes, a single output node, and other ``n_node`` nodes. The input nodes are defined as the cell outputs in the previous two layers. The output of the cell is obtained by applying a reduction operation (e.g. concatenation) to all the intermediate nodes. To make the search space continuous, the categorical choice of a particular operation is relaxed to a softmax over all possible operations. By adjusting the weight of softmax on every node, the operation with the highest probability is chosen to be part of the final structure. A CNN model can be formed by stacking several cells together, which builds a search space. Note that in the DARTS paper, all cells in the model share the same structure.
DartsCell is extracted from :githublink:`CNN model <examples/nas/oneshot/darts>`. A DartsCell is a directed acyclic graph containing an ordered sequence of N nodes, where each node stands for a latent representation (e.g. feature map in a convolutional network). Directed edges from Node 1 to Node 2 are associated with some operations that transform Node 1, and the result is stored on Node 2. The `Candidate operators <#predefined-operations-darts>`__ between nodes are predefined and unchangeable. One edge represents an operation that is chosen from the predefined ones to be applied to the starting node of the edge. One cell contains two input nodes, a single output node, and other ``n_node`` nodes. The input nodes are defined as the cell outputs in the previous two layers. The output of the cell is obtained by applying a reduction operation (e.g. concatenation) to all the intermediate nodes. To make the search space continuous, the categorical choice of a particular operation is relaxed to a softmax over all possible operations. By adjusting the weight of softmax on every node, the operation with the highest probability is chosen to be part of the final structure. A CNN model can be formed by stacking several cells together, which builds a search space. Note that in the DARTS paper, all cells in the model share the same structure.
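
The softmax relaxation described above can be illustrated with a minimal, framework-free sketch. This is not NNI's DartsCell implementation: the candidate operations are hypothetical scalar stand-ins for real operators, and the names ``CANDIDATE_OPS``, ``mixed_edge``, and ``discretize`` are assumptions made for illustration only.

```python
import math

# Hypothetical candidate operations for one edge. Real DARTS candidates are
# convolutions, poolings, etc.; plain scalar functions stand in for them here.
CANDIDATE_OPS = {
    "identity": lambda x: x,
    "double": lambda x: 2.0 * x,
    "zero": lambda x: 0.0,
}


def softmax(weights):
    """Numerically stable softmax over a list of floats."""
    peak = max(weights)
    exps = [math.exp(w - peak) for w in weights]
    total = sum(exps)
    return [e / total for e in exps]


def mixed_edge(x, alpha):
    """Continuous relaxation: softmax-weighted sum of every candidate op."""
    probs = softmax(alpha)
    return sum(p * op(x) for p, op in zip(probs, CANDIDATE_OPS.values()))


def discretize(alpha):
    """After search, keep only the op with the highest softmax probability."""
    names = list(CANDIDATE_OPS)
    probs = softmax(alpha)
    best = max(range(len(names)), key=probs.__getitem__)
    return names[best]


if __name__ == "__main__":
    alpha = [0.1, 2.0, -1.0]  # assumed architecture weights for one edge
    print(mixed_edge(1.0, alpha))
    print(discretize(alpha))
```

During search the architecture weights ``alpha`` would be trained together with the network weights; the sketch only shows how one set of weights turns into a mixed output and then into a single chosen operation.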

One structure in the Darts search space is shown below. Note that, NNI merges the last one of the four intermediate nodes and the output node.

@@ -82,7 +82,7 @@ All supported operators for Darts are listed below.
ENASMicroLayer
--------------

This layer is extracted from the model designed :githublink:`here <examples/nas/enas>`. A model contains several blocks that share the same architecture. A block is made up of some normal layers and reduction layers. ``ENASMicroLayer`` is a unified implementation of the two types of layers. The only difference between the two layers is that reduction layers apply all operations with ``stride=2``.
This layer is extracted from the model designed :githublink:`here <examples/nas/oneshot/enas>`. A model contains several blocks that share the same architecture. A block is made up of some normal layers and reduction layers. ``ENASMicroLayer`` is a unified implementation of the two types of layers. The only difference between the two layers is that reduction layers apply all operations with ``stride=2``.

ENAS Micro employs a DAG with N nodes in one cell, where the nodes represent local computations, and the edges represent the flow of information between the N nodes. One cell contains two input nodes and a single output node. Each following node chooses two previous nodes as input, applies two operations from `predefined ones <#predefined-operations-enas>`__, and adds the results as the output of this node. For example, Node 4 chooses Node 1 and Node 3 as inputs, applies ``MaxPool`` and ``AvgPool`` on the inputs respectively, then sums the results as the output of Node 4. Nodes that do not serve as input for any other node are viewed as the output of the layer. If there are multiple output nodes, the model will calculate the average of these nodes as the layer output.
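
The node rule above (pick two earlier nodes, apply one operation to each, add the results, then average the nodes that nothing consumes) can be sketched as follows. This is an illustrative toy, not the ``ENASMicroLayer`` code: the ``OPS`` table uses hypothetical scalar stand-ins for pooling and convolution operators, ``run_cell`` is an assumed helper name, and nodes are 0-indexed.

```python
# Hypothetical scalar stand-ins for the predefined ENAS operations.
OPS = {
    "identity": lambda x: x,
    "plus_one": lambda x: x + 1.0,
    "halve": lambda x: x * 0.5,
}


def run_cell(inputs, choices):
    """Evaluate a toy ENAS-micro-style cell.

    inputs:  values of the two input nodes (nodes 0 and 1).
    choices: one tuple (idx_a, op_a, idx_b, op_b) per subsequent node, naming
             the two earlier nodes it reads and the op applied to each.
    Returns the average of all nodes that are never used as an input.
    """
    nodes = list(inputs)
    used = set()
    for idx_a, op_a, idx_b, op_b in choices:
        # Each new node adds the results of its two chosen operations.
        nodes.append(OPS[op_a](nodes[idx_a]) + OPS[op_b](nodes[idx_b]))
        used.update((idx_a, idx_b))
    # Nodes that feed no other node are treated as outputs and averaged.
    loose_ends = [value for i, value in enumerate(nodes) if i not in used]
    return sum(loose_ends) / len(loose_ends)


if __name__ == "__main__":
    choices = [
        (0, "identity", 1, "identity"),  # node 2 reads nodes 0 and 1
        (0, "plus_one", 2, "halve"),     # node 3 reads nodes 0 and 2
    ]
    print(run_cell([1.0, 2.0], choices))
```

In this sketch an input node that is never consumed would also count as an output; whether the real layer treats input nodes that way is not stated in the text above, so that behaviour is an assumption.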

6 changes: 3 additions & 3 deletions docs/en_US/NAS/TextNAS.rst
@@ -57,15 +57,15 @@ Examples
Search Space
^^^^^^^^^^^^

:githublink:`Example code <examples/nas/textnas>`
:githublink:`Example code <examples/nas/legacy/textnas>`

.. code-block:: bash

# In case NNI code is not cloned. If the code is cloned already, ignore this line and enter code folder.
git clone https://github.com/Microsoft/nni.git

# search the best architecture
cd examples/nas/textnas
cd examples/nas/legacy/textnas

# view more options for search
python3 search.py -h
@@ -83,7 +83,7 @@ retrain
git clone https://github.com/Microsoft/nni.git

# search the best architecture
cd examples/nas/textnas
cd examples/nas/legacy/textnas

# default to retrain on sst-2
sh run_retrain.sh
2 changes: 1 addition & 1 deletion docs/en_US/NAS/retiarii/Advanced.rst
@@ -88,7 +88,7 @@ Use placeholder to make mutation easier: ``nn.Placeholder``. If you want to mutat
stride=stride
)

``label`` is used by the mutator to identify this placeholder. The other parameters carry the information required by the mutator. They can be accessed from ``node.operation.parameters`` as a dict, which can include any information that users want to pass to a user-defined mutator. The complete example code can be found in :githublink:`Mnasnet base model <test/retiarii_test/mnasnet/base_mnasnet.py>`.
``label`` is used by the mutator to identify this placeholder. The other parameters carry the information required by the mutator. They can be accessed from ``node.operation.parameters`` as a dict, which can include any information that users want to pass to a user-defined mutator. The complete example code can be found in :githublink:`Mnasnet base model <examples/nas/multi-trial/mnasnet/base_mnasnet.py>`.

Starting an experiment is almost the same as using inline mutation APIs. The only difference is that the applied mutators should be passed to ``RetiariiExperiment``. Below is a simple example.

4 changes: 2 additions & 2 deletions docs/en_US/NAS/retiarii/Tutorial.rst
@@ -63,7 +63,7 @@ Below is a very simple example of defining a base model, it is almost the same a

The above example also shows how to use ``@basic_unit``. ``@basic_unit`` decorates a user-defined module to tell Retiarii that there will be no mutation within this module, so Retiarii can treat it as a basic unit (i.e., as a black box). It is useful when (1) users want to mutate the initialization parameters of this module, or (2) Retiarii fails to parse this module due to complex control flow (e.g., ``for``, ``while``). A more detailed description of ``@basic_unit`` can be found `here <./Advanced.rst>`__.

Users can refer to :githublink:`Darts base model <test/retiarii_test/darts/darts_model.py>` and :githublink:`Mnasnet base model <test/retiarii_test/mnasnet/base_mnasnet.py>` for more complicated examples.
Users can refer to :githublink:`Darts base model <test/retiarii_test/darts/darts_model.py>` and :githublink:`Mnasnet base model <examples/nas/multi-trial/mnasnet/base_mnasnet.py>` for more complicated examples.

Define Model Mutations
^^^^^^^^^^^^^^^^^^^^^^
@@ -195,7 +195,7 @@ After all the above are prepared, it is time to start an experiment to do the mo
exp_config.training_service.use_active_gpu = False
exp.run(exp_config, 8081)

The complete code of a simple MNIST example can be found :githublink:`here <test/retiarii_test/mnist/test.py>`.
The complete code of a simple MNIST example can be found :githublink:`here <examples/nas/multi-trial/mnist/search.py>`.

**Local Debug Mode**: When running an experiment, it is easy to hit trivial errors in trial code, such as a shape mismatch or an undefined variable. To quickly fix these kinds of errors, we provide a local debug mode, which applies the mutators once locally and runs only the generated model. To use local debug mode, users can simply invoke the API ``debug_mutated_model(base_model, trainer, applied_mutators)``.
