Tutorial for Advanced Neural Architecture Search

Currently, many NAS algorithms use weight sharing among trials to accelerate training. For example, ENAS delivers 1000x efficiency with 'parameter sharing between child models', compared with the previous NASNet algorithm. Other NAS algorithms such as DARTS, Network Morphism, and Evolution also leverage, or have the potential to leverage, weight sharing.

This is a tutorial on how to enable weight sharing in NNI.

Weight Sharing among trials

Currently we recommend sharing weights through NFS (Network File System), which supports sharing files across machines and is lightweight and (relatively) efficient. We also welcome contributions from the community on more efficient techniques.

Weight Sharing through NFS file

With the NFS setup (see below), trial code can share model weights by saving and loading files. We recommend that users pass the storage path to the tuner:

tuner:
  codeDir: path/to/customer_tuner
  className: CustomerTuner
  classArgs:
    save_dir_root: /nfs/storage/path/

The tuner then decides where each trial saves and loads its weights, and passes those paths to the trials through nni.get_next_parameters().


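For example, in TensorFlow:

# save models (TF 1.x API; `sess` is the trial's tf.Session)
saver = tf.train.Saver()
saver.save(sess, os.path.join(params['save_path'], 'model.ckpt'))
# load models
saver.restore(sess, os.path.join(params['restore_path'], 'model.ckpt'))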

where 'save_path' and 'restore_path' in the hyper-parameters can be managed by the tuner.
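As a rough illustration of how a tuner might manage these paths, the sketch below derives one directory per trial under the shared root. The class name PathAssigner and its method paths_for are hypothetical and not part of NNI; only the keys 'save_path' and 'restore_path' and the save_dir_root argument come from the text above.

```python
import os


class PathAssigner:
    """Hypothetical helper (not an NNI class): maps trial ids to weight directories."""

    def __init__(self, save_dir_root):
        # save_dir_root is the shared storage path given to the tuner
        self.save_dir_root = save_dir_root

    def paths_for(self, parameter_id, parent_id=None):
        # Every trial saves into its own directory under the shared root.
        params = {'save_path': os.path.join(self.save_dir_root, str(parameter_id))}
        # A trial with a parent restores weights from the parent's directory.
        if parent_id is not None:
            params['restore_path'] = os.path.join(self.save_dir_root, str(parent_id))
        return params
```

A tuner could merge the returned dictionary into the hyper-parameters it hands out, so that the trial finds both paths in the result of nni.get_next_parameters().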

NFS Setup

NFS follows a client-server architecture: an NFS server provides the physical storage, and trials on remote machines with an NFS client can read and write those files in the same way that they access local files.

NFS Server

Any machine can serve as the NFS server, as long as it provides enough physical storage and a network connection to the remote machines running NNI trials. Usually you can choose one of the remote machines as the NFS server.

On Ubuntu, install NFS server through apt-get:

sudo apt-get install nfs-kernel-server

Suppose /tmp/nni/shared is used as the physical storage, then run:

mkdir -p /tmp/nni/shared
echo "/tmp/nni/shared *(rw,sync,no_subtree_check,no_root_squash)" | sudo tee -a /etc/exports
sudo service nfs-kernel-server restart

You can check whether the above directory was successfully exported by NFS using sudo showmount -e localhost.

NFS Client

For a trial on a remote machine to access the shared files over NFS, an NFS client needs to be installed. For example, on Ubuntu:

sudo apt-get install nfs-common

Then create the mount point and mount the shared directory:

sudo mkdir -p /mnt/nfs/nni/
sudo mount -t nfs <nfs-server-ip>:/tmp/nni/shared /mnt/nfs/nni

where <nfs-server-ip> should be replaced by the real IP of the NFS server machine in practice.
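Before starting trials, it may help to confirm that the mounted directory is actually readable and writable from the trial machine. The following is a minimal sketch; the function name check_shared_dir is hypothetical, and it only verifies a write followed by a read on one machine, not visibility from other clients.

```python
import os
import uuid


def check_shared_dir(shared_dir):
    """Return True if a file written under shared_dir can be read back."""
    # Use a unique marker name so concurrent checks do not collide.
    marker = os.path.join(shared_dir, 'nni_check_' + uuid.uuid4().hex)
    payload = 'ok'
    try:
        with open(marker, 'w') as f:
            f.write(payload)
        with open(marker) as f:
            return f.read() == payload
    except OSError:
        # Missing directory, permission error, stale mount, etc.
        return False
    finally:
        # Clean up the marker file if it was created.
        if os.path.exists(marker):
            os.remove(marker)
```

For example, check_shared_dir('/mnt/nfs/nni') could be called at the top of the trial script, which can then exit early with a clear message if the mount is unavailable.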

Asynchronous Dispatcher Mode for trial dependency control

Weight sharing lets trials on different machines read each other's weights, so read-after-write consistency must usually be guaranteed: a child model should not load the parent model's weights before the parent trial has finished training. To handle this, users can enable asynchronous dispatcher mode by setting multiThread: true in NNI's config.yml. In this mode the dispatcher assigns a tuner thread each time a NEW_TRIAL request comes in, and the tuner thread decides when to submit the new trial by blocking and unblocking itself. For example:

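    def generate_parameters(self, parameter_id):
        # assumes self.thread_lock (threading.Lock) and self.events (dict) are set up in __init__
        with self.thread_lock:
            indiv = ...  # configuration for a new trial
            self.events[parameter_id] = threading.Event()
        if indiv.parent_id is not None:
            # block this tuner thread until the parent trial has reported its result
            self.events[indiv.parent_id].wait()

    def receive_trial_result(self, parameter_id, parameters, reward):
        with self.thread_lock:
            pass  # code for processing trial results
        # unblock any child trial waiting on this one
        self.events[parameter_id].set()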
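The blocking and unblocking pattern can be tried out on its own, without NNI. The sketch below is a standalone illustration using threading.Event; the class DependencyGate, its methods, and the demo function are hypothetical names introduced here, and the child thread only stands in for a tuner thread waiting on its parent trial.

```python
import threading


class DependencyGate:
    """Hypothetical helper (not part of NNI): one Event per trial."""

    def __init__(self):
        self._lock = threading.Lock()
        self._events = {}

    def register(self, trial_id):
        # Create the event that will signal this trial's completion.
        with self._lock:
            self._events[trial_id] = threading.Event()

    def wait_for(self, parent_id, timeout=None):
        # Block the calling thread until the parent trial is marked done.
        with self._lock:
            event = self._events[parent_id]
        return event.wait(timeout)

    def mark_done(self, trial_id):
        # Unblock every thread waiting on this trial.
        with self._lock:
            event = self._events[trial_id]
        event.set()


def demo():
    gate = DependencyGate()
    order = []
    gate.register('parent')
    gate.register('child')

    def child():
        gate.wait_for('parent')  # child must not start before the parent finishes
        order.append('child starts')

    t = threading.Thread(target=child)
    t.start()
    order.append('parent finished')
    gate.mark_done('parent')
    t.join()
    return order
```

Because the child thread only proceeds after the parent's event is set, the recorded order is the same no matter which thread is scheduled first.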


For details, please refer to this simple weight sharing example. We also provide a practical example for reading comprehension, based on the previous ga_squad example.