pull code #86

Merged
merged 15 commits into from
May 25, 2020
1 change: 1 addition & 0 deletions Makefile
@@ -167,6 +167,7 @@ install-dependencies: $(NNI_NODE_TARBALL) $(NNI_YARN_TARBALL)
.PHONY: install-python-modules
install-python-modules:
#$(_INFO) Installing Python SDK $(_END)
sed -ie 's/$(NNI_VERSION_TEMPLATE)/$(NNI_VERSION_VALUE)/' src/sdk/pynni/nni/__init__.py
sed -ie 's/$(NNI_VERSION_TEMPLATE)/$(NNI_VERSION_VALUE)/' setup.py && $(PIP_INSTALL) $(PIP_MODE) .

.PHONY: dev-install-python-modules
2 changes: 1 addition & 1 deletion README.md
@@ -342,7 +342,7 @@ With authors' permission, we listed a set of NNI usage examples and relevant art
Join IM discussion groups:
|Gitter||WeChat|
|----|----|----|
|![image](https://user-images.githubusercontent.com/39592018/80665738-e0574a80-8acc-11ea-91bc-0836dc4cbf89.png)| OR |![image](https://github.com/JSong-Jia/NNI-user-group/blob/master/user%20group%20code_0512.jpg)|
|![image](https://user-images.githubusercontent.com/39592018/80665738-e0574a80-8acc-11ea-91bc-0836dc4cbf89.png)| OR |![image](https://github.com/JSong-Jia/NNI-user-group/blob/master/user%20group%20code_0512.png)|


## Related Projects
1 change: 1 addition & 0 deletions deployment/pypi/Makefile
@@ -47,6 +47,7 @@ build:
cp $(CWD)../../src/nni_manager/package.json $(CWD)nni
sed -ie 's/$(NNI_VERSION_TEMPLATE)/$(NNI_VERSION_VALUE)/' $(CWD)nni/package.json
cd $(CWD)nni && $(NNI_YARN) --prod
sed -ie 's/$(NNI_VERSION_TEMPLATE)/$(NNI_VERSION_VALUE)/' $(CWD)../../src/sdk/pynni/nni/__init__.py
cd $(CWD) && sed -ie 's/$(NNI_VERSION_TEMPLATE)/$(NNI_VERSION_VALUE)/' setup.py && python3 setup.py bdist_wheel -p $(WHEEL_SPEC)
cd $(CWD)

2 changes: 2 additions & 0 deletions deployment/pypi/install.ps1
@@ -60,6 +60,8 @@ Copy-Item $CWD\..\..\src\nni_manager\package.json $CWD\nni
(Get-Content $CWD\nni\package.json).replace($NNI_VERSION_TEMPLATE, $NNI_VERSION_VALUE) | Set-Content $CWD\nni\package.json
cd $CWD\nni
yarn --prod
cd $CWD\..\..\src\sdk\pynni\nni
(Get-Content __init__.py).replace($NNI_VERSION_TEMPLATE, $NNI_VERSION_VALUE) | Set-Content __init__.py
cd $CWD
(Get-Content setup.py).replace($NNI_VERSION_TEMPLATE, $NNI_VERSION_VALUE) | Set-Content setup.py
python setup.py bdist_wheel -p $WHEEL_SPEC
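The three build scripts above apply the same substitution: the version placeholder in `__init__.py` and `setup.py` is replaced with the release version. A minimal Python sketch of that step follows. The function name `stamp_version` and the placeholder string `VERSION_PLACEHOLDER` are assumptions made for illustration; the real values of `NNI_VERSION_TEMPLATE` and `NNI_VERSION_VALUE` are not shown in this diff.

```python
import tempfile
from pathlib import Path


def stamp_version(path: Path, template: str, value: str) -> bool:
    """Replace the first occurrence of `template` on each line of `path`.

    This mirrors `sed 's/TEMPLATE/VALUE/'`, except that `template` is
    treated as a literal string rather than a regular expression.
    Returns True if the file content changed.
    """
    original = path.read_text(encoding="utf-8")
    stamped = "".join(
        line.replace(template, value, 1)
        for line in original.splitlines(keepends=True)
    )
    if stamped != original:
        path.write_text(stamped, encoding="utf-8")
    return stamped != original


# Usage on a throwaway file; the placeholder and version are made up.
with tempfile.TemporaryDirectory() as tmp:
    init_py = Path(tmp) / "__init__.py"
    init_py.write_text("__version__ = 'VERSION_PLACEHOLDER'\n", encoding="utf-8")
    stamp_version(init_py, "VERSION_PLACEHOLDER", "1.6")
    print(init_py.read_text(encoding="utf-8"), end="")  # __version__ = '1.6'
```

Running it a second time leaves the file untouched, because the placeholder is already gone.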
12 changes: 6 additions & 6 deletions docs/en_US/TrainingService/PaiMode.md
@@ -7,9 +7,9 @@ Step 1. Install NNI, follow the install guide [here](../Tutorial/QuickStart.md).

Step 2. Get PAI token.
Click the `My profile` button at the top right of PAI's webportal.
![](../../img/pai_token_button.jpg)
Find the token management region, copy one of the token as your account token.
![](../../img/pai_token_profile.jpg)
![](../../img/pai_profile.jpg)
Click the `copy` button on the page to copy a JWT token.
![](../../img/pai_token.jpg)

Step 3. Mount NFS storage to local machine.
Click `Submit job` button in PAI's webportal.
@@ -19,7 +19,7 @@ Step 3. Mount NFS storage to local machine.
The `DEFAULT_STORAGE` field is the path to be mounted in PAI's container when a job is started. `Preview container paths` shows the NFS host and path that PAI provides. You need to mount that host and path on your local machine first; NNI can then use PAI's NFS storage.
For example, use the following command:
```
sudo mount nfs://gcr-openpai-infra02:/pai/data /local/mnt
sudo mount -t nfs4 gcr-openpai-infra02:/pai/data /local/mnt
```
Then the `/data` folder in container will be mounted to `/local/mnt` folder in your local machine.
You could use the following configuration in your NNI's config file:
@@ -66,15 +66,15 @@ trial:
virtualCluster: default
nniManagerNFSMountPath: /home/user/mnt
containerNFSMountPath: /mnt/data/user
paiStoragePlugin: team_wise
paiStoragePlugin: teamwise_storage
# Configuration to access OpenPAI Cluster
paiConfig:
userName: your_pai_nni_user
token: your_pai_token
host: 10.1.1.1
```

Note: You should set `trainingServicePlatform: pai` in NNI config YAML file if you want to start experiment in pai mode.
Note: You should set `trainingServicePlatform: pai` in the NNI config YAML file if you want to start an experiment in pai mode. The `host` field in the configuration file is the URI of PAI's job submission page, such as `10.10.5.1`. NNI uses `http` by default; if your PAI cluster has HTTPS enabled, give the URI in the form `https://10.10.5.1`.
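The host rule in the note can be sketched in Python as follows. `pai_base_url` is a hypothetical helper written for illustration, not NNI's actual implementation; it only shows the described behavior, where a bare host defaults to `http` and an explicit scheme is kept as given.

```python
def pai_base_url(host: str) -> str:
    """Return `host` with an explicit scheme, defaulting to http://.

    Hypothetical helper: it illustrates the rule in the note above and
    is not taken from the NNI code base.
    """
    host = host.strip().rstrip("/")
    if host.startswith(("http://", "https://")):
        return host  # the user chose the protocol explicitly
    return f"http://{host}"  # bare host or IP: fall back to http


print(pai_base_url("10.10.5.1"))          # http://10.10.5.1
print(pai_base_url("https://10.10.5.1"))  # https://10.10.5.1
```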

Compared with [LocalMode](LocalMode.md) and [RemoteMachineMode](RemoteMachineMode.md), trial configuration in pai mode have these additional keys:
* cpuNum
46 changes: 41 additions & 5 deletions docs/en_US/TrainingService/RemoteMachineMode.md
@@ -2,18 +2,54 @@

NNI can run one experiment on multiple remote machines through SSH, called `remote` mode. It's like a lightweight training platform. In this mode, NNI can be started from your computer, and dispatch trials to remote machines in parallel.

## Remote machine requirements
Remote machines can run `Linux`, `Windows 10`, or `Windows Server 2019`.

* It only supports Linux as remote machines, and [linux part in system specification](../Tutorial/InstallationLinux.md) is same as NNI local mode.
## Requirements

* Follow [installation](../Tutorial/InstallationLinux.md) to install NNI on each machine.

* Make sure remote machines meet environment requirements of your trial code. If the default environment does not meet the requirements, the setup script can be added into `command` field of NNI config.
* Make sure the default environment of remote machines meets requirements of your trial code. If the default environment does not meet the requirements, the setup script can be added into `command` field of NNI config.

* Make sure remote machines can be accessed through SSH from the machine which runs `nnictl` command. It supports both password and key authentication of SSH. For advanced usages, please refer to [machineList part of configuration](../Tutorial/ExperimentConfig.md).

* Make sure the NNI version on each machine is consistent.

* Make sure the trial command is compatible with every remote OS if you want to use remote Linux and Windows machines together. For example, the default Python 3.x executable is called `python3` on Linux and `python` on Windows.
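To illustrate the last requirement, here is a small Python sketch that picks the interpreter name for each remote OS. Both helper names are hypothetical, and the platform strings follow the `sys.platform` convention (`win32`, `linux`); NNI itself does not necessarily work this way.

```python
import sys


def python_executable_name(platform: str = sys.platform) -> str:
    """Return the usual Python 3 launcher name for a sys.platform-style string."""
    # sys.platform is "win32" on Windows and "linux" on Linux.
    return "python" if platform.startswith("win") else "python3"


def trial_command(script: str, platform: str) -> str:
    """Build a trial command for one remote machine (hypothetical helper)."""
    return f"{python_executable_name(platform)} {script}"


print(trial_command("mnist.py", "linux"))  # python3 mnist.py
print(trial_command("mnist.py", "win32"))  # python mnist.py
```

Since a single experiment config carries one trial command, a mixed Linux and Windows setup needs a command that resolves on both, for example by making the same executable name available on every machine.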

### Linux

* Follow [installation](../Tutorial/InstallationLinux.md) to install NNI on the remote machine.

### Windows

* Follow [installation](../Tutorial/InstallationWin.md) to install NNI on the remote machine.

* Install and start `OpenSSH Server`.

1. Open `Settings` app on Windows.

2. Click `Apps`, then click `Optional features`.

3. Click `Add a feature`, search and select `OpenSSH Server`, and then click `Install`.

4. Once it's installed, run below command to start and set to automatic start.

```bat
sc config sshd start=auto
net start sshd
```

* Make sure remote account is administrator, so that it can stop running trials.

* Make sure there is no welcome message beyond the default one, since extra output causes the ssh2 library in Node.js to fail. For example, if you're using Data Science VM on Azure, remove the extra echo commands in `C:\dsvm\tools\setup\welcome.bat`.

Output like the following is fine when opening a new command window.

```text
Microsoft Windows [Version 10.0.17763.1192]
(c) 2018 Microsoft Corporation. All rights reserved.

(py37_default) C:\Users\AzureUser>
```

## Run an experiment

e.g. there are three machines, which can be logged in with username and password.
66 changes: 39 additions & 27 deletions docs/en_US/Tutorial/InstallationWin.md
@@ -1,46 +1,56 @@
# Install on Windows

## Installation
## Prerequisites

Anaconda or Miniconda is highly recommended to manage multiple Python environments.
* Python 3.5 (or above) 64-bit. [Anaconda](https://www.anaconda.com/products/individual) or [Miniconda](https://docs.conda.io/en/latest/miniconda.html) is highly recommended to manage multiple Python environments on Windows.

### Install NNI through pip
* In a newly installed Python environment, install [Microsoft C++ Build Tools](https://visualstudio.microsoft.com/visual-cpp-build-tools/) so that NNI dependencies such as `scikit-learn` can be built.

Prerequisites: `python 64-bit >= 3.5`
```bat
pip install cython wheel
```

```bash
python -m pip install --upgrade nni
```
* git for verifying installation.

### Install NNI through source code
## Install NNI

If you are interested in special or the latest code versions, you can install NNI through source code.
In most cases, you can install and upgrade NNI from pip package. It's easy and fast.

Prerequisites: `python 64-bit >=3.5`, `git`, `PowerShell`.
If you are interested in special or the latest code versions, you can install NNI through source code.

```bash
git clone -b v1.5 https://github.com/Microsoft/nni.git
cd nni
powershell -ExecutionPolicy Bypass -file install.ps1
```
If you want to contribute to NNI, refer to [setup development environment](SetupNniDeveloperEnvironment.md).

* From pip package

```bat
python -m pip install --upgrade nni
```

* From source code

```bat
git clone -b v1.5 https://github.com/Microsoft/nni.git
cd nni
powershell -ExecutionPolicy Bypass -file install.ps1
```

## Verify installation

The following example is built on TensorFlow 1.x. Make sure **TensorFlow 1.x is used** when running it.

* Download the examples via clone the source code.
* Clone examples within source code.

```bash
git clone -b v1.5 https://github.com/Microsoft/nni.git
```
```bat
git clone -b v1.5 https://github.com/Microsoft/nni.git
```

* Run the MNIST example.

```bash
nnictl create --config nni\examples\trials\mnist-tfv1\config_windows.yml
```
```bat
nnictl create --config nni\examples\trials\mnist-tfv1\config_windows.yml
```

Note: for other examples you need to change trial command `python3` to `python` in each example YAML, if python3 is called through `python` on your machine.
Note: If you are familiar with other frameworks, you can choose the corresponding example under `examples\trials`. You need to change the trial command `python3` to `python` in each example YAML, since the default installation provides `python.exe`, not `python3.exe`.

* Wait for the message `INFO: Successfully started experiment!` in the command line. This message indicates that your experiment has been successfully started. You can explore the experiment using the `Web UI url`.
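The note about changing `python3` to `python` in the example YAML files can be applied by hand, or with a small script along these lines. This is a sketch under stated assumptions: `use_windows_python` is a hypothetical helper, and it rewrites a command string only, leaving the reading and writing of the YAML file to the caller.

```python
import re

# Match `python3` only as a standalone token, so that names such as
# `python3.7` or `mypython3` are left alone.
_PYTHON3_TOKEN = re.compile(r"(?<![\w.-])python3(?![\w.-])")


def use_windows_python(command: str) -> str:
    """Replace standalone `python3` tokens in a trial command with `python`."""
    return _PYTHON3_TOKEN.sub("python", command)


print(use_windows_python("python3 mnist.py"))  # python mnist.py
```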

@@ -112,18 +122,20 @@ If there is a stderr file, please check it. Two possible cases are:
* forgetting to install experiment dependencies such as TensorFlow, Keras and so on.

### Fail to use BOHB on Windows

Make sure a C++ 14.0 compiler is installed when trying to run `nnictl package install --name=BOHB` to install the dependencies.

### Not supported tuner on Windows

SMAC is not supported currently; for the specific reason refer to this [GitHub issue](https://github.com/automl/SMAC3/issues/483).

### Use a Windows server as a remote worker
Currently, you can't.
### Use Windows as a remote worker

Note:
Refer to [Remote Machine mode](../TrainingService/RemoteMachineMode.md).

* If an error like `Segmentation fault` is encountered, please refer to the [FAQ](FAQ.md)
### Segmentation fault (core dumped) when installing

Refer to [FAQ](FAQ.md).

## Further reading

69 changes: 26 additions & 43 deletions docs/en_US/Tutorial/SetupNniDeveloperEnvironment.md
@@ -1,76 +1,59 @@
**Set up NNI developer environment**
# Setup NNI development environment

===
The NNI development environment supports Ubuntu 16.04 (or above) and Windows 10 with 64-bit Python 3.

## Best practice for debug NNI source code
## Installation

For debugging NNI source code, your development environment should be under Ubuntu 16.04 (or above) system with python 3 and pip 3 installed, then follow the below steps.
The installation steps are similar to installing from source code, but the installation links to the code directory, so that code changes take effect in the installation as easily as possible.

### 1. Clone the source code
### 1. Clone source code

Run the command

```
```bash
git clone https://github.com/Microsoft/nni.git
```

to clone the source code
Note: if you want to contribute code back, fork your own NNI repo and clone from there.

### 2. Prepare the debug environment and install dependencies
### 2. Install from source code

Change directory to the source code folder, then run the command
#### Ubuntu

```bash
make dev-easy-install
```
make install-dependencies
```

to install the dependent tools for the environment

### 3. Build source code

Run the command
#### Windows

```bat
powershell -ExecutionPolicy Bypass -file install.ps1 -Development
```
make build
```

to build the source code

### 4. Install NNI to development environment

Run the command

```
make dev-install
```

to install the distribution content to development environment, and create cli scripts

### 5. Check if the environment is ready
### 3. Check if the environment is ready

Now, you can try to start an experiment to check if your environment is ready.
For example, run the command

```
nnictl create --config ~/nni/examples/trials/mnist-tfv1/config.yml
```bash
nnictl create --config examples/trials/mnist-tfv1/config.yml
```

And open WebUI to check if everything is OK

### 6. Redeploy

After the code changes, it may need to redeploy. It depends on what kind of code changed.
### 4. Reload changes

#### Python

It doesn't need to redeploy, but the nnictl may need to be restarted.
Nothing to do, the code is already linked to package folders.

#### TypeScript

* If `src/nni_manager` is changed, run `yarn watch` continually under this folder. It will rebuild code instantly. The nnictl may need to be restarted to reload NNI manager.
* If `src/nni_manager` is changed, run `yarn watch` under this folder. It will watch and rebuild the code continually. `nnictl` needs to be restarted to reload the NNI manager.
* If `src/webui` or `src/nasui` are changed, run `yarn start` under the corresponding folder. The web UI will refresh automatically if code is changed.

### 5. Submit Pull Request

All changes are merged to the master branch from your forked repo. The description of the Pull Request must be meaningful and useful.

We will review the changes as soon as possible. Once it passes review, we will merge it to master branch.

---
At last, wish you have a wonderful day.
For more contribution guidelines on making PR's or issues to NNI source code, you can refer to our [Contributing](Contributing.md) document.
For more contribution guidelines and coding styles, you can refer to the [contributing document](Contributing.md).
Binary file modified docs/img/pai_job_submission_page.jpg
Binary file added docs/img/pai_profile.jpg
Binary file added docs/img/pai_token.jpg
Binary file removed docs/img/pai_token_profile.jpg
Binary file not shown.
12 changes: 12 additions & 0 deletions examples/nas/enas-tf/datasets.py
@@ -0,0 +1,12 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT license.

import tensorflow as tf
from tensorflow.data import Dataset

def get_dataset():
    (x_train, y_train), (x_valid, y_valid) = tf.keras.datasets.cifar10.load_data()
    x_train, x_valid = x_train / 255.0, x_valid / 255.0
    train_set = (x_train, y_train)
    valid_set = (x_valid, y_valid)
    return train_set, valid_set
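`get_dataset()` returns each split as a plain `(features, labels)` tuple rather than a `tf.data.Dataset`. How the ENAS example consumes these tuples is not shown in this diff. As a hedged illustration only, a caller could iterate over such a tuple in mini-batches with a helper like the one below. `iter_batches` is a hypothetical name; it relies only on `len()` and slicing, so it works on Python lists as well as NumPy arrays.

```python
from typing import Iterator, Sequence, Tuple


def iter_batches(
    features: Sequence, labels: Sequence, batch_size: int
) -> Iterator[Tuple[Sequence, Sequence]]:
    """Yield aligned (features, labels) slices of at most `batch_size` items."""
    if len(features) != len(labels):
        raise ValueError("features and labels must have the same length")
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(features), batch_size):
        end = start + batch_size
        yield features[start:end], labels[start:end]


# Toy data standing in for (x_train, y_train); the last batch is shorter.
for x_batch, y_batch in iter_batches([1, 2, 3, 4, 5], [0, 1, 0, 1, 0], 2):
    print(x_batch, y_batch)
```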