SparkSnail · SparkSnail · Dec 10, 2019 · Nov 28, 2019 · Nov 28, 2019 · Nov 29, 2019
diff --git a/README.md b/README.md
@@ -18,7 +18,7 @@ NNI (Neural Network Intelligence) is a toolkit to help users run automated machi
 The tool dispatches and runs trial jobs generated by tuning algorithms to search the best neural architecture and/or hyper-parameters in different environments like local machine, remote servers and cloud.
 
 
-### **NNI v1.1 has been released! &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>**
+### **NNI v1.2 has been released! &nbsp;<a href="#nni-released-reminder"><img width="48" src="docs/img/release_icon.png"></a>**
 
 <p align="center">
   <a href="#nni-has-been-released"><img src="docs/img/overview.svg" /></a>
@@ -34,7 +34,7 @@ The tool dispatches and runs trial jobs generated by tuning algorithms to search
         <img src="docs/img/bar.png"/>
       </td>
       <td>
-        <b>Tuning Algorithms</b>
+        <b>Algorithms</b>
         <img src="docs/img/bar.png"/>
       </td>
       <td>
@@ -83,13 +83,8 @@ The tool dispatches and runs trial jobs generated by tuning algorithms to search
       <td align="left" >
         <a href="docs/en_US/Tuner/BuiltinTuner.md">Tuner</a>
         <ul>
-          <li><b>General Tuner</b></li>
-          <ul>
           <li><a href="docs/en_US/Tuner/BuiltinTuner.md#Random">Random Search</a></li>
           <li><a href="docs/en_US/Tuner/BuiltinTuner.md#Evolution">Naïve Evolution</a></li>
-          </ul>    
-          <li><b>Tuner for <a href="docs/en_US/CommunitySharings/HpoComparision.md">HPO</a></b></li>
-          <ul>
           <li><a href="docs/en_US/Tuner/BuiltinTuner.md#TPE">TPE</a></li>
           <li><a href="docs/en_US/Tuner/BuiltinTuner.md#Anneal">Anneal</a></li>
           <li><a href="docs/en_US/Tuner/BuiltinTuner.md#SMAC">SMAC</a></li>
@@ -99,20 +94,34 @@ The tool dispatches and runs trial jobs generated by tuning algorithms to search
           <li><a href="docs/en_US/Tuner/BuiltinTuner.md#MetisTuner">Metis Tuner</a></li>
           <li><a href="docs/en_US/Tuner/BuiltinTuner.md#BOHB">BOHB</a></li>
           <li><a href="docs/en_US/Tuner/BuiltinTuner.md#GPTuner">GP Tuner</a></li>
-          </ul>
-          <li><b>Tuner for <a href="docs/en_US/AdvancedFeature/GeneralNasInterfaces.md">NAS</a></b></li>
-          <ul>
+          <li><a href="docs/en_US/Tuner/BuiltinTuner.md#PPOTuner">PPO Tuner</a></li>
           <li><a href="docs/en_US/Tuner/BuiltinTuner.md#NetworkMorphism">Network Morphism</a></li>
-          <li><a href="examples/tuners/enas_nni/README.md">ENAS</a></li>
-          </ul>
         </ul>
           <a href="docs/en_US/Assessor/BuiltinAssessor.md">Assessor</a>
           <ul>
-          <ul>
           <li><a href="docs/en_US/Assessor/BuiltinAssessor.md#Medianstop">Median Stop</a></li>
           <li><a href="docs/en_US/Assessor/BuiltinAssessor.md#Curvefitting">Curve Fitting</a></li>   
           </ul>
-          </ul>  
+          <a href="docs/en_US/NAS/Overview.md">NAS (Beta)</a>
+          <ul>
+          <li><a href="docs/en_US/NAS/Overview.md#enas">ENAS</a></li>
+          <li><a href="docs/en_US/NAS/Overview.md#darts">DARTS</a></li>
+          <li><a href="docs/en_US/NAS/Overview.md#p-darts">P-DARTS</a></li>
+          </ul>
+          <a href="docs/en_US/Compressor/Overview.md">Model Compression (Beta)</a>
+          <ul>
+          <li><a href="docs/en_US/Compressor/Pruner.md#agp-pruner">AGP Pruner</a></li>
+          <li><a href="docs/en_US/Compressor/Pruner.md#slim-pruner">Slim Pruner</a></li>
+          <li><a href="docs/en_US/Compressor/Pruner.md#fpgm-pruner">FPGM Pruner</a></li>
+          <li><a href="docs/en_US/Compressor/Quantizer.md#qat-quantizer">QAT Quantizer</a></li>
+          <li><a href="docs/en_US/Compressor/Quantizer.md#dorefa-quantizer">DoReFa Quantizer</a></li>
+          <li><a href="docs/en_US/Compressor/Overview.md">More...</a></li>
+          </ul>
+          <a href="docs/en_US/FeatureEngineering/Overview.md">Feature Engineering (Beta)</a>
+          <ul>
+          <li><a href="docs/en_US/FeatureEngineering/GradientFeatureSelector.md">GradientFeatureSelector</a></li>
+          <li><a href="docs/en_US/FeatureEngineering/GBDTSelector.md">GBDTSelector</a></li>
+          </ul>
       </td>
       <td>
       <ul>
@@ -211,7 +220,7 @@ Linux and MacOS
 * Run the following commands in an environment that has `python >= 3.5`, `git` and `wget`.
 
 ```bash
-    git clone -b v1.1 https://github.com/Microsoft/nni.git
+    git clone -b v1.2 https://github.com/Microsoft/nni.git
     cd nni
     source install.sh
 ```
@@ -221,7 +230,7 @@ Windows
 * Run the following commands in an environment that has `python >=3.5`, `git` and `PowerShell`
 
 ```bash
-  git clone -b v1.1 https://github.com/Microsoft/nni.git
+  git clone -b v1.2 https://github.com/Microsoft/nni.git
   cd nni
   powershell -ExecutionPolicy Bypass -file install.ps1
 ```
@@ -237,7 +246,7 @@ The following example is an experiment built on TensorFlow. Make sure you have *
 * Download the examples via clone the source code.
 
 ```bash
-    git clone -b v1.1 https://github.com/Microsoft/nni.git
+    git clone -b v1.2 https://github.com/Microsoft/nni.git
 ```
 
 Linux and MacOS

diff --git a/azure-pipelines.yml b/azure-pipelines.yml
@@ -38,8 +38,8 @@ jobs:
     displayName: 'Run pylint'
   - script: |
       python3 -m pip install flake8 --user
-      IGNORE=./tools/nni_annotation/testcase/*:F821,./examples/trials/mnist-nas/*/mnist*.py:F821,./examples/trials/nas_cifar10/src/cifar10/general_child.py:F821
-      python3 -m flake8 . --count --per-file-ignores=$IGNORE --select=E9,F63,F72,F82 --show-source --statistics
+      EXCLUDES=./src/nni_manager/,./tools/nni_annotation/testcase/,./examples/trials/mnist-nas/*/mnist*.py,./examples/trials/nas_cifar10/src/cifar10/general_child.py
+      python3 -m flake8 . --count --exclude=$EXCLUDES --select=E9,F63,F72,F82 --show-source --statistics
     displayName: 'Run flake8 tests to find Python syntax errors and undefined names'
   - script: |
       cd test

diff --git a/deployment/pypi/setup.py b/deployment/pypi/setup.py
@@ -62,7 +62,7 @@
         'scipy',
         'coverage',
         'colorama',
-        'sklearn'
+        'scikit-learn==0.20'
     ],
     classifiers = [
         'Programming Language :: Python :: 3',

diff --git a/docs/en_US/FeatureEngineering/Overview.md b/docs/en_US/FeatureEngineering/Overview.md
@@ -241,17 +241,17 @@ print("Pipeline Score: ", pipeline.score(X_train, y_train))
 
 # Benchmark
 
-`Baseline` means without any feature selection, we directly pass the data to LogisticRegression. For this benchmark, we only use 10% data from the train as test data.
-
-| Dataset | Baseline | GradientFeatureSelector | TreeBasedClassifier | #Train | #Feature | 
-| ----------- | ------ | ------ | ------- | ------- | -------- |
-| colon-cancer | 0.7547 | 0.7368 | 0.7223 | 62 | 2,000 |
-| gisette | 0.9725 | 0.89416 | 0.9792 | 6,000 | 5,000 |
-| avazu | 0.8834 | N/A | N/A | 40,428,967 | 1,000,000 |
-| rcv1 | 0.9644 | 0.7333 | 0.9615 | 20,242 | 47,236 |
-| news20.binary | 0.9208 | 0.6870 | 0.9070 | 19,996 | 1,355,191 |
-| real-sim | 0.9681 | 0.7969 | 0.9591 | 72,309 | 20,958 |
-
-The benchmark could be download in [here](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
+`Baseline` means no feature selection is applied: the data is passed directly to LogisticRegression. For this benchmark, we hold out 10% of the training data as test data. For the GradientFeatureSelector top20 results, we keep only the top 20 features. The metric is the mean accuracy on the given test data and labels.
+
+| Dataset | Baseline | GradientFeatureSelector top20 | GradientFeatureSelector auto | TreeBasedClassifier | #Train | #Feature |
+| ----------- | ------ | ------ | ------- | ------- | -------- | -------- |
+| colon-cancer | 0.7547 | 0.7368 | 0.5389 | 0.7223 | 62 | 2,000 |
+| gisette | 0.9725 | 0.9241 | 0.9658 | 0.9792 | 6,000 | 5,000 |
+| rcv1 | 0.9644 | 0.7333 | 0.9548 | 0.9615 | 20,242 | 47,236 |
+| news20.binary | 0.9208 | 0.8780 | 0.8875 | 0.9070 | 19,996 | 1,355,191 |
+| real-sim | 0.9681 | 0.7969 | 0.9439 | 0.9591 | 72,309 | 20,958 |
+
+The benchmark datasets can be downloaded [here](https://www.csie.ntu.edu.tw/~cjlin/libsvmtools/datasets/
 )
 
+The benchmark code is available in `/examples/feature_engineering/gradient_feature_selector/benchmark_test.py`.
diff --git a/docs/en_US/NAS/DARTS.md b/docs/en_US/NAS/DARTS.md
@@ -0,0 +1,18 @@
+# DARTS on NNI
+
+## Introduction
+
+The paper [DARTS: Differentiable Architecture Search](https://arxiv.org/abs/1806.09055) addresses the scalability challenge of architecture search by formulating the task in a differentiable manner. Their method is based on a continuous relaxation of the architecture representation, allowing efficient search of the architecture using gradient descent.
+
+To implement this, the authors optimize the network weights and the architecture weights alternately in mini-batches. They further explore using second-order optimization (unroll) instead of first-order to improve performance.
+
+Implementation on NNI is based on the [official implementation](https://github.com/quark0/darts) and a [popular 3rd-party repo](https://github.com/khanrc/pt.darts). So far, first and second order optimization and training from scratch on CIFAR10 have been implemented.
+
+## Reproduce Results
+
+To reproduce the results in the paper, we ran experiments with first- and second-order optimization. Due to time limits, we retrained *only the best architecture* derived from the search phase and repeated the experiment *only once*. Our results are currently on par with those reported in the paper. We will add more results when they are ready.
+
+|                        | In paper      | Reproduction |
+| ---------------------- | ------------- | ------------ |
+| First order (CIFAR10)  | 3.00 +/- 0.14 | 2.78         |
+| Second order (CIFAR10) | 2.76 +/- 0.09 | 2.89         |
diff --git a/docs/en_US/NAS/ENAS.md b/docs/en_US/NAS/ENAS.md
@@ -0,0 +1,7 @@
+# ENAS on NNI
+
+## Introduction
+
+The paper [Efficient Neural Architecture Search via Parameter Sharing](https://arxiv.org/abs/1802.03268) uses parameter sharing between child models to accelerate the NAS process. In ENAS, a controller learns to discover neural network architectures by searching for an optimal subgraph within a large computational graph. The controller is trained with policy gradient to select a subgraph that maximizes the expected reward on the validation set. Meanwhile the model corresponding to the selected subgraph is trained to minimize a canonical cross entropy loss.
+
+The implementation on NNI is based on the [official implementation in TensorFlow](https://github.com/melodyguan/enas) and includes the macro and micro search spaces on CIFAR10. Since the code to train from scratch on NNI is not ready yet, reproduction results are currently unavailable.
diff --git a/docs/en_US/NAS/NasInterface.md b/docs/en_US/NAS/NasInterface.md
@@ -55,7 +55,7 @@ def forward(self, x):
     out = self.input_switch([in_tensor1, in_tensor2, in_tensor3])
     ...
 ```
-`InputChoice` is a PyTorch module, in init, it needs meta information, for example, from how many input candidates to choose how many inputs, the name of this initialized `InputChoice`. The real candidate input tensors can only be obtained in `forward` function. In `forward`, `InputChoice` instance is called with real candidate input tensors.
+`InputChoice` is a PyTorch module. In `__init__`, it needs meta information, for example, how many inputs to choose from how many input candidates, and the name of this `InputChoice`. The real candidate input tensors can only be obtained in the `forward` function. In `forward`, the `InputChoice` module you create in `__init__` (e.g., `self.input_switch`) is called with the real candidate input tensors.
 
 Some [NAS trainers](#one-shot-training-mode) need to know the source layer the input tensors, thus, we add one input argument `choose_from` in `InputChoice` to indicate the source layer of each candidate input. `choose_from` is a list of string, each element is `key` of `LayerChoice` and `InputChoice` or the name of a module (refer to [the code](https://github.com/microsoft/nni/blob/master/src/sdk/pynni/nni/nas/pytorch/mutables.py) for more details).
 
@@ -102,8 +102,6 @@ Different trainers could have different input arguments depending on their algor
 
 The supported trainers can be found [here](./Overview.md#supported-one-shot-nas-algorithms). A very simple example using NNI NAS API can be found [here](https://github.com/microsoft/nni/tree/master/examples/nas/simple/train.py).
 
-The complete example code can be found [here]().
-
 ### Classic distributed search
 
 Neural architecture search is originally executed by running each child model independently as a trial job. We also support this searching approach, and it naturally fits in NNI hyper-parameter tuning framework, where tuner generates child model for next trial and trials run in training service.
@@ -174,4 +172,4 @@ Correspondingly, the generate architecture is in the following format:
         "_idex": [1]
     }
 }
-```
+```
diff --git a/docs/en_US/NAS/Overview.md b/docs/en_US/NAS/Overview.md
@@ -6,11 +6,11 @@ However, it takes great efforts to implement NAS algorithms, and it is hard to r
 
 With this motivation, our ambition is to provide a unified architecture in NNI, to accelerate innovations on NAS, and apply state-of-art algorithms on real world problems faster.
 
-With [the unified interface](.NasInterface.md), there are two different modes for the architecture search. [The one](#supported-one-shot-nas-algorithms) is the so-called one-shot NAS, where a super-net is built based on search space, and using one shot training to generate good-performing child model. [The other](.ClassicNas.md) is the traditional searching approach, where each child model in search space runs as an independent trial, the performance result is sent to tuner and the tuner generates new child model.
+With [the unified interface](./NasInterface.md), there are two different modes for architecture search. [The first](#supported-one-shot-nas-algorithms) is so-called one-shot NAS, where a super-net is built from the search space and one-shot training is used to generate a well-performing child model. [The other](./NasInterface.md#classic-distributed-search) is the traditional search approach, where each child model in the search space runs as an independent trial, its performance result is sent to the tuner, and the tuner generates a new child model.
 
 * [Supported One-shot NAS Algorithms](#supported-one-shot-nas-algorithms)
-* [Classic Distributed NAS with NNI experiment](.NasInterface.md#classic-distributed-search)
-* [NNI NAS Programming Interface](.NasInterface.md)
+* [Classic Distributed NAS with NNI experiment](./NasInterface.md#classic-distributed-search)
+* [NNI NAS Programming Interface](./NasInterface.md)
 
 ## Supported One-shot NAS Algorithms
 
@@ -37,7 +37,7 @@ Note, these algorithms run **standalone without nnictl**, and supports PyTorch o
 
 #### Usage
 
-ENAS in NNI is still under development and we only support search phase for macro/micro search space on CIFAR10. Training from scratch and search space on PTB has not been finished yet.
+ENAS in NNI is still under development and we only support the search phase for the macro/micro search space on CIFAR10. Training from scratch and the search space on PTB have not been finished yet. [Detailed Description](ENAS.md)
 
 ```bash
 # In case NNI code is not cloned. If the code is cloned already, ignore this line and enter code folder.
@@ -58,7 +58,7 @@ python3 search.py -h
 
 ### DARTS
 
-The main contribution of [DARTS: Differentiable Architecture Search][3] on algorithm is to introduce a novel algorithm for differentiable network architecture search on bilevel optimization.
+The main algorithmic contribution of [DARTS: Differentiable Architecture Search][3] is a novel algorithm for differentiable network architecture search based on bilevel optimization. [Detailed Description](DARTS.md)
 
 #### Usage
 

diff --git a/docs/en_US/Overview.md b/docs/en_US/Overview.md
@@ -46,6 +46,33 @@ For each experiment, user only needs to define a search space and update a few l
 
 More details about how to run an experiment, please refer to [Get Started](Tutorial/QuickStart.md).
 
+## Core Features
+
+NNI provides the key capability of running multiple instances in parallel to find the best combination of parameters. This feature can be used in various domains, such as finding the best hyperparameters for a deep learning model, or finding the best configuration for a database or other complex system with real data.
+
+NNI also aims to provide algorithm toolkits for machine learning and deep learning, especially neural architecture search (NAS) algorithms, model compression algorithms, and feature engineering algorithms.
+
+### Hyperparameter Tuning
+This is a core and basic feature of NNI. We provide many popular [automatic tuning algorithms](Tuner/BuiltinTuner.md) (i.e., tuners) and [early stop algorithms](Assessor/BuiltinAssessor.md) (i.e., assessors). You can follow [Quick Start](Tutorial/QuickStart.md) to tune your model (or system). Basically, you complete the three steps above and then start an NNI experiment.
+
+### General NAS Framework
+This NAS framework lets users easily specify candidate neural architectures; for example, a user can specify multiple candidate operations (e.g., separable conv, dilated conv) for a single layer, and specify possible skip connections. NNI will find the best candidate automatically. The NAS framework also provides a simple interface for another type of user (e.g., NAS algorithm researchers) to implement new NAS algorithms. Detailed description and usage can be found [here](NAS/Overview.md).
+
+NNI supports many one-shot NAS algorithms, such as ENAS and DARTS, through the NNI trial SDK. To use these algorithms you do not have to start an NNI experiment. Instead, import an algorithm in your trial code and simply run your trial code. If you want to tune the hyperparameters of the algorithms or run multiple instances, you can choose a tuner and start an NNI experiment.
+
+Other than one-shot NAS, NAS can also run in a classic mode where each candidate architecture runs as an independent trial job. In this mode, similar to hyperparameter tuning, users have to start an NNI experiment and choose a tuner for NAS.
+
+### Model Compression
+Model Compression on NNI includes pruning algorithms and quantization algorithms. These algorithms are provided through NNI trial SDK. Users could directly use them in their trial code and run the trial code without starting an NNI experiment. Detailed description and usage can be found [here](Compressor/Overview.md).
+
+There are different types of hyperparameters in model compression. One type is the hyperparameters in the input configuration of a compression algorithm, e.g., sparsity and quantization bits. The other type is the hyperparameters of the compression algorithms themselves. Here, NNI's hyperparameter tuning can help a lot in finding the best compressed model automatically. A simple example can be found [here](Compressor/AutoCompression.md).
+
+### Automatic Feature Engineering
+Automatic feature engineering helps users find the best features for downstream tasks. Detailed description and usage can be found [here](FeatureEngineering/Overview.md). It is supported through the NNI trial SDK, which means you do not have to create an NNI experiment. Instead, simply import a built-in auto-feature-engineering algorithm in your trial code and directly run your trial code.
+
+The auto-feature-engineering algorithms usually have a bunch of hyperparameters themselves. If you want to automatically tune those hyperparameters, you can leverage hyperparameter tuning of NNI, that is, choose a tuning algorithm (i.e., tuner) and start an NNI experiment for it.
+
+
 ## Learn More
 * [Get started](Tutorial/QuickStart.md)
 * [How to adapt your trial code on NNI?](TrialExample/Trials.md)
@@ -56,4 +83,7 @@ More details about how to run an experiment, please refer to [Get Started](Tutor
 * [How to run an experiment on local?](TrainingService/LocalMode.md)
 * [How to run an experiment on multiple machines?](TrainingService/RemoteMachineMode.md)
 * [How to run an experiment on OpenPAI?](TrainingService/PaiMode.md)
-* [Examples](TrialExample/MnistExamples.md)
+* [Examples](TrialExample/MnistExamples.md)
+* [Neural Architecture Search on NNI](NAS/Overview.md)
+* [Automatic model compression on NNI](Compressor/Overview.md)
+* [Automatic feature engineering on NNI](FeatureEngineering/Overview.md)