Skip to content
This repository has been archived by the owner on Sep 18, 2024. It is now read-only.

Update search space documentation #2637

Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
71 changes: 31 additions & 40 deletions docs/en_US/Tutorial/SearchSpaceSpec.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,9 +4,9 @@

In NNI, tuner will sample parameters/architecture according to the search space, which is defined as a json file.

To define a search space, users should define the name of variable, the type of sampling strategy and its parameters.
To define a search space, users should define the name of the variable, the type of sampling strategy and its parameters.

* An example of search space definition as follow:
* An example of a search space definition is as follow:

```yaml
{
Expand All @@ -19,83 +19,74 @@ To define a search space, users should define the name of variable, the type of

```

Take the first line as an example. `dropout_rate` is defined as a variable whose priori distribution is a uniform distribution of a range from `0.1` and `0.5`.
Take the first line as an example. `dropout_rate` is defined as a variable whose priori distribution is a uniform distribution with a range from `0.1` to `0.5`.

Note that the ability of a search space is highly connected with your tuner. We listed the supported types for each builtin tuner below. For a customized tuner, you don't have to follow our convention and you will have the flexibility to define any type you want.
Note that the available sampling strategies within a search space depend on the tuner you want to use. We list the supported types for each builtin tuner below. For a customized tuner, you don't have to follow our convention and you will have the flexibility to define any type you want.

## Types

All types of sampling strategies and their parameter are listed here:

* `{"_type": "choice", "_value": options}`

* Which means the variable's value is one of the options. Here `options` should be a list of numbers or a list of strings. Using arbitrary objects as members of this list (like sublists, a mixture of numbers and strings, or null values) should work in most cases, but may trigger undefined behaviors.
* `options` could also be a nested sub-search-space, this sub-search-space takes effect only when the corresponding element is chosen. The variables in this sub-search-space could be seen as conditional variables. Here is an simple [example of nested search space definition](https://github.com/microsoft/nni/tree/master/examples/trials/mnist-nested-search-space/search_space.json). If an element in the options list is a dict, it is a sub-search-space, and for our built-in tuners you have to add a key `_name` in this dict, which helps you to identify which element is chosen. Accordingly, here is a [sample](https://github.com/microsoft/nni/tree/master/examples/trials/mnist-nested-search-space/sample.json) which users can get from nni with nested search space definition. Tuners which support nested search space are as follows:

- Random Search
- TPE
- Anneal
- Evolution
* The variable's value is one of the options. Here `options` should be a list of numbers or a list of strings. Using arbitrary objects as members of this list (like sublists, a mixture of numbers and strings, or null values) should work in most cases, but may trigger undefined behaviors.
* `options` can also be a nested sub-search-space, this sub-search-space takes effect only when the corresponding element is chosen. The variables in this sub-search-space can be seen as conditional variables. Here is an simple [example of nested search space definition](https://github.com/microsoft/nni/tree/master/examples/trials/mnist-nested-search-space/search_space.json). If an element in the options list is a dict, it is a sub-search-space, and for our built-in tuners you have to add a `_name` key in this dict, which helps you to identify which element is chosen. Accordingly, here is a [sample](https://github.com/microsoft/nni/tree/master/examples/trials/mnist-nested-search-space/sample.json) which users can get from nni with nested search space definition. See the table below for the tuners which support nested search spaces.

* `{"_type": "randint", "_value": [lower, upper]}`
* Choosing a random integer from `lower` (inclusive) to `upper` (exclusive).
* Note: Different tuners may interpret `randint` differently. Some (e.g., TPE, GridSearch) treat integers from lower
to upper as unordered ones, while others respect the ordering (e.g., SMAC). If you want all the tuners to respect
* Choosing a random integer between `lower` (inclusive) and `upper` (exclusive).
* Note: Different tuners may interpret `randint` differently. Some (e.g., TPE, GridSearch) treat integers from lower
to upper as unordered ones, while others respect the ordering (e.g., SMAC). If you want all the tuners to respect
the ordering, please use `quniform` with `q=1`.

* `{"_type": "uniform", "_value": [low, high]}`
* Which means the variable value is a value uniformly between low and high.
* The variable value is uniformly sampled between low and high.
* When optimizing, this variable is constrained to a two-sided interval.

* `{"_type": "quniform", "_value": [low, high, q]}`
* Which means the variable value is a value like `clip(round(uniform(low, high) / q) * q, low, high)`, where the clip operation is used to constraint the generated value in the bound. For example, for `_value` specified as [0, 10, 2.5], possible values are [0, 2.5, 5.0, 7.5, 10.0]; For `_value` specified as [2, 10, 5], possible values are [2, 5, 10].
* Suitable for a discrete value with respect to which the objective is still somewhat "smooth", but which should be bounded both above and below. If you want to uniformly choose integer from a range [low, high], you can write `_value` like this: `[low, high, 1]`.
* The variable value is determined using `clip(round(uniform(low, high) / q) * q, low, high)`, where the clip operation is used to constrain the generated value within the bounds. For example, for `_value` specified as [0, 10, 2.5], possible values are [0, 2.5, 5.0, 7.5, 10.0]; For `_value` specified as [2, 10, 5], possible values are [2, 5, 10].
* Suitable for a discrete value with respect to which the objective is still somewhat "smooth", but which should be bounded both above and below. If you want to uniformly choose an integer from a range [low, high], you can write `_value` like this: `[low, high, 1]`.

* `{"_type": "loguniform", "_value": [low, high]}`
* Which means the variable value is a value drawn from a range [low, high] according to a loguniform distribution like exp(uniform(log(low), log(high))), so that the logarithm of the return value is uniformly distributed.
* The variable value is drawn from a range [low, high] according to a loguniform distribution like exp(uniform(log(low), log(high))), so that the logarithm of the return value is uniformly distributed.
* When optimizing, this variable is constrained to be positive.

* `{"_type": "qloguniform", "_value": [low, high, q]}`
* Which means the variable value is a value like `clip(round(loguniform(low, high) / q) * q, low, high)`, where the clip operation is used to constraint the generated value in the bound.
* The variable value is determined using `clip(round(loguniform(low, high) / q) * q, low, high)`, where the clip operation is used to constrain the generated value within the bounds.
* Suitable for a discrete variable with respect to which the objective is "smooth" and gets smoother with the size of the value, but which should be bounded both above and below.

* `{"_type": "normal", "_value": [mu, sigma]}`
* Which means the variable value is a real value that's normally-distributed with mean mu and standard deviation sigma. When optimizing, this is an unconstrained variable.
* The variable value is a real value that's normally-distributed with mean mu and standard deviation sigma. When optimizing, this is an unconstrained variable.

* `{"_type": "qnormal", "_value": [mu, sigma, q]}`
* Which means the variable value is a value like `round(normal(mu, sigma) / q) * q`
* The variable value is determined using `round(normal(mu, sigma) / q) * q`
* Suitable for a discrete variable that probably takes a value around mu, but is fundamentally unbounded.

* `{"_type": "lognormal", "_value": [mu, sigma]}`
* Which means the variable value is a value drawn according to `exp(normal(mu, sigma))` so that the logarithm of the return value is normally distributed. When optimizing, this variable is constrained to be positive.
* The variable value is drawn according to `exp(normal(mu, sigma))` so that the logarithm of the return value is normally distributed. When optimizing, this variable is constrained to be positive.

* `{"_type": "qlognormal", "_value": [mu, sigma, q]}`
* Which means the variable value is a value like `round(exp(normal(mu, sigma)) / q) * q`
* The variable value is determined using `round(exp(normal(mu, sigma)) / q) * q`
* Suitable for a discrete variable with respect to which the objective is smooth and gets smoother with the size of the variable, which is bounded from one side.


## Search Space Types Supported by Each Tuner

| | choice | randint | uniform | quniform | loguniform | qloguniform | normal | qnormal | lognormal | qlognormal |
|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|
| TPE Tuner | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Random Search Tuner| ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Anneal Tuner | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Evolution Tuner | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| SMAC Tuner | ✓ | ✓ | ✓ | ✓ | ✓ | | | | | |
| Batch Tuner | ✓ | | | | | | | | | |
| Grid Search Tuner | ✓ | ✓ | | ✓ | | | | | | |
| Hyperband Advisor | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Metis Tuner | ✓ | ✓ | ✓ | ✓ | | | | | | |
| GP Tuner | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | | | |

| | choice | choice(nested) | randint | uniform | quniform | loguniform | qloguniform | normal | qnormal | lognormal | qlognormal |
|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|:------:|
| TPE Tuner | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Random Search Tuner| ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Anneal Tuner | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Evolution Tuner | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| SMAC Tuner | ✓ | | ✓ | ✓ | ✓ | ✓ | | | | | |
| Batch Tuner | ✓ | | | | | | | | | | |
| Grid Search Tuner | ✓ | | ✓ | | ✓ | | | | | | |
| Hyperband Advisor | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Metis Tuner | ✓ | | ✓ | ✓ | ✓ | | | | | | |
| GP Tuner | ✓ | | ✓ | ✓ | ✓ | ✓ | ✓ | | | | |

Known Limitations:

* GP Tuner and Metis Tuner support only **numerical values** in search space (`choice` type values can be no-numeraical with other tuners, e.g. string values). Both GP Tuner and Metis Tuner use Gaussian Process Regressor(GPR). GPR make predictions based on a kernel function and the 'distance' between different points, it's hard to get the true distance between no-numerical values.
* GP Tuner and Metis Tuner support only **numerical values** in search space (`choice` type values can be no-numerical with other tuners, e.g. string values). Both GP Tuner and Metis Tuner use Gaussian Process Regressor(GPR). GPR make predictions based on a kernel function and the 'distance' between different points, it's hard to get the true distance between no-numerical values.

* Note that for nested search space:

* Only Random Search/TPE/Anneal/Evolution tuner supports nested search space

* We do not support nested search space "Hyper Parameter" in visualization now, the enhancement is being considered in [#1110](https://github.com/microsoft/nni/issues/1110), any suggestions or discussions or contributions are warmly welcomed