Modify guides

kubeflow · Nov 11, 2020 · 66842ed
1 parent dac91bd
commit 66842ed

Showing 6 changed files with 160 additions and 65 deletions.
diff --git a/content/en/docs/components/katib/early-stopping.md b/content/en/docs/components/katib/early-stopping.md
@@ -1,74 +1,75 @@
 +++
 title = "Using Early Stopping"
-description = "How to use early stopping in Katib experiments"
+description = "How to use early stopping in your Katib experiments"
 weight = 60
 
 +++
 
-This page shows how you can use
+This guide shows how you can use
 [early stopping](https://en.wikipedia.org/wiki/Early_stopping) to improve your
-Katib experiments.
-Early stopping allows you to avoid overfitting when you train your model
-during Katib experiments.
-It helps you to save computing resources and experiment execution time by
-stopping the experiment's trials before the training process is complete.
+Katib experiments. Early stopping allows you to avoid overfitting when you
+train your model during Katib experiments. It helps you to save computing
+resources and experiment execution time by stopping the experiment's trials
+before the training process is complete.
 
 The major advantage of using early stopping in Katib is that you don't
 need to modify your
-[training container package](/docs/components/hyperparameter-tuning/experiment/#packaging-your-training-code-in-a-container-image).
+[training container package](/docs/components/katib/experiment/#packaging-your-training-code-in-a-container-image).
 All you have to do is to change your experiment YAML file.
 
 Early stopping works in the same way as Katib's
-[metrics collector](http://localhost:1313/docs/components/hyperparameter-tuning/experiment/#metrics-collector).
-It analyses required metrics from `stdout` or from the arbitrary output file and
-an early stopping algorithm makes the decision if the trial needs to be stopped.
-Currently, early stopping works only with `StdOut` or `File` metrics collectors.
+[metrics collector](/docs/components/katib/experiment/#metrics-collector).
+It analyses the required metrics from `stdout` or from an arbitrary output file,
+and an early stopping algorithm decides whether the trial needs to be
+stopped. Currently, early stopping works only with the
+`StdOut` or `File` metrics collectors.
 
 **Note**: Your training container must print training logs with the timestamp,
 because early stopping algorithms need to know the sequence of reported metrics.
-See the
-[example](https://github.com/kubeflow/katib/blob/master/examples/v1beta1/mxnet-mnist/mnist.py#L36)
+Check the
+[`MXNet` example](https://github.com/kubeflow/katib/blob/master/examples/v1beta1/mxnet-mnist/mnist.py#L36)
 to see how to add a date format to your logs.
 
 ## Configure the experiment with early stopping
 
 As a reference, you can use the YAML file of the
 [early stopping example](https://github.com/kubeflow/katib/blob/master/examples/v1beta1/early-stopping/median-stop.yaml).
 
-First of all, follow the [guide](/docs/components/hyperparameter-tuning/experiment/#configuring-the-experiment)
+First of all, follow the
+[guide](/docs/components/katib/experiment/#configuring-the-experiment)
 to configure your Katib experiment.
-To apply early stopping on your experiment, specify `.spec.earlyStopping`
-parameter, similar to `.spec.algorithm`. See the
+To apply early stopping for your experiment, specify the `.spec.earlyStopping`
+parameter, similar to `.spec.algorithm`. Refer to the
 [`EarlyStoppingSpec` type](https://github.com/kubeflow/katib/blob/master/pkg/apis/controller/common/v1beta1/common_types.go#L41-L58)
 
-- `.earlyStopping.algorithmName` - is the name of the early stopping algorithm.
+- `.earlyStopping.algorithmName` - the name of the early stopping algorithm.
 
-- `.earlyStopping.algorithmSettings`- is the settings for the early stopping algorithm.
+- `.earlyStopping.algorithmSettings` - the settings for the early stopping algorithm.
 
 Experiment's suggestion produces new trials. After that, the early stopping
 algorithm generates early stopping rules for the created trials.
 Once the trial reaches all the rules, it is stopped and the trial status is
-transferred to `EarlyStopped`.
-After that, Katib calls the suggestion again to ask for the new trials.
+changed to `EarlyStopped`. Then, Katib calls the suggestion again to
+ask for new trials.
 
-Read more about Katib concepts in the
-[overview guide](/docs/components/hyperparameter-tuning/overview/#katib-concepts).
+Learn more about Katib concepts
+in the [overview guide](/docs/components/katib/overview/#katib-concepts).
 
 Follow the
-[Katib configuration guide](/docs/components/hyperparameter-tuning/katib-config/#early-stopping-settings)
-to see how you can specify your own image for the early stopping algorithm.
+[Katib configuration guide](/docs/components/katib/katib-config/#early-stopping-settings)
+to specify your own image for the early stopping algorithm.
 
 ### Early stopping algorithms in detail
 
-Katib currently supports one early stopping algorithm.
 Here’s a list of the early stopping algorithms available in Katib.
 The links lead to descriptions on this page:
 
 - [Median Stopping Rule](#median-stopping-rule)
 
 More algorithms are under development. You can add an early stopping algorithm
-to Katib yourself. See the
-[developer guide](https://github.com/kubeflow/katib/blob/master/docs/developer-guide.md) to contribute.
+to Katib yourself. Check the
+[developer guide](https://github.com/kubeflow/katib/blob/master/docs/developer-guide.md)
+to contribute.
 
 <a id="median-stopping-rule"></a>
 
@@ -96,12 +97,12 @@ Katib supports the following early stopping settings:
     <tbody>
       <tr>
         <td>min_trials_required</td>
-        <td>Minimal number of complete trials to compute median value</td>
+        <td>Minimum number of successful trials required to compute the median value</td>
         <td>3</td>
       </tr>
       <tr>
         <td>start_step</td>
-        <td>Number of reported intermediate results before stopping the trials</td>
+        <td>Number of intermediate results that must be reported before the trial can be stopped</td>
         <td>4</td>
       </tr>
     </tbody>
@@ -110,12 +111,11 @@ Katib supports the following early stopping settings:
 
 ### Submit an early stopping experiment from the UI
 
-You can use Katib UI to submit an early stopping experiment.
-Follow
-[these steps](/docs/components/hyperparameter-tuning/experiment/#running-the-experiment-from-the-katib-ui)
-to create the experiment from the UI.
+You can use Katib UI to submit an early stopping experiment. Follow
+[these steps](/docs/components/katib/experiment/#running-the-experiment-from-the-katib-ui)
+to create an experiment from the UI.
 
-Once you reach early stopping section, select the appropriate values:
+Once you reach the early stopping section, select the appropriate values:
 
 <img src="/docs/images/katib/katib-early-stopping-parameter.png"
   alt="UI form to deploy an early stopping Katib experiment"
@@ -126,7 +126,7 @@ Once you reach early stopping section, select the appropriate values:
 You have to install [jq](https://stedolan.github.io/jq/download/)
 to run the commands below.
 
-Check early stopped trials in your experiment:
+Check the early stopped trials in your experiment:
 
 ```shell
 kubectl get experiment <experiment-name>  -n <experiment-namespace> -o json | jq -r ".status"
@@ -168,31 +168,37 @@ If you check status for the early stopped trial:
 kubectl get trial median-stop-2ml8h96d -n <experiment-namespace>
 ```
 
-You see the `EarlyStopped` status for the trial:
+You should see the `EarlyStopped` status for the trial:
 
 ```shell
 NAME                   TYPE           STATUS   AGE
 median-stop-2ml8h96d   EarlyStopped   True     15m
 ```
 
-As well, you can see results on the Katib UI.
-Check trial statuses on the experiment monitor page:
+You can also check the results in the Katib UI.
+The trial statuses on the experiment monitor page look as follows:
 
 <img src="/docs/images/katib/katib-early-stopping-trials.png"
   alt="UI form to view trials"
   class="mt-3 mb-3 border border-info rounded">
 
-If you click on the early stopped trial name, you see reported metrics before trial
-is early stopped:
+You can click on the early stopped trial name to see the metrics reported before
+this trial was early stopped:
 
 <img src="/docs/images/katib/katib-early-stopping-trial-info.png"
   alt="UI form to view trial info"
   class="mt-3 mb-3 border border-info rounded">
 
 ## Next steps
 
-- TODO: Add link to resume Experiment
+- Learn how to
+  [configure and run your Katib experiments](/docs/components/katib/experiment/).
 
-- Read about [Katib Configuration (Katib config)](/docs/components/katib/katib-config/).
+- How to
+  [restart your experiment and use the resume policies](/docs/components/katib/resume-experiment/).
 
-- How to [set up environment variables](/docs/components/katib/env-variables/) for each Katib component.
+- Check the
+  [Katib Configuration (Katib config)](/docs/components/katib/katib-config/).
+
+- How to [set up environment variables](/docs/components/katib/env-variables/)
+  for each Katib component.
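
Review note on the `min_trials_required` and `start_step` settings above: the sketch below is one generic reading of the median stopping rule, written only to illustrate how the two settings gate the decision. It is not Katib's implementation. The function name, the `maximize` flag, and the use of running averages are assumptions made for this example; the defaults 3 and 4 are taken from the settings table.

```python
from statistics import median


def should_stop_early(trial_values, completed_trials,
                      min_trials_required=3, start_step=4, maximize=True):
    """Illustrative median stopping check (hypothetical helper, not Katib code).

    trial_values: metrics reported so far by the running trial, in order.
    completed_trials: non-empty metric lists, one per successful trial.
    """
    step = len(trial_values)

    # Too few intermediate results reported: never stop before start_step.
    if step < start_step:
        return False

    # Too few successful trials to compute a meaningful median.
    if len(completed_trials) < min_trials_required:
        return False

    # Running average of each successful trial over its first `step` reports.
    running_averages = [
        sum(values[:step]) / len(values[:step]) for values in completed_trials
    ]
    threshold = median(running_averages)

    # Compare the running trial's best value so far against the median.
    best_so_far = max(trial_values) if maximize else min(trial_values)
    if maximize:
        return best_so_far < threshold
    return best_so_far > threshold
```

With three successful trials and four reported values, a trial whose best value is below the median of the running averages would be flagged for stopping; with fewer reports or fewer successful trials the function always returns `False`.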
diff --git a/content/en/docs/components/katib/experiment.md b/content/en/docs/components/katib/experiment.md
@@ -1,5 +1,5 @@
 +++
-title = "Running an experiment"
+title = "Running an Experiment"
 description = "How to configure and run a hyperparameter tuning or neural architecture search experiment in Katib"
 weight = 30
 
@@ -815,16 +815,11 @@ View the results of the experiment in the Katib UI:
   neural architecture search, check the
   [introduction to Katib](/docs/components/katib/overview/).
 
-<<<<<<< HEAD:content/en/docs/components/katib/experiment.md
+- Boost your hyperparameter tuning experiment by following
+  the [early stopping guide](/docs/components/katib/early-stopping/).
+
 - Check the
   [Katib Configuration (Katib config)](/docs/components/katib/katib-config/).
-=======
-* Follow the [early stopping guide](/docs/components/hyperparameter-tuning/early-stopping/)
-  to see how you can boost your hyperparameter tunning experiments.
-
-* For a detailed instruction of the Katib Configuration file, 
-  read the [Katib config page](/docs/components/hyperparameter-tuning/katib-config/).
->>>>>>> Add early stopping doc:content/en/docs/components/hyperparameter-tuning/experiment.md
 
 - How to [set up environment variables](/docs/components/katib/env-variables/)
   for each Katib component.
diff --git a/content/en/docs/components/katib/hyperparameter.md b/content/en/docs/components/katib/hyperparameter.md
@@ -1,5 +1,5 @@
 +++
-title = "Getting started with Katib"
+title = "Getting Started with Katib"
 description = "How to set up Katib and perform hyperparameter tuning"
 weight = 20
 

diff --git a/content/en/docs/components/katib/katib-config.md b/content/en/docs/components/katib/katib-config.md
@@ -1,7 +1,7 @@
 +++
 title = "Katib Configuration Overview"
 description = "How to make changes in Katib configuration"
-weight = 90
+weight = 70
 
 +++
 
@@ -10,8 +10,17 @@ This guide describes
 the Kubernetes
 [Config Map](https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/) that contains information about:
 
-1. Current [metrics collectors](/docs/components/katib/experiment/#metrics-collector) (`key = metrics-collector-sidecar`).
-1. Current [algorithms](/docs/components/katib/experiment/#search-algorithms-in-detail) (suggestions) (`key = suggestion`).
+1. Current
+   [metrics collectors](/docs/components/katib/experiment/#metrics-collector)
+   (`key = metrics-collector-sidecar`).
+
+1. Current
+   [algorithms](/docs/components/katib/experiment/#search-algorithms-in-detail)
+   (suggestions) (`key = suggestion`).
+
+1. Current
+   [early stopping algorithms](/docs/components/katib/early-stopping/#early-stopping-algorithms-in-detail)
+   (`key = early-stopping`).
 
 The Katib Config Map must be deployed in the
 [`KATIB_CORE_NAMESPACE`](/docs/components/katib/env-variables/#katib-controller)
@@ -119,16 +128,16 @@ suggestion: |-
 }
 ```
 
-All of these settings except **`image`** can be omitted. If you don't specify any other settings,
-a default value is set automatically.
+All of these settings except **`image`** can be omitted. If you don't specify
+any other settings, a default value is set automatically.
 
 1. `image` - a Docker image for the suggestion's container with a `random`
    algorithm (**must be specified**).
 
    Image example: `docker.io/kubeflowkatib/<suggestion-name>`
 
    For each algorithm (suggestion) you can specify one of the following
-   suggestion names in Docker image:
+   suggestion names in the Docker image:
 
    <div class="table-responsive">
      <table class="table table-bordered">
@@ -216,3 +225,79 @@ a default value is set automatically.
    in which case, the pod uses the
    [default](https://kubernetes.io/docs/tasks/configure-pod-container/configure-service-account/#use-the-default-service-account-to-access-the-api-server)
    service account.
+
+   **Note:** If you want to run your experiments with
+   [early stopping](/docs/components/katib/early-stopping/),
+   the suggestion's deployment must have permission to update the experiment's
+   trial status. If you don't specify a service account in the Katib config,
+   the Katib controller creates the required
+   [Kubernetes role-based access control](https://kubernetes.io/docs/reference/access-authn-authz/rbac)
+   (RBAC) resources for the suggestion.
+
+   If you need your own service account for the experiment's
+   suggestion with early stopping, you have to follow these rules:
+
+   - The service account name can't be equal to
+     `<experiment-name>-<experiment-algorithm>`.
+
+   - The service account must have sufficient permissions to update
+     the experiment's trial status.
+
+## Early stopping settings
+
+These settings are related to Katib early stopping, where:
+
+- key: `early-stopping`
+- value: corresponding JSON settings for each early stopping algorithm name
+
+If you want to use a new early stopping algorithm, you need to update the
+Katib config. For example, using a `medianstop` early stopping algorithm with
+all settings looks as follows:
+
+```json
+early-stopping: |-
+{
+  "medianstop": {
+    "image": "docker.io/kubeflowkatib/earlystopping-medianstop",
+    "imagePullPolicy": "Always"
+  },
+  ...
+}
+```
+
+All of these settings except **`image`** can be omitted. If you don't specify
+any other settings, a default value is set automatically.
+
+1. `image` - a Docker image for the early stopping's container with a
+   `medianstop` algorithm (**must be specified**).
+
+   Image example: `docker.io/kubeflowkatib/<early-stopping-name>`
+
+   For each early stopping algorithm you can specify one of the following
+   early stopping names in the Docker image:
+
+   <div class="table-responsive">
+     <table class="table table-bordered">
+       <thead class="thead-light">
+         <tr>
+           <th>Early stopping name</th>
+           <th>Early stopping algorithm</th>
+           <th>Description</th>
+         </tr>
+       </thead>
+       <tbody>
+         <tr>
+           <td><code>earlystopping-medianstop</code></td>
+           <td><code>medianstop</code></td>
+           <td><a href="https://github.com/kubeflow/katib/tree/master/pkg/earlystopping/v1beta1/medianstop">Katib
+             Median Stopping</a> implementation</td>
+         </tr>
+       </tbody>
+     </table>
+   </div>
+
+1. `imagePullPolicy` - an
+   [image pull policy](https://kubernetes.io/docs/concepts/configuration/overview/#container-images)
+   for the early stopping's container with a `medianstop` algorithm.
+
+   The default value is `IfNotPresent`.
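
Review note on the `early-stopping` settings above: the sketch below shows how the documented rules could be applied to the JSON value of that key, namely that `image` must be specified and `imagePullPolicy` falls back to `IfNotPresent`. The function name and error messages are hypothetical and this is not how the Katib controller reads its config; it only restates the rules from the added text in executable form.

```python
import json

# Default stated in the documentation added by this commit.
DEFAULT_IMAGE_PULL_POLICY = "IfNotPresent"


def resolve_early_stopping_settings(config_value, algorithm_name):
    """Return the effective settings for one early stopping algorithm.

    config_value: JSON string stored under the `early-stopping` key.
    algorithm_name: for example "medianstop".
    (Hypothetical helper for illustration only.)
    """
    algorithms = json.loads(config_value)

    if algorithm_name not in algorithms:
        raise KeyError(f"no early stopping settings for {algorithm_name!r}")

    settings = algorithms[algorithm_name]

    # `image` is the only setting that cannot be omitted.
    if not settings.get("image"):
        raise ValueError(f"'image' must be specified for {algorithm_name!r}")

    return {
        "image": settings["image"],
        # Omitted settings fall back to their default value.
        "imagePullPolicy": settings.get(
            "imagePullPolicy", DEFAULT_IMAGE_PULL_POLICY
        ),
    }
```

Passing `{"medianstop": {"image": "docker.io/kubeflowkatib/earlystopping-medianstop"}}` would return that image together with the `IfNotPresent` pull policy, while an entry without `image` raises a `ValueError`.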
diff --git a/content/en/docs/components/katib/overview.md b/content/en/docs/components/katib/overview.md
@@ -11,15 +11,19 @@ weight = 10
 This guide introduces the concepts of hyperparameter tuning, neural
 architecture search, and the Katib system as a component of Kubeflow.
 
-Katib is a Kubernetes-native project for automated machine learning (AutoML) —
-it's a system for hyperparameter tuning and neural architecture search (NAS).
-Katib supports a number of machine learning frameworks, including
-TensorFlow, MXNet, PyTorch, XGBoost, and others.
+Katib is a Kubernetes-native project for automated machine learning (AutoML).
+Katib supports hyperparameter tuning, early stopping and
+neural architecture search (NAS).
 Learn more about AutoML at [fast.ai](https://www.fast.ai/2018/07/16/auto-ml2/),
 [Google Cloud](https://cloud.google.com/automl),
 [Microsoft Azure](https://docs.microsoft.com/en-us/azure/machine-learning/concept-automated-ml#automl-in-azure-machine-learning) or
 [Amazon SageMaker](https://aws.amazon.com/blogs/aws/amazon-sagemaker-autopilot-fully-managed-automatic-machine-learning/).
 
+Katib is agnostic to machine learning (ML) frameworks.
+It can tune hyperparameters of applications written in any language
+of the user's choice and natively supports many ML frameworks,
+such as TensorFlow, MXNet, PyTorch, XGBoost, and others.
+
 Katib supports a variety of AutoML algorithms, such as
 [Bayesian optimization](https://arxiv.org/pdf/1012.2599.pdf),
 [Tree of Parzen Estimators](https://papers.nips.cc/paper/2011/file/86e8f7ab32cfd12577bc2619bc635690-Paper.pdf),
@@ -75,6 +79,11 @@ hyperparameter tuning job (_experiment_). Each trial tests a different set of
 hyperparameter configurations. At the end of the experiment, Katib outputs
 the optimized values for the hyperparameters.
 
+You can improve your hyperparameter tuning experiments by using
+[early stopping](https://en.wikipedia.org/wiki/Early_stopping) techniques.
+Follow the [early stopping guide](/docs/components/katib/early-stopping/)
+for the details.
+
 ## Neural architecture search
 
 {{% alert title="Alpha version" color="warning" %}}

diff --git a/content/en/docs/components/katib/trial-template.md b/content/en/docs/components/katib/trial-template.md
@@ -1,5 +1,5 @@
 +++
-title = "Overview of trial templates"
+title = "Overview of Trial Templates"
 description = "How to specify trial template parameters and support a custom resource (CRD) in Katib"
 weight = 40