From a04de33b0c72aa10df4a4e5ad8cc46c6b52dc89a Mon Sep 17 00:00:00 2001
From: avelichk
Date: Wed, 11 Nov 2020 13:55:25 +0000
Subject: [PATCH] Point omitted

---
 .../en/docs/components/katib/experiment.md | 38 ++-----------------
 1 file changed, 4 insertions(+), 34 deletions(-)

diff --git a/content/en/docs/components/katib/experiment.md b/content/en/docs/components/katib/experiment.md
index e464dd1060..3d84b0609b 100644
--- a/content/en/docs/components/katib/experiment.md
+++ b/content/en/docs/components/katib/experiment.md
@@ -75,48 +75,18 @@ These are the fields in the experiment configuration spec:
   Refer to the [`ObjectiveSpec` type](https://github.com/kubeflow/katib/blob/master/pkg/apis/controller/common/v1beta1/common_types.go#L93).

-* **algorithm**: The search algorithm that you want Katib to use to find the
-  best hyperparameters or neural architecture configuration. Examples include
-  random search, grid search, Bayesian optimization, and more.
-  See the [search algorithm details](#search-algorithms) below.
-
-* **trialTemplate**: The template that defines the trial.
-  You must package your ML training code into a Docker image, as described
-  [above](#docker-image). You must configure the model's
-  hyperparameters either as command-line arguments or as environment variables,
-  so that Katib can automatically set the values in each trial.
-
-  You can use one of the following job types to train your model:
-
-  - [Kubernetes Job](https://kubernetes.io/docs/concepts/workloads/controllers/jobs-run-to-completion/)
-    (does not support distributed execution).
-  - [Kubeflow TFJob](/docs/guides/components/tftraining/) (supports
-    distributed execution).
-  - [Kubeflow PyTorchJob](/docs/guides/components/pytorch/) (supports
-    distributed execution).
-
-  See the [`TrialTemplate`
-  type](https://github.com/kubeflow/katib/blob/master/pkg/apis/controller/experiments/v1alpha3/experiment_types.go#L189-L203).
-  The template
-  uses the [Go template format](https://golang.org/pkg/text/template/).
-
-  You can define the job in raw string format or you can use a
-  [ConfigMap](https://kubernetes.io/docs/tasks/configure-pod-container/configure-pod-configmap/).
-  [Here](https://github.com/kubeflow/katib/blob/master/manifests/v1alpha3/katib-controller/trialTemplateConfigmapLabeled.yaml) is an example how to create ConfigMap with trial templates.
-
-* **parallelTrialCount**: The maximum number of hyperparameter sets that Katib
+- **parallelTrialCount**: The maximum number of hyperparameter sets that Katib
   should train in parallel. The default value is 3.

 - **maxTrialCount**: The maximum number of trials to run.
   This is equivalent to the number of hyperparameter sets that Katib should
   generate to test the model. If the `maxTrialCount` value is **omitted**, your
   experiment is running until the objective goal is reached or the experiment
   reaches a maximum number of failed trials.

 - **maxFailedTrialCount**: The maximum number of failed trials before Katib
   should stop the experiment. This is equivalent to the number of failed
   hyperparameter sets that Katib should test.
   If the number of failed trials exceeds `maxFailedTrialCount`, Katib stops the
   experiment with a status of `Failed`.