diff --git a/README.md b/README.md
index a1494d001c7..14b609e489b 100644
--- a/README.md
+++ b/README.md
@@ -17,310 +17,214 @@ Katib supports
 Katib is the project which is agnostic to machine learning (ML) frameworks.
 It can tune hyperparameters of applications written in any language of the
-users’ choice and natively supports many ML frameworks, such as TensorFlow,
-MXNet, PyTorch, XGBoost, and others.
+users’ choice and natively supports many ML frameworks, such as
+[TensorFlow](https://www.tensorflow.org/), [Apache MXNet](https://mxnet.apache.org/),
+[PyTorch](https://pytorch.org/), [XGBoost](https://xgboost.readthedocs.io/en/latest/), and others.

-## Getting Started
-
-Follow the
-[getting-started guide](https://www.kubeflow.org/docs/components/katib/hyperparameter/)
-on the Kubeflow website.
-
-## Name
+Katib can perform training jobs using any Kubernetes
+[Custom Resources](https://www.kubeflow.org/docs/components/katib/trial-template/)
+with out of the box support for [Kubeflow Training Operators](https://github.com/kubeflow/tf-operator),
+[Argo Workflows](https://github.com/argoproj/argo-workflows), [Tekton Pipelines](https://github.com/tektoncd/pipeline)
+and many more.

 Katib stands for `secretary` in Arabic.

-## Concepts in Katib
-
-For a detailed description of the concepts in Katib and AutoML, check the
-[Kubeflow documentation](https://www.kubeflow.org/docs/components/katib/overview/).
-
-Katib has the concepts of `Experiment`, `Suggestion`, `Trial` and `Worker Job`.
-
-### Experiment
-
-An `Experiment` represents a single optimization run over a feasible space.
-Each `Experiment` contains a configuration:
-
-1. **Objective**: What you want to optimize.
-2. **Search Space**: Constraints for configurations describing the feasible space.
-3. **Search Algorithm**: How to find the optimal configurations.
-
-Katib `Experiment` is defined as a CRD. Check the detailed guide to
-[configuring and running a Katib `Experiment`](https://kubeflow.org/docs/components/katib/experiment/)
-in the Kubeflow docs.
-
-### Suggestion
-
-A `Suggestion` is a set of hyperparameter values that the hyperparameter tuning
-process has proposed. Katib creates a `Trial` to evaluate
-the suggested set of values.
-
-Katib `Suggestion` is defined as a CRD.
-
-### Trial
-
-A `Trial` is one iteration of the hyperparameter tuning process.
-A `Trial` corresponds to one worker job instance with a list of parameter
-assignments. The list of parameter assignments corresponds to a `Suggestion`.
-
-Each `Experiment` runs several `Trials`. The `Experiment` runs the `Trials` until
-it reaches either the objective or the configured maximum number of `Trials`.
-
-Katib `Trial` is defined as a CRD.
-
-### Worker Job
-
-The `Worker Job` is the process that runs to evaluate a `Trial` and calculate
-its objective value.
-
-The `Worker Job` can be any type of Kubernetes resource or
-[Kubernetes CRD](https://kubernetes.io/docs/concepts/extend-kubernetes/api-extension/custom-resources/).
-Follow the [`Trial` template guide](https://www.kubeflow.org/docs/components/katib/trial-template/#custom-resource)
-to support your own Kubernetes resource in Katib.
-
-Katib has these CRD examples in upstream:
-
-- [Kubernetes `Job`](https://kubernetes.io/docs/concepts/workloads/controllers/job/)
-
-- [Kubeflow `TFJob`](https://www.kubeflow.org/docs/components/training/tftraining/)
-
-- [Kubeflow `PyTorchJob`](https://www.kubeflow.org/docs/components/training/pytorch/)
-
-- [Kubeflow `MPIJob`](https://www.kubeflow.org/docs/components/training/mpi/)
-
-- [Kubeflow `XGBoostJob`](https://github.com/kubeflow/xgboost-operator)
-
-- [Tekton `Pipelines`](./examples/v1beta1/tekton)
-
-- [Argo `Workflows`](./examples/v1beta1/argo)
+# Search Algorithms

-Thus, Katib supports multiple frameworks with the help of different job kinds.
-
-### Search Algorithms
-
-Katib currently supports several search algorithms. Follow the
+Katib supports several search algorithms. Follow the
 [Kubeflow documentation](https://www.kubeflow.org/docs/components/katib/experiment/#search-algorithms-in-detail)
-to know more about each algorithm.
-
-#### Hyperparameter Tuning
-
-- [Random Search](https://en.wikipedia.org/wiki/Hyperparameter_optimization#Random_search)
-- [Tree of Parzen Estimators (TPE)](https://papers.nips.cc/paper/4443-algorithms-for-hyper-parameter-optimization.pdf)
-- [Multivariate TPE](https://tech.preferred.jp/en/blog/multivariate-tpe-makes-optuna-even-more-powerful/)
-- [Grid Search](https://en.wikipedia.org/wiki/Hyperparameter_optimization#Grid_search)
-- [Hyperband](https://arxiv.org/pdf/1603.06560.pdf)
-- [Bayesian Optimization](https://arxiv.org/pdf/1012.2599.pdf)
-- [Covariance Matrix Adaptation Evolution Strategy (CMA-ES)](https://arxiv.org/abs/1604.00772)
-- [Sobol's Quasirandom Sequence](https://dl.acm.org/doi/10.1145/641876.641879)
-
-#### Neural Architecture Search
-
-- [Efficient Neural Architecture Search (ENAS)](https://github.com/kubeflow/katib/tree/master/pkg/suggestion/v1beta1/nas/enas)
-- [Differentiable Architecture Search (DARTS)](https://github.com/kubeflow/katib/tree/master/pkg/suggestion/v1beta1/nas/darts)
-
-## Components in Katib
-
-Katib consists of several components as shown below. Each component is running
-on Kubernetes as a deployment. Each component communicates with others via GRPC
-and the API is defined at `pkg/apis/manager/v1beta1/api.proto`.
-
-- Katib main components:
-  - `katib-db-manager` - the GRPC API server of Katib which is the DB Interface.
-  - `katib-mysql` - the data storage backend of Katib using mysql.
-  - `katib-ui` - the user interface of Katib.
-  - `katib-controller` - the controller for the Katib CRDs in Kubernetes.
-
-## Web UI
-
-Katib provides a Web UI.
-During 1.3 we've worked on a new iteration of the UI, which is rewritten in
-Angular and is utilizing the common code of the other Kubeflow [dashboards](https://github.com/kubeflow/kubeflow/tree/master/components/crud-web-apps).
-
-The users are currently able to list, delete and create Experiments in their
-cluster via this new UI as well as inspect the owned Trials. One important
-missing functionalities are the ability to edit the Trial templates ConfigMaps
-and view Neural Architecture Search models. Check [this Project](https://github.com/kubeflow/katib/projects/1)
-to monitor the current progress.
-
-![katibui](./docs/images/katib-ui.png)
-
-To use the old Katib UI you can update the Katib image `newName` with the previous
-image tag `docker.io/kubeflowkatib/katib-ui:v0.11.1` in the [Kustomize](./manifests/v1beta1/installs/katib-standalone/kustomization.yaml#L29)
-manifests.
-
-## GRPC API documentation
-
-Check the [Katib v1beta1 API reference docs](https://www.kubeflow.org/docs/reference/katib/v1beta1/katib/).
-
-## Installation
-
-For standard installation of Katib with support for all job operators,
-install Kubeflow.
-Follow the documentation:
-
-- [Kubeflow installation guide](https://www.kubeflow.org/docs/started/getting-started/)
-- [Kubeflow Katib guides](https://www.kubeflow.org/docs/components/katib/).
-
-If you install Katib with other Kubeflow components,
-you can't submit Katib jobs in Kubeflow namespace. Check the
-[Kubeflow documentation](https://www.kubeflow.org/docs/components/katib/hyperparameter/#example-using-random-algorithm)
-to know more about it.
-
-Alternatively, if you want to install Katib manually with TF and PyTorch
-operators support, follow these steps:
-
-Create Kubeflow namespace:
-
-```
-kubectl create namespace kubeflow
+to know more about each algorithm and check the
+[Suggestion service guide](/docs/new-algorithm-service.md) to implement your
+custom algorithm.
+
+| Hyperparameter Tuning | Neural Architecture Search | Early Stopping |
+| --- | --- | --- |
+| Random Search | ENAS | Median Stop |
+| Grid Search | DARTS | |
+| Bayesian Optimization | | |
+| TPE | | |
+| Multivariate TPE | | |
+| CMA-ES | | |
+| Sobol's Quasirandom Sequence | | |
+| HyperBand | | |