Merge pull request #116 from roytman/kfpCluster2
installation scripts and instructions for a real K8s installation
roytman authored May 13, 2024
2 parents 4ca66fb + 0c2d1eb commit f5abca2
Showing 6 changed files with 75 additions and 28 deletions.
58 changes: 41 additions & 17 deletions kfp/doc/simple_transform_pipeline.md
@@ -203,19 +203,19 @@ To compile pipeline execute `make build` command in the same directory where you

## Preparing cluster for pipeline execution

The project provides instructions and deployment automation to run all components in an all-inclusive fashion on a
single machine using a Kind cluster. However, this topology is not suitable for processing medium and large datasets,
and deployment should be carried out on a real Kubernetes or OpenShift cluster. Therefore, we recommend using a Kind
cluster only for local testing and debugging, not production loads. For production loads use a real Kubernetes cluster.

### Preparing Kind cluster

You can create a Kind cluster with all required software installed
using the following command:

````shell
make setup
````
**Note** that this command has to be run from the project's `kind` subdirectory.

We tested Kind cluster installation on multiple platforms, including Intel Mac,
ARM Mac (see [this](deployment_on_MacOS.md)), Windows,
@@ -224,23 +224,47 @@ for additional RHEL configurations) and Ubuntu. Additional platforms can be used, but may
require additional configuration and testing.

### Preparing an existing Kubernetes cluster
Alternatively, you can deploy the pipeline to an existing Kubernetes cluster.
#### Pre-requirements

Deployment on an existing cluster requires less pre-installed software. Only the following programs should be manually installed:

- [Helm](https://helm.sh/docs/intro/install/) 3.10.0 or greater must be installed and configured on your machine.
- [Kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) 1.26 or newer must be installed on your machine, and be
able to connect to the external cluster. For OpenShift clusters, the OpenShift CLI
[oc](https://docs.openshift.com/container-platform/4.15/cli_reference/openshift_cli/getting-started-cli.html) can be used instead.
- [wget](https://www.gnu.org/software/wget/) 1.21 must be installed on your machine.
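As a quick sanity check, the prerequisites above can be verified with a short shell loop (this is an illustrative sketch, not part of the project's scripts; it only checks that the tools are on `PATH`, not their versions):

```shell
# Fail fast if any of the required CLI tools is missing from PATH.
missing=0
for tool in helm kubectl wget; do
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "missing prerequisite: $tool"
    missing=1
  fi
done
[ "$missing" -eq 0 ] && echo "all prerequisites found"
```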

#### Installation steps

In order to execute data transformers on the remote cluster, the following packages should be installed on it:

- [KubeFlow Pipelines](https://www.kubeflow.org/docs/components/pipelines/v1/introduction/) (KFP). Currently, we use
upstream Argo-based KFP v1.
- [KubeRay](https://docs.ray.io/en/latest/cluster/kubernetes/index.html) controller and
[KubeRay API Server](https://ray-project.github.io/kuberay/components/apiserver/)

You can install the software from their repositories, or you can use our installation scripts.

If your local kubectl is configured to connect to the external cluster, do the following:
```bash
export EXTERNAL_CLUSTER=1
make setup
```

- In addition, you should configure external access to the KFP UI (`svc/ml-pipeline-ui` in the `kubeflow` ns) and the Ray
Server API (`svc/kuberay-apiserver-service` in the `kuberay` ns). Depending on your cluster and its deployment, these can be
LoadBalancer services, Ingresses or Routes.

- Optionally, you can upload the test data into the [MinIO](https://min.io/) Object Store, deployed as part of KFP. To
do this, provide external access to MinIO (`svc/minio-service` in the `kubeflow` ns) and execute the
following commands:
```bash
export MINIO_SERVER=<Minio external URL>
kubectl apply -f kind/hack/s3_secret.yaml
kind/hack/populate_minio.sh
```
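The exact mechanism for the external access mentioned above depends on the cluster. Purely as an illustration, an Ingress for the KFP UI might look like the sketch below; the hostname and the absence of an ingress class or TLS are assumptions, only the service name (`ml-pipeline-ui`), namespace, and port 80 come from the text above:

```yaml
# Illustrative Ingress sketch for the KFP UI; adapt host, ingress
# class, and TLS to your cluster before applying.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: ml-pipeline-ui
  namespace: kubeflow
spec:
  rules:
    - host: kfp.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: ml-pipeline-ui
                port:
                  number: 80
```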

## Deploying workflow

14 changes: 14 additions & 0 deletions kind/Makefile
@@ -13,14 +13,17 @@ include ../.make.defaults

export TOOLS_DIR=${ROOT_DIR}/hack/tools

export EXTERNAL_CLUSTER ?= 0
export DEPLOY_KUBEFLOW ?= 1
export USE_KFP_MINIO ?= 1
export POPULATE_TEST_DATA ?= 1
KIND_CLUSTER_NAME ?= dataprep

setup::
ifneq ($(EXTERNAL_CLUSTER), 1)
@# Help: Building kind cluster with everything installed
$(MAKE) .create-kind-cluster
endif
$(MAKE) cluster-deploy
@echo "setup-cluster completed"

@@ -32,6 +35,7 @@ cluster-deploy::
@# Help: Deploy all required tools on existing cluster
$(MAKE) .cluster-prepare
$(MAKE) .cluster-prepare-wait
ifneq ($(EXTERNAL_CLUSTER), 1)
cd $(TOOLS_DIR) && ./ingress.sh deploy
ifeq ($(DEPLOY_KUBEFLOW)$(USE_KFP_MINIO),11)
cd $(TOOLS_DIR) && ./install_minio.sh deploy
@@ -40,10 +44,16 @@ ifeq ($(POPULATE_TEST_DATA), 1)
$(MAKE) populate-data
endif
endif
endif

clean::
ifneq ($(EXTERNAL_CLUSTER), 1)
@# Help: Deleting the kind cluster
cd $(TOOLS_DIR); ./kind_management.sh delete_cluster ${KIND_CLUSTER_NAME}
else
cd $(TOOLS_DIR) && ./install_kuberay.sh cleanup
cd $(TOOLS_DIR) && ./install_kubeflow.sh cleanup
endif

build::

@@ -53,14 +63,18 @@ test::
cd $(TOOLS_DIR); ./kind_management.sh create_cluster ${KIND_CLUSTER_NAME}

.cluster-prepare::
ifneq ($(EXTERNAL_CLUSTER), 1)
cd $(TOOLS_DIR) && ./install_nginx.sh deploy
endif
cd $(TOOLS_DIR) && ./install_kuberay.sh deploy
ifeq ($(DEPLOY_KUBEFLOW),1)
cd $(TOOLS_DIR) && ./install_kubeflow.sh deploy
endif

.cluster-prepare-wait::
ifneq ($(EXTERNAL_CLUSTER), 1)
cd $(TOOLS_DIR) && ./install_nginx.sh deploy-wait
endif
cd $(TOOLS_DIR) && ./install_kuberay.sh deploy-wait
ifeq ($(DEPLOY_KUBEFLOW),1)
cd $(TOOLS_DIR) && ./install_kubeflow.sh deploy-wait
3 changes: 2 additions & 1 deletion kind/README.md
@@ -17,7 +17,8 @@ The following programs should be manually installed:
- [Kubectl](https://kubernetes.io/docs/tasks/tools/#kubectl) 1.26 or newer must be installed on your machine.
- [wget](https://www.gnu.org/software/wget/) 1.21 must be installed on your machine.
- [MinIO Client (mc)](https://min.io/docs/minio/kubernetes/upstream/index.html) must be installed on your machine. Please
choose your OS, and proceed according to "(Optional) Install the MinIO Client". You only have to install the `mc` client.
- [git client](https://git-scm.com/downloads); we use the git client to clone the installation repository.
- [lsof](https://www.ionos.com/digitalguide/server/configuration/linux-lsof/), which is usually part of Linux or macOS distributions.
- Container agent such as [Docker](https://www.docker.com/) or [Podman](https://podman-desktop.io/)

9 changes: 7 additions & 2 deletions kind/hack/populate_minio.sh
@@ -1,7 +1,12 @@
#!/usr/bin/env bash

if [ "$MINIO_SERVER" == "" ]; then
  MINIO_SERVER="http://localhost:8080"
fi

echo "creating minio alias to $MINIO_SERVER"
mc alias set kfp $MINIO_SERVER minio minio123

echo "creating test bucket"
mc mb kfp/test
echo "copying data"
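The empty-string check at the top of this script can equivalently be written with POSIX parameter expansion; a small sketch using the same variable and default:

```shell
# Fall back to the in-cluster default when MINIO_SERVER is unset or empty.
MINIO_SERVER="${MINIO_SERVER:-http://localhost:8080}"
echo "creating minio alias to $MINIO_SERVER"
```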
12 changes: 7 additions & 5 deletions kind/hack/tools/install_kubeflow.sh
@@ -4,7 +4,7 @@ op=$1

source ../common.sh

SLEEP_TIME="${SLEEP_TIME:-60}"
MAX_RETRIES="${MAX_RETRIES:-20}"
EXIT_CODE=0

@@ -13,9 +13,8 @@ deploy() {
echo "Temporary dir:"
echo "${TEMP_DIR}"
cd $TEMP_DIR
git clone https://github.com/kubeflow/pipelines.git --branch ${PIPELINE_VERSION} --single-branch
cd pipelines
kubectl apply -k manifests/kustomize/cluster-scoped-resources
kubectl wait --for condition=established --timeout=60s crd/applications.app.k8s.io
# Disable the public endpoint
@@ -36,6 +35,8 @@

wait(){
echo "Wait for kubeflow deployment."
# see https://github.com/kubeflow/pipelines/issues/5411
kubectl delete deployment -n kubeflow controller-manager
wait_for_pods "kubeflow" "$MAX_RETRIES" "$SLEEP_TIME" || EXIT_CODE=$?

if [[ $EXIT_CODE -ne 0 ]]
@@ -49,8 +50,9 @@ wait(){
}
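`wait_for_pods` is sourced from `../common.sh`, which is outside this diff. Purely as an illustration of the polling contract used above (namespace, max retries, sleep interval), a hypothetical implementation might look like:

```shell
# Hypothetical helper: poll until no pod in the namespace is in a
# non-Running phase; give up (return 1) after max_retries polls.
wait_for_pods() {
  ns=$1; max_retries=$2; sleep_time=$3
  i=0
  while [ "$i" -lt "$max_retries" ]; do
    not_ready=$(kubectl get pods -n "$ns" --no-headers 2>/dev/null | grep -cv " Running ")
    if [ "$not_ready" -eq 0 ]; then
      return 0
    fi
    i=$((i + 1))
    sleep "$sleep_time"
  done
  return 1
}
```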

delete(){
kubectl delete -k "github.com/kubeflow/pipelines/manifests/kustomize/env/dev?ref=$PIPELINE_VERSION" --ignore-not-found || true
kubectl delete -k "github.com/kubeflow/pipelines/manifests/kustomize/cluster-scoped-resources?ref=$PIPELINE_VERSION" --ignore-not-found || true
kubectl delete --ignore-not-found clusterrolebinding pipeline-runner-extend
}

usage(){
7 changes: 4 additions & 3 deletions kind/hack/tools/install_kuberay.sh
@@ -11,7 +11,7 @@ EXIT_CODE=0
deploy() {
sed -i.back "s/tag: v[0-9].*/tag: v${KUBERAY_APISERVER}/" ${ROOT_DIR}/hack/ray_api_server_values.yaml
helm repo add kuberay https://ray-project.github.io/kuberay-helm/
helm repo update kuberay
helm install kuberay-operator kuberay/kuberay-operator -n kuberay --version ${KUBERAY_OPERATOR} --set image.pullPolicy=IfNotPresent --create-namespace
helm install -f ${ROOT_DIR}/hack/ray_api_server_values.yaml kuberay-apiserver kuberay/kuberay-apiserver -n kuberay --version ${KUBERAY_APISERVER} --set image.pullPolicy=IfNotPresent
echo "Finished KubeRay deployment."
@@ -29,8 +29,9 @@ wait(){
}

delete(){
helm uninstall kuberay-operator -n kuberay || true
helm uninstall kuberay-apiserver -n kuberay || true
}

usage(){
