Add pipelines code #38

Merged: 8 commits, Aug 29, 2023

Binary file added .github/images/S3-models.png
167 changes: 167 additions & 0 deletions .gitignore
@@ -0,0 +1,167 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a Python script from a template
# before PyInstaller builds the exe, so as to inject date/other info into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/#use-with-ide
.pdm.toml

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

*-secret.yaml
*-secret.yml
aws-storage-config
tekton/azureml-container-pipeline/azureml-container-aws-creds-real.yaml
tekton/azureml-container-pipeline/aws-env-real.yaml
oc-debug-pod.yaml
22 changes: 14 additions & 8 deletions README.md
@@ -1,20 +1,21 @@
# ODH AI Edge Use Cases

Artifacts in support of ODH Edge use cases that integrate with Red Hat Advanced Cluster Management (Open Cluster Management)

| Components | Version |
|--------------------------------------|---------|
| OpenShift | 4.13 |
| Open Data Hub | 2.x |
| Red Hat Advanced Cluster Management | 2.8 |
| OpenShift Pipelines | 1.11.x |
| Quay Registry | 2.8 |


## Proof of Concept Edge use case with ACM

The main objective is to showcase that a user can take a trained model, use a pipeline to package it with all the dependencies and deploy it at the near edge location(s) in a centralized way.

### Infrastructure Configuration

1. Provision OpenShift Cluster
1. Configure the default Identity Provider
1. Install Red Hat Advanced Cluster Management
@@ -29,18 +30,23 @@ The main objective is to showcase that a user can take a trained model, use a pi
* Deploy the Model container

### MLOps Engineer workflows

1. Develop the model in an ODH Jupyter notebook
1. Build the model from the notebook using Data Science Pipelines
1. Push the model to the image registry accessible by the near edge cluster(s)
1. Update the GitOps config for the near edge cluster

### Pipelines setup

See [pipelines/README.md](pipelines/README.md)

### Observability setup

* Core cluster
  * Log in to the core cluster and run `make install/observability-core` to set up acm-observability on the core cluster.
* Edge cluster(s)
  * Log in to the edge cluster
  * Enable userWorkloadMonitoring
    * `oc edit cm cluster-monitoring-config`
    * Set variable `enableUserWorkload` to `true`
  * Run `make install/observability-edge` to create the ConfigMap required for metric whitelisting.
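
For reference, once user-workload monitoring is enabled the ConfigMap edited above should contain roughly the following (a minimal sketch; keep any other settings already present in your `config.yaml`):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
```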
78 changes: 78 additions & 0 deletions pipelines/README.md
@@ -0,0 +1,78 @@
# Pipelines Setup

## Models

These pipelines come with two trained example MLflow models, [bike-rentals-auto-ml](models/bike-rentals-auto-ml/) and [tensorflow-housing](models/tensorflow-housing/):

```plaintext
bike-rentals-auto-ml/
├── conda.yaml
├── MLmodel
├── model.pkl
├── python_env.yaml
└── requirements.txt

tensorflow-housing/
├── conda.yaml
├── MLmodel
├── model.pkl
├── python_env.yaml
├── requirements.txt
└── tf2model/
├── saved_model.pb
└── ...
```

## Prerequisites

- Trained AzureML model including MLflow environment (see above)
- OpenShift cluster with [OpenShift Pipelines Operator](https://docs.openshift.com/container-platform/4.13/cicd/pipelines/installing-pipelines.html) installed
- OpenShift project / namespace. E.g. `oc new-project azureml-model-to-edge`
- A repository on [Quay.io](https://quay.io/)
- S3 bucket for storing the models
- A clone of this repository
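
For the S3 prerequisite, the bucket can be created and the example models uploaded with the AWS CLI, roughly as follows (the bucket name is a placeholder; adjust the destination prefixes to the layout your `PipelineRun` expects, as in the screenshot further below):

```bash
# Placeholder bucket name; run from the repository root.
aws s3 mb s3://my-edge-models
aws s3 cp --recursive pipelines/models/tensorflow-housing/ s3://my-edge-models/tensorflow-housing/
aws s3 cp --recursive pipelines/models/bike-rentals-auto-ml/ s3://my-edge-models/bike-rentals-auto-ml/
```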

## Deploy AzureML Container build pipeline

> **NOTE** Run `cd tekton` before running the commands below; that is where the pipeline files live.

### Provide S3 credentials

After creating the S3 bucket and uploading the models (see above), fill out `aws-env.yaml` and create the secret:

![S3 models example](../.github/images/S3-models.png)

```bash
oc create -f azureml-container-pipeline/aws-env.yaml
```
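
The exact keys in `aws-env.yaml` are defined by the template shipped in this repository; purely as an illustration, a secret of this kind usually carries the bucket credentials as opaque key/value pairs, along these lines:

```yaml
# Illustrative sketch only -- use the aws-env.yaml template from this
# repository and its actual key names; the values below are placeholders.
apiVersion: v1
kind: Secret
metadata:
  name: aws-env
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: <access-key-id>
  AWS_SECRET_ACCESS_KEY: <secret-access-key>
  AWS_DEFAULT_REGION: <bucket-region>
```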

### Deploy and run the build pipeline

> **NOTE** Make sure to change the `aws-bucket-name` parameter to match your AWS bucket name if using one of the provided `PipelineRun` files.

```bash
oc apply -k azureml-container-pipeline/
oc create -f azureml-container-pipeline/azureml-container-pipelinerun-tensorflow-housing.yaml
```
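
The run can then be followed from the command line, for example:

```bash
oc get pipelinerun -w
# or, if the tkn CLI is installed:
tkn pipelinerun logs --last -f
```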

## Deploy Test MLflow Container image pipeline

### Quay Repository and Robot Permissions

- Create a repository, add a robot account that can push images, and grant the robot write permission on the repository.
- Download `build-secret.yml`
- Apply the build secret, e.g.:

```bash
oc apply -f <downloaddir>/rhoai-edge-build-secret.yml
oc secret link pipeline rhoai-edge-build-pull-secret
```
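
To confirm the link took effect, the secret should now show up on the `pipeline` service account, e.g.:

```bash
oc describe sa pipeline   # the build secret should be listed under "Mountable secrets"
```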

### Deploy and run the test pipeline

> **NOTE** Make sure to change the `target-imagerepo` parameter to match the name of your Quay namespace if using one of the provided `PipelineRun` files.

```bash
oc apply -k test-mlflow-image-pipeline/
oc create -f test-mlflow-image-pipeline/test-mlflow-image-pipelinerun-tensorflow-housing.yaml
```
25 changes: 25 additions & 0 deletions pipelines/containerfiles/Containerfile.openvino.mlserver.mlflow
@@ -0,0 +1,25 @@
# Create the OpenVINO Model Server (OVMS) container.
#
FROM docker.io/openvino/model_server:latest

USER root

RUN mkdir /models && chown ovms:ovms /models

# CHANGE THIS LINE TO MATCH YOUR MODEL
COPY --chown=ovms:ovms tensorflow-housing/tf2model /models/1
RUN rm -f /models/1/fingerprint.pb

RUN chmod o+rwX /models/1
# https://docs.openshift.com/container-platform/4.13/openshift_images/create-images.html#use-uid_create-images
RUN chgrp -R 0 /models/1 && chmod -R g=u /models/1

# https://stackoverflow.com/a/41207910/19020549
# ENTRYPOINT ["/usr/bin/env"]

EXPOSE 9090 8080

USER ovms

# CHANGE THIS LINE TO MATCH YOUR MODEL
CMD ["/ovms/bin/ovms", "--model_path", "/models", "--model_name", "tensorflow-housing", "--port", "9090", "--rest_port", "8080", "--shape", "auto", "--metrics_enable"]
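
The resulting image can be smoke-tested locally before wiring it into the pipeline. A rough sketch, assuming podman, a build context of `pipelines/models/` (so that `tensorflow-housing/tf2model` resolves), and an example tag:

```bash
cd pipelines/models
podman build -t tf-housing-ovms:test \
  -f ../containerfiles/Containerfile.openvino.mlserver.mlflow .
podman run -d --rm --name ovms-test -p 8080:8080 -p 9090:9090 tf-housing-ovms:test
sleep 5
# Model status via the TensorFlow Serving-compatible REST API; should report AVAILABLE
curl http://localhost:8080/v1/models/tensorflow-housing
podman stop ovms-test
```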
48 changes: 48 additions & 0 deletions pipelines/containerfiles/Containerfile.seldonio.mlserver.mlflow
@@ -0,0 +1,48 @@
FROM registry.access.redhat.com/ubi9/python-39:1 as env-creator

USER root

# Install miniconda as a helper to create a portable python environment
RUN mkdir -p ~/miniconda3 && \
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda3/miniconda.sh && \
bash ~/miniconda3/miniconda.sh -b -u -p ~/miniconda3 && \
rm -rf ~/miniconda3/miniconda.sh

# CHANGE THIS LINE TO MATCH YOUR MODEL
COPY bike-rentals-auto-ml/ /opt/app-root/src/model/

# Download model dependencies and create a portable tarball
# The tarball is placed inside the model directory.
RUN . ~/miniconda3/bin/activate && \
conda env create -n mlflow-env -f model/conda.yaml && \
conda activate mlflow-env && \
pip install mlserver-mlflow && \
conda list && \
conda deactivate && \
conda activate && \
conda install conda-pack && \
conda-pack -n mlflow-env -o model/environment.tar.gz

# Create the MLServer container. Use the slim image, since we are providing an environment tarball.
#
FROM docker.io/seldonio/mlserver:1.3.5-slim

USER root

RUN mkdir /mnt/models/ && chown mlserver:mlserver /mnt/models/

# Copy the model together with its environment tarball.
COPY --from=env-creator --chown=mlserver:mlserver /opt/app-root/src/model /mnt/models/

RUN chmod o+rwX /mnt/models/
# https://docs.openshift.com/container-platform/4.13/openshift_images/create-images.html#use-uid_create-images
RUN chgrp -R 0 /mnt/models/ && chmod -R g=u /mnt/models/

# Specify that the model is in MLflow format, and some additional flags.
ENV MLSERVER_MODEL_IMPLEMENTATION=mlserver_mlflow.MLflowRuntime MLSERVER_HTTP_PORT=8080 MLSERVER_GRPC_PORT=9090
# CHANGE THIS LINE TO MATCH YOUR MODEL
ENV MLSERVER_MODEL_URI=/mnt/models MLSERVER_MODEL_NAME=bike-rentals-auto-ml

EXPOSE 8080 9090

USER mlserver
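
As with the OpenVINO image, a quick local smoke test is possible. A sketch under the same assumptions (podman, `pipelines/models/` as the build context, example tag); the first request may take a moment while MLServer unpacks the environment tarball:

```bash
cd pipelines/models
podman build -t bike-rentals-mlserver:test \
  -f ../containerfiles/Containerfile.seldonio.mlserver.mlflow .
podman run -d --rm --name mlserver-test -p 8080:8080 -p 9090:9090 bike-rentals-mlserver:test
sleep 30
# KServe V2 readiness endpoints exposed by MLServer
curl http://localhost:8080/v2/health/ready
curl http://localhost:8080/v2/models/bike-rentals-auto-ml/ready
podman stop mlserver-test
```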