KFPv2 support step 1 #226

Merged · 68 commits · Jun 8, 2024
6108300
Initial version.
revit13 May 13, 2024
0d4b3f7
Comment compute_exec_params_op.
revit13 May 15, 2024
b2ad17f
separate lib for compile only dependencies
roytman May 21, 2024
cbe07d2
merge with dev
roytman May 29, 2024
ea8e8c4
add missing files
roytman May 30, 2024
b6e280e
remove kfp/kfp_support_lib_v2/src/kfp_support/workflow_support_v2/
roytman May 30, 2024
997a1ed
Update kfp_support_lib_v2.
revit13 May 29, 2024
df5ff64
Fixes.
revit13 May 29, 2024
228a8a5
More fixes.
revit13 May 30, 2024
77713d0
revert 3 latest commits
roytman May 30, 2024
433dcae
add KFPv2 installation setting
roytman May 31, 2024
815f0ca
before rebase
roytman Jun 2, 2024
eeadb25
add kfp_support_lib/Makefile
roytman Jun 2, 2024
e343bdb
kfpv1 intermediate impelmentation
roytman Jun 2, 2024
9fc3070
lib fixes
roytman Jun 2, 2024
d67d8a9
fix import
roytman Jun 2, 2024
0417995
fix python_appserver_client tests
roytman Jun 3, 2024
d280247
some fixes
roytman Jun 3, 2024
2ae0984
Fixes after testing.
revit13 Jun 3, 2024
96f8dd4
update noop kfpv2
roytman Jun 3, 2024
a0028b2
Fixes after testing.
revit13 Jun 3, 2024
47c4c68
Fix compute_exec_params
revit13 Jun 3, 2024
8ff9434
Minor fix.
revit13 Jun 3, 2024
de204fa
Merge pull request #223 from revit13/test1
roytman Jun 3, 2024
97e51c1
add input paramters to the pipeline
roytman Jun 4, 2024
e5127bd
Change kfp_v1_workflow_support.
revit13 Jun 4, 2024
efe3ab1
Merge pull request #227 from revit13/lib1
roytman Jun 4, 2024
8627a46
remove Dockerfile_v2, update .make.transforms_workflows
roytman Jun 4, 2024
245610d
first universal pipeline
roytman Jun 4, 2024
3ada157
move default_compute_execution_params
roytman Jun 4, 2024
e92803a
fix imports in ray_components
roytman Jun 4, 2024
b815751
Fixes after testing.
revit13 Jun 5, 2024
48a5165
add noop_multiple_wf.py
roytman Jun 5, 2024
6e35f4d
More fixes.
revit13 Jun 5, 2024
76ea8f9
Minor fix.
revit13 Jun 5, 2024
caa1de3
fix run job id
roytman Jun 5, 2024
5fa1596
More fixes.
revit13 Jun 5, 2024
e416148
Address code review.
revit13 Jun 5, 2024
ecc3bf1
Merge branch 'dev' into kfp-v2
revit13 Jun 5, 2024
55a8f64
Merge pull request #232 from revit13/change1
revit13 Jun 5, 2024
cc3deca
Minor fixes.
revit13 Jun 5, 2024
8ae25d3
Additional fixes.
revit13 Jun 5, 2024
1ea0cf5
Fix transform test.
revit13 Jun 5, 2024
4002368
noop
roytman Jun 5, 2024
6f1b417
Fix noop
roytman Jun 5, 2024
3bdf6a7
image name fixes, dock_id
roytman Jun 5, 2024
5da574b
Adjust tokenization and filter workflows.
revit13 Jun 5, 2024
2edfd67
fix tests
roytman Jun 5, 2024
d4fb2d5
Fixes after testing.
revit13 Jun 5, 2024
c3930e7
More fixes.
revit13 Jun 5, 2024
09ea1cc
fix doc_id and tests
roytman Jun 5, 2024
d0d2f59
Minor fixes.
revit13 Jun 5, 2024
5e15040
fix .make.workflows
roytman Jun 5, 2024
5db777a
Fixes after testing.
revit13 Jun 5, 2024
33a3d97
fix ededup
roytman Jun 6, 2024
b31b04c
fix ededup2
roytman Jun 6, 2024
6e76bd8
fix ededup3
roytman Jun 6, 2024
3a07e68
Adjuest code workflows.
revit13 Jun 5, 2024
09ba6bc
Fixes after testing.
revit13 Jun 6, 2024
5a17ebf
More fixes.
revit13 Jun 6, 2024
a1f759f
fix ededup
roytman Jun 6, 2024
36f6d92
fix ededup2
roytman Jun 6, 2024
fc21bb8
fix ededup3
roytman Jun 6, 2024
3892937
fix fdedup
roytman Jun 6, 2024
bb834ec
fix fdedup2
roytman Jun 6, 2024
f8ead56
fix comments
roytman Jun 7, 2024
53f3d3b
add comments in Makefiles
roytman Jun 7, 2024
74647f8
add the non-unique ray cluster ID warning for KFPv2
roytman Jun 7, 2024
8 changes: 5 additions & 3 deletions .github/workflows/test.yml
@@ -62,7 +62,7 @@ jobs:
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Test KFP lib
- name: Test KFP v1 lib
run: |
source kind/requirements.env
export PATH=$PATH:/tmp/
@@ -93,7 +93,7 @@ jobs:
sudo rm -rf /usr/share/dotnet /opt/ghc /usr/local/lib/android /usr/local/share/powershell /usr/share/swift /usr/lib/jvm /usr/local/.ghcup
sudo docker rmi $(docker image ls -aq) >/dev/null 2>&1 || true
df -h
- name: Test KFP worflow run
- name: Test KFP v1 worflow run
timeout-minutes: 120
run: |
source kind/requirements.env
@@ -108,6 +108,8 @@ jobs:
chmod 777 /tmp/kubectl
curl https://dl.min.io/client/mc/release/linux-amd64/mc --create-dirs -o /tmp/mc
chmod +x /tmp/mc
export DEPLOY_KUBEFLOW=1
make -C kind setup
make -C transforms workflow-build
make -C kfp/kfp_support_lib test
make -C transforms/universal/noop/ workflow-build
make -C transforms/universal/noop workflow-test
20 changes: 11 additions & 9 deletions .make.defaults
@@ -53,6 +53,9 @@ KIND_CLUSTER_NAME=dataprep
DPK_PYTHON_LIB_DIR=$(REPOROOT)/data-processing-lib/python
DPK_RAY_LIB_DIR=$(REPOROOT)/data-processing-lib/ray
DPK_SPARK_LIB_DIR=$(REPOROOT)/data-processing-lib/spark

KFPv2?=0

#######################################################################################
# Lists all targets and optional help text found in the target.
# Adapted from https://stackoverflow.com/a/65243296/45375
@@ -200,7 +203,7 @@ __check_defined = \
cp -p -R ${LIB_PATH}/README.md ${LIB_NAME}

# Build and image using the local Dockerfile and make the data-processing-lib/python
# available in the current directory for use by the Dockerfile (i.e. to install the library).
# available in the current directory for use by the Dockerfile (i.e. to install the library).
.PHONY: .defaults.python-lib-src-image
.defaults.python-lib-src-image:: # Must be called with a DOCKER_LOCAL_IMAGE= settings.
@# Help: Build the Python $(DOCKER_LOCAL_IMAGE) using the $(DOCKER_FILE), requirements.txt and data-processing-lib/python source
@@ -261,8 +264,8 @@ __check_defined = \

# Install all source from the repo for a python runtime transform into an existing venv
.PHONY: .defaults.install-python-lib-src-venv
.defaults.install-python-lib-src-venv::
@# Help: Install Python data processing library source into existing venv
.defaults.install-python-lib-src-venv::
@# Help: Install Python data processing library source into existing venv
@echo Installing Python data processing library source to existing venv
@source venv/bin/activate; \
pip install pytest; \
@@ -276,8 +279,8 @@ __check_defined = \

# Install all source from the repo for a ray runtime transform into an existing venv
.PHONY: .defaults.install-ray-lib-src-venv
.defaults.install-ray-lib-src-venv::
@# Help: Install Ray and Python data processing library source into existing venv
.defaults.install-ray-lib-src-venv::
@# Help: Install Ray and Python data processing library source into existing venv
@echo Installing Ray and Python data processing library source to existing venv
@source venv/bin/activate; \
pip install pytest; \
@@ -291,11 +294,10 @@ __check_defined = \
.PHONY: .defaults.spark-lib-src-venv
.defaults.spark-lib-src-venv:: .defaults.create-venv .defaults.install-spark-lib-src-venv .defaults.install-local-requirements-venv

# Install all source from the repo for a spark runtime transform into an existing venv
# Install the python-based lib BEFORE spark assuming spark depends on the same version as python source.
.PHONY: .defaults.install-spark-lib-src-venv
.defaults.install-spark-lib-src-venv::
@# Help: Install Spark and Python data processing library source into existing venv
@echo ""
.defaults.install-spark-lib-src-venv::
@# Help: Install Spark and Python data processing library source into existing venv
@echo Installing Spark and Python data processing library source to existing venv
@source venv/bin/activate; \
pip install pytest; \
1 change: 1 addition & 0 deletions .make.versions
@@ -13,6 +13,7 @@ RELEASE_VERSION_SUFFIX=.dev6
# Data prep lab wheel version
DPK_LIB_VERSION=0.2.0$(RELEASE_VERSION_SUFFIX)
DPK_LIB_KFP_VERSION=0.2.0$(RELEASE_VERSION_SUFFIX)
DPK_LIB_KFP_VERSION_v2=0.2.0$(RELEASE_VERSION_SUFFIX)

# Begin transform versions/tags
BLOCKLIST_VERSION=0.4.2$(RELEASE_VERSION_SUFFIX)
9 changes: 8 additions & 1 deletion kfp/doc/setup.md
@@ -66,7 +66,14 @@ choose your OS system, and process according to "(Optional) Install the MinIO Cl

## Installation steps <a name = "installation"></a>

You can create a Kind cluster with all required software installed using the following command:
Before installation, you have to decide which KFP version you want to use.
To use KFP v2, set the following environment variable:

```shell
export KFPv2=1
```

Now, you can create a Kind cluster with all required software installed using the following command:

```shell
make setup
14 changes: 7 additions & 7 deletions kfp/doc/simple_transform_pipeline.md
@@ -34,16 +34,16 @@ Note: the project and the explanation below are based on [KFPv1](https://www.kub
* Pipeline wiring - definition of the sequence of invocation (with parameter passing) of participating components
* Additional configuration

### Imports definition <a name = "imports"></a>
### Imports definition <a name = "imports"></a>

```python
import kfp.compiler as compiler
import kfp.components as comp
import kfp.dsl as dsl
from kfp_support.workflow_support.utils import (
ONE_HOUR_SEC,
ONE_WEEK_SEC,
ComponentUtils,
from kfp_support.workflow_support.runtime_utils import (
ONE_HOUR_SEC,
ONE_WEEK_SEC,
ComponentUtils,
)
from kubernetes import client as k8s_client
```
@@ -73,8 +73,8 @@ Ray cluster. For each step we have to define a component that will execute them:
Note: here we are using shared components described in this [document](../kfp_ray_components/README.md) for `create_ray_op`,
`execute_ray_jobs_op` and `cleanup_ray_op`, while `compute_exec_params_op` component is built inline, because it might
differ significantly. For "simple" pipeline cases we can use the
[default implementation](../kfp_support_lib/src/kfp_support/workflow_support/utils/remote_jobs_utils.py),
while, for example for exact dedup, we are using a very [specialized one](../transform_workflows/universal/ededup/src/ededup_compute_execution_params.py).
[default implementation](../kfp_support_lib/src/kfp_support/workflow_support/runtime_utils/remote_jobs_utils.py),
while, for example for exact dedup, we are using a very [specialized one](../../transforms/universal/ededup/kfp_ray/v2/src/ededup_compute_execution_params.py).

### Input parameters definition <a name = "inputs"></a>

30 changes: 20 additions & 10 deletions kfp/kfp_ray_components/Dockerfile
@@ -1,25 +1,35 @@
FROM docker.io/rayproject/ray:2.9.3-py310

ARG BUILD_DATE
ARG GIT_COMMIT

LABEL build-date=$BUILD_DATE
LABEL git-commit=$GIT_COMMIT

# install libraries
COPY requirements.txt requirements.txt
RUN pip install --no-cache-dir -r requirements.txt

# Copy and install data processing libraries
# Copy and install data processing libraries
# These are expected to be placed in the docker context before this is run (see the make image).
COPY --chown=ray:users data-processing-lib-python/ data-processing-lib-python/
RUN cd data-processing-lib-python && pip install --no-cache-dir -e .
COPY --chown=ray:users data-processing-lib-ray/ data-processing-lib-ray/
COPY --chown=ray:users data-processing-lib-ray/ data-processing-lib-ray/
RUN cd data-processing-lib-ray && pip install --no-cache-dir -e .

COPY --chown=ray:users kfp_support_lib/ kfp_support_lib/
RUN cd kfp_support_lib && pip install --no-cache-dir -e .
COPY --chown=ray:users python_apiserver_client python_apiserver_client/
RUN cd python_apiserver_client && pip install --no-cache-dir -e .

COPY --chown=ray:users workflow_support_lib workflow_support_lib/
RUN cd workflow_support_lib && pip install --no-cache-dir -e .

# overwriting the installation of old versions of pydantic
RUN pip install --no-cache-dir pydantic==2.6.3

# remove credentials-containing file
RUN rm requirements.txt
# components
COPY ./src /pipelines/component/src

# Set environment
ENV KFP_v2=$KFP_v2

# Put these at the end since they seem to upset the docker cache.
ARG BUILD_DATE
ARG GIT_COMMIT
LABEL build-date=$BUILD_DATE
LABEL git-commit=$GIT_COMMIT
36 changes: 25 additions & 11 deletions kfp/kfp_ray_components/Makefile
@@ -2,26 +2,39 @@
# # know where they are running from.
REPOROOT=../..

# Include the common rules.
# Use "make help" to see them.
include $(REPOROOT)/.make.defaults

IGNORE := $(shell bash -c "sed -n /=/p ${REPOROOT}/kfp/requirements.env | sed 's/=/:=/' | sed 's/^/export /' > makeenv")

include makeenv
DOCKER_FILE=Dockerfile

ifeq ($(KFPv2), 1)
DOCKER_IMAGE_NAME=kfp-data-processing_v2
DOCKER_IMAGE_VERSION=${KFP_DOCKER_VERSION_v2}
WORKFLOW_SUPPORT_LIB=kfp_v2_workflow_support
else
DOCKER_IMAGE_NAME=kfp-data-processing
DOCKER_IMAGE_VERSION=${KFP_DOCKER_VERSION}
WORKFLOW_SUPPORT_LIB=kfp_v1_workflow_support
endif

# Include the common rules.
# Use "make help" to see them.
include $(REPOROOT)/.make.defaults

#DOCKER_IMG=${DOCKER_HOSTNAME}/${DOCKER_NAMESPACE}/${DOCKER_IMAGE_NAME}:${DOCKER_IMAGE_VERSION}
DOCKER_IMG=$(DOCKER_LOCAL_IMAGE)

.PHONY: .lib-src-image
.lib-src-image::
$(MAKE) .defaults.copy-lib LIB_PATH=$(DPK_RAY_LIB_DIR) LIB_NAME=data-processing-lib-ray
$(MAKE) .defaults.copy-lib LIB_PATH=$(DPK_PYTHON_LIB_DIR) LIB_NAME=data-processing-lib-python
$(MAKE) .defaults.copy-lib LIB_PATH=$(REPOROOT)/kfp/kfp_support_lib LIB_NAME=kfp_support_lib
$(MAKE) .defaults.copy-lib LIB_PATH=$(REPOROOT)/kfp/kfp_support_lib/python_apiserver_client LIB_NAME=python_apiserver_client
$(MAKE) .defaults.copy-lib LIB_PATH=$(REPOROOT)/kfp/kfp_support_lib/$(WORKFLOW_SUPPORT_LIB) LIB_NAME=workflow_support_lib
$(MAKE) .defaults.image
rm -rf data-processing-lib-ray
rm -rf data-processing-lib-python
rm -rf kfp_support_lib
rm -rf python_apiserver_client
rm -rf workflow_support_lib

.PHONY: image
image: Dockerfile requirements.txt
@@ -34,11 +47,12 @@ set-versions:: reconcile-requirements
.PHONY: reconcile-requirements
reconcile-requirements:
@# Help: Update yaml files to build images tagged as version $(KFP_DOCKER_VERSION)
sed -i.back "s/kfp-data-processing:[0-9].*/kfp-data-processing:${KFP_DOCKER_VERSION}/" createRayClusterComponent.yaml
sed -i.back "s/kfp-data-processing:[0-9].*/kfp-data-processing:${KFP_DOCKER_VERSION}/" deleteRayClusterComponent.yaml
sed -i.back "s/kfp-data-processing:[0-9].*/kfp-data-processing:${KFP_DOCKER_VERSION}/" executeRayJobComponent.yaml
sed -i.back "s/kfp-data-processing:[0-9].*/kfp-data-processing:${KFP_DOCKER_VERSION}/" executeRayJobComponent_multi_s3.yaml
sed -i.back "s/kfp-data-processing:[0-9].*/kfp-data-processing:${KFP_DOCKER_VERSION}/" executeSubWorkflowComponent.yaml
sed -i.back "s/kfp-data-processing.*:[0-9].*/$(DOCKER_IMAGE_NAME):${KFP_DOCKER_VERSION}/" createRayClusterComponent.yaml
sed -i.back "s/kfp-data-processing.*:[0-9].*/$(DOCKER_IMAGE_NAME):${KFP_DOCKER_VERSION}/" deleteRayClusterComponent.yaml
sed -i.back "s/kfp-data-processing.*:[0-9].*/$(DOCKER_IMAGE_NAME):${KFP_DOCKER_VERSION}/" executeRayJobComponent.yaml
sed -i.back "s/kfp-data-processing.*:[0-9].*/$(DOCKER_IMAGE_NAME):${KFP_DOCKER_VERSION}/" executeRayJobComponent_multi_s3.yaml
# TODO remove it for KFPv2
sed -i.back "s/kfp-data-processing*:[0-9].*/$(DOCKER_IMAGE_NAME):${KFP_DOCKER_VERSION}/" executeSubWorkflowComponent.yaml

.PHONY: load-image
load-image:
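The updated `reconcile-requirements` target widens the sed pattern from `kfp-data-processing:` to `kfp-data-processing.*:` so that both the v1 and v2 image names get retagged in the component YAMLs. A Python sketch of what that substitution does (the sample YAML line is illustrative, not from the repo):

```python
import re

def retag(line: str, image: str, version: str) -> str:
    """Replace any kfp-data-processing[...]:tag reference with image:version,
    as the Makefile's `sed "s/kfp-data-processing.*:[0-9].*/.../"` does."""
    return re.sub(r"kfp-data-processing.*:[0-9].*", f"{image}:{version}", line)

line = "    image: quay.io/org/kfp-data-processing:0.1.0"  # illustrative
print(retag(line, "kfp-data-processing_v2", "0.2.0"))
# → "    image: quay.io/org/kfp-data-processing_v2:0.2.0"
```

Because `.*` also matches `_v2`, one pattern serves both image-name variants; the tradeoff is that the regex is greedy and rewrites everything from the image name to the end of the line, which is fine for these single-image YAML lines.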
2 changes: 1 addition & 1 deletion kfp/kfp_ray_components/executeRayJobComponent.yaml
@@ -6,7 +6,7 @@ inputs:
- { name: run_id, type: String, description: "The KFP Run ID" }
- { name: additional_params, type: String, description: "additional parameters" }
# The component converts the dictionary to json string
- { name: exec_params, type: dict, description: "job parameters" }
- { name: exec_params, type: JsonObject, description: "job parameters" }
- { name: exec_script_name, type: String, description: "transform script name" }
- { name: server_url, type: String, default: "", description: "url of api server" }

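The `exec_params` input changes type from `dict` to `JsonObject` here; per the comment kept in the YAML, the component serializes the dictionary to a JSON string before handing it to the Ray job. A minimal sketch of that conversion (the function name and parameter keys are hypothetical, chosen only for illustration):

```python
import json

def exec_params_to_json(exec_params: dict) -> str:
    """Serialize pipeline exec params to the JSON string the Ray job receives."""
    return json.dumps(exec_params)

# Illustrative parameter dict; the real keys come from the pipeline definition.
params = {"num_workers": 5, "worker_options": {"cpu": 1, "memory": 4}}
serialized = exec_params_to_json(params)
assert json.loads(serialized) == params  # round-trips losslessly
```

Typing the input as `JsonObject` lets the KFP v2 backend validate and pass the structure natively, while the component still flattens it to a string at the boundary to the Ray job.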
@@ -8,7 +8,7 @@ inputs:
- { name: server_url, type: String, default: "", description: "url of api server" }
- { name: prefix, type: String, default: "", description: "prefix for extra credentials" }
# The component converts the dictionary to json string
- { name: exec_params, type: dict, description: "job parameters" }
- { name: exec_params, type: JsonObject, description: "job parameters" }
- { name: additional_params, type: String, description: "additional parameters" }

implementation:
4 changes: 1 addition & 3 deletions kfp/kfp_ray_components/src/create_ray_cluster.py
@@ -9,10 +9,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################

import sys

from kfp_support.workflow_support.utils import KFPUtils, RayRemoteJobs
from workflow_support.runtime_utils import KFPUtils, RayRemoteJobs


def start_ray_cluster(
5 changes: 1 addition & 4 deletions kfp/kfp_ray_components/src/delete_ray_cluster.py
@@ -9,11 +9,8 @@
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################

import sys

from kfp_support.workflow_support.utils import KFPUtils, RayRemoteJobs

from workflow_support.runtime_utils import KFPUtils, RayRemoteJobs

# Cleans and shutdowns the Ray cluster
def cleanup_ray_cluster(
4 changes: 1 addition & 3 deletions kfp/kfp_ray_components/src/execute_ray_job.py
@@ -9,9 +9,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
################################################################################

from kfp_support.workflow_support.utils import KFPUtils, execute_ray_jobs

from workflow_support.runtime_utils import KFPUtils, execute_ray_jobs

if __name__ == "__main__":
import argparse
3 changes: 1 addition & 2 deletions kfp/kfp_ray_components/src/execute_ray_job_multi_s3.py
@@ -10,8 +10,7 @@
# limitations under the License.
################################################################################

from kfp_support.workflow_support.utils import KFPUtils, execute_ray_jobs

from workflow_support.runtime_utils import KFPUtils, execute_ray_jobs

if __name__ == "__main__":
import argparse
6 changes: 4 additions & 2 deletions kfp/kfp_ray_components/src/subworkflow.py
@@ -1,9 +1,11 @@
import sys

from data_processing.utils.params_utils import ParamsUtils
from kfp_support.workflow_support.utils import KFPUtils, PipelinesUtils
from workflow_support.runtime_utils import KFPUtils
from workflow_support.pipeline_utils import PipelinesUtils


from data_processing.utils import ParamsUtils

def invoke_sub_workflow(
name: str, # workflow name
prefix: str, # workflow arguments prefix
Expand Down