chore(sdk): extract DSL into kfp-dsl package #9738

Merged
2 changes: 1 addition & 1 deletion sdk/python/build.sh
@@ -21,7 +21,7 @@
 # ./build.sh [output_file]


-target_archive_file=${1:-kfp.tar.gz}
+target_archive_file=$1

 pushd "$(dirname "$0")"
 dist_dir=$(mktemp -d)
4 changes: 4 additions & 0 deletions sdk/python/install_from_source.sh
@@ -0,0 +1,4 @@
#!/bin/bash

pip3 install -e sdk/python/kfp-dsl
pip3 install -e sdk/python
25 changes: 25 additions & 0 deletions sdk/python/kfp-dsl/README.md
@@ -0,0 +1,25 @@
## kfp-dsl package

`kfp-dsl` is a subpackage of the KFP SDK that is released separately to provide a minimal-dependency runtime package for Lightweight Python Components. **`kfp-dsl` should not be installed and used directly.**

`kfp-dsl` enables the KFP runtime code and objects to be installed at Lightweight Python Component runtime without needing to install the full KFP SDK package.

### Release
`kfp-dsl` should be released immediately prior to each full `kfp` release. The version of `kfp-dsl` should match the version of `kfp` that depends on it.

### Development
To develop on `kfp` with a version of `kfp-dsl` built from source, run the following from the repository root:

```sh
source sdk/python/install_from_source.sh
```

**Note:** Modules in the `kfp-dsl` package are only permitted to have *top-level* imports from the Python standard library, the `typing-extensions` package, and the `kfp-dsl` package itself. Imports from other subpackages of the main `kfp` package or its transitive dependencies must be nested within functions to avoid runtime import errors when only `kfp-dsl` is installed.

### Testing
The `kfp-dsl` code is tested alongside the full KFP SDK in `sdk/python/kfp/dsl-test`, because many of the DSL tests require the full KFP SDK to be installed (e.g., to create and compile a component or pipeline).

There are also dedicated `kfp-dsl` tests in `./sdk/python/kfp-dsl/runtime_tests/`, which test the runtime code in `kfp-dsl` and should *not* be run with the full KFP SDK installed. Specifically, these tests ensure:
* That KFP runtime logic is correct
* That `kfp-dsl` specifies all of its dependencies (i.e., no module not found errors from missing `kfp-dsl` dependencies)
* That `kfp-dsl` dependencies on the main `kfp` package have their imports nested inside functions (i.e., no module not found errors from missing `kfp` dependencies)
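
The import rule from the Development note lends itself to a mechanical check. The following is a minimal sketch and not code from this PR: the helper names (`find_disallowed_top_level_imports`, `_is_allowed`), the example source, and the allow-list are all assumptions made for illustration. It parses a module's source with `ast` and reports imports that sit directly at module top level and fall outside the allow-list, while ignoring imports nested inside functions.

```python
import ast
from typing import Iterable, List


def _is_allowed(module: str, allowed: Iterable[str]) -> bool:
    """True if `module` equals an allowed name or lives underneath one."""
    return any(module == a or module.startswith(a + '.') for a in allowed)


def find_disallowed_top_level_imports(source: str,
                                      allowed: Iterable[str]) -> List[str]:
    """Returns names imported at module top level that are not allowed.

    Only direct children of the module body are inspected, so imports
    nested inside functions are ignored.
    """
    allowed = list(allowed)
    disallowed = []
    for node in ast.parse(source).body:
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.level == 0:
            # `from kfp import dsl` is treated as the dotted name `kfp.dsl`.
            names = [f'{node.module}.{alias.name}' for alias in node.names]
        else:
            continue
        disallowed.extend(n for n in names if not _is_allowed(n, allowed))
    return disallowed


# Hypothetical module source: allowed top-level imports, one disallowed
# top-level import, and one nested import that the check ignores.
EXAMPLE_SOURCE = '''
import warnings
import docstring_parser
from kfp import dsl
from kfp.dsl import structures

def build():
    from kfp.compiler import pipeline_spec_builder
'''

# Assumed allow-list for illustration; a real check would cover the whole
# standard library rather than naming modules one by one.
ALLOWED = ['warnings', 'typing', 'typing_extensions', 'kfp.dsl']

print(find_disallowed_top_level_imports(EXAMPLE_SOURCE, ALLOWED))
```

The sketch deliberately looks only at direct children of the module body, so an import wrapped in a top-level `try`/`except` is not inspected. The runtime tests described above catch the same class of problem by actually importing the package in an environment without the full SDK.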
30 changes: 30 additions & 0 deletions sdk/python/kfp-dsl/build.sh
@@ -0,0 +1,30 @@
#!/bin/bash -ex
#
# Copyright 2018 The Kubeflow Authors
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.


# This script creates the Kubeflow Pipelines Python SDK package.
#
# Usage:
# ./build.sh [output_file]


target_archive_file=$1

pushd "$(dirname "$0")"
dist_dir=$(mktemp -d)
python3 setup.py sdist --format=gztar --dist-dir "$dist_dir"
cp "$dist_dir"/*.tar.gz "$target_archive_file"
popd
Original file line number Diff line number Diff line change
@@ -50,6 +50,8 @@
     'PipelineTask',
 ]

+_kfp_dsl_import_error_msg = 'It looks like only `kfp-dsl` is installed. Please install the full KFP SDK using `pip install kfp`.'
+
 try:
     from typing import Annotated
 except ImportError:
Original file line number Diff line number Diff line change
@@ -19,7 +19,6 @@
 from kfp.dsl import pipeline_task
 from kfp.dsl import structures
 from kfp.dsl.types import type_utils
-from kfp.pipeline_spec import pipeline_spec_pb2


 class BaseComponent(abc.ABC):
@@ -103,13 +102,13 @@ def __call__(self, *args, **kwargs) -> pipeline_task.PipelineTask:
         )

     @property
-    def pipeline_spec(self) -> pipeline_spec_pb2.PipelineSpec:
+    def pipeline_spec(self) -> 'pipeline_spec_pb2.PipelineSpec':
         """Returns the pipeline spec of the component."""
         with BlockPipelineTaskRegistration():
             return self.component_spec.to_pipeline_spec()

     @property
-    def platform_spec(self) -> pipeline_spec_pb2.PlatformSpec:
+    def platform_spec(self) -> 'pipeline_spec_pb2.PlatformSpec':
         """Returns the PlatformSpec of the component.

         Useful when the component is a GraphComponent, else will be
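
Quoting the return annotations is what lets the top-level `pipeline_spec_pb2` import be dropped: a string annotation is stored as-is and is not evaluated when the function is defined. The snippet below is a small illustrative sketch, not code from this PR. The function is a stand-in for the annotated properties, and `pipeline_spec_pb2` is deliberately left unimported.

```python
import typing


def pipeline_spec() -> 'pipeline_spec_pb2.PipelineSpec':
    """Stand-in for the annotated property; the body is a placeholder."""
    return None


# Defining the function did not require `pipeline_spec_pb2` to exist:
# the annotation is kept as a plain string.
print(pipeline_spec.__annotations__['return'])

# Resolving the annotation is the step that needs the name, so it fails
# here because nothing called `pipeline_spec_pb2` has been imported.
try:
    typing.get_type_hints(pipeline_spec)
except NameError as e:
    print(f'unresolved annotation: {e}')
```

The trade-off is that tools which resolve annotations at runtime would still need the real module to be importable.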
Original file line number Diff line number Diff line change
@@ -20,7 +20,7 @@
 from typing import Callable, List, Mapping, Optional, Tuple, Type, Union
 import warnings

-import docstring_parser
+from kfp import dsl
 from kfp.dsl import container_component_artifact_channel
 from kfp.dsl import container_component_class
 from kfp.dsl import graph_component
@@ -124,9 +124,9 @@ def _get_packages_to_install_command(
     return ['sh', '-c', install_python_packages_script]


-def _get_default_kfp_package_path() -> str:
+def _get_kfp_dsl_requirement() -> str:
     import kfp
-    return f'kfp=={kfp.__version__}'
+    return f'kfp-dsl=={kfp.__version__}'


 def _get_function_source_definition(func: Callable) -> str:
@@ -175,6 +175,12 @@ def extract_component_interface(
     parameters = list(signature.parameters.values())

     original_docstring = inspect.getdoc(func)
+
+    try:
+        import docstring_parser
+    except ImportError as e:
+        raise ImportError(dsl._kfp_dsl_import_error_msg) from e
+
     parsed_docstring = docstring_parser.parse(original_docstring)

     inputs = {}
@@ -475,7 +481,7 @@ def create_component_from_func(

     if install_kfp_package and target_image is None:
         if kfp_package_path is None:
-            kfp_package_path = _get_default_kfp_package_path()
+            kfp_package_path = _get_kfp_dsl_requirement()
         packages_to_install.append(kfp_package_path)

     packages_to_install_command = _get_packages_to_install_command(
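
To make the effect of this hunk concrete, here is a standalone sketch of the branch on plain values. It is an illustration under stated assumptions rather than the SDK's implementation: `resolve_packages_to_install` is a hypothetical name, the `kfp_version` parameter stands in for `kfp.__version__`, and the version string and path in the usage lines are made up.

```python
from typing import List, Optional


def resolve_packages_to_install(
    packages_to_install: List[str],
    install_kfp_package: bool,
    target_image: Optional[str],
    kfp_package_path: Optional[str],
    kfp_version: str,
) -> List[str]:
    """Mirrors the branch in the hunk above on plain values."""
    packages = list(packages_to_install)  # avoid mutating the caller's list
    if install_kfp_package and target_image is None:
        if kfp_package_path is None:
            # After this PR the default pins `kfp-dsl`, not the full `kfp`.
            kfp_package_path = f'kfp-dsl=={kfp_version}'
        packages.append(kfp_package_path)
    return packages


# Default: the pinned `kfp-dsl` requirement is appended.
print(resolve_packages_to_install(['pandas'], True, None, None, '2.0.1'))

# An explicit path overrides the default requirement.
print(resolve_packages_to_install([], True, None, '/tmp/kfp-dsl', '2.0.1'))

# With a target image, or with installation disabled, nothing is appended.
print(resolve_packages_to_install(['pandas'], True, 'my-image', None, '2.0.1'))
```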
@@ -622,7 +628,7 @@ def create_graph_component_from_func(

 def get_pipeline_description(
     decorator_description: Union[str, None],
-    docstring: docstring_parser.Docstring,
+    docstring: 'docstring_parser.Docstring',
 ) -> Union[str, None]:
     """Obtains the correct pipeline description from the pipeline decorator's
     description argument and the parsed docstring.
File renamed without changes.
File renamed without changes.
Original file line number Diff line number Diff line change
@@ -17,12 +17,11 @@
 from typing import Callable, Optional
 import uuid

-from kfp.compiler import pipeline_spec_builder as builder
+from kfp import dsl
 from kfp.dsl import base_component
 from kfp.dsl import pipeline_channel
 from kfp.dsl import pipeline_context
 from kfp.dsl import structures
-from kfp.pipeline_spec import pipeline_spec_pb2


 class GraphComponent(base_component.BaseComponent):
@@ -65,6 +64,11 @@ def __init__(
         pipeline_group = dsl_pipeline.groups[0]
         pipeline_group.name = uuid.uuid4().hex

+        try:
+            from kfp.compiler import pipeline_spec_builder as builder
+        except ImportError as e:
+            raise ImportError(dsl._kfp_dsl_import_error_msg) from e
+
         pipeline_spec, platform_spec = builder.create_pipeline_spec(
             pipeline=dsl_pipeline,
             component_spec=self.component_spec,
@@ -83,7 +87,7 @@ def __init__(
         self.component_spec.platform_spec = platform_spec

     @property
-    def pipeline_spec(self) -> pipeline_spec_pb2.PipelineSpec:
+    def pipeline_spec(self) -> 'pipeline_spec_pb2.PipelineSpec':
         """Returns the pipeline spec of the component."""
         return self.component_spec.implementation.graph

Original file line number Diff line number Diff line change
@@ -20,13 +20,13 @@
 from typing import Any, Dict, List, Mapping, Optional, Union
 import warnings

+import kfp
 from kfp.dsl import constants
 from kfp.dsl import pipeline_channel
 from kfp.dsl import placeholders
 from kfp.dsl import structures
 from kfp.dsl import utils
 from kfp.dsl.types import type_utils
-from kfp.pipeline_spec import pipeline_spec_pb2

 _register_task_handler = lambda task: utils.maybe_rename_for_k8s(
     task.component_spec.name)
@@ -89,6 +89,7 @@ def __init__(
                 error_message_prefix=(
                     f'Incompatible argument passed to the input '
                     f'{input_name!r} of component {component_spec.name!r}: '),
+                raise_on_error=kfp.TYPE_CHECK,
             )

         self.component_spec = component_spec
@@ -149,7 +150,7 @@ def validate_placeholder_types(
         ])

     @property
-    def platform_spec(self) -> pipeline_spec_pb2.PlatformSpec:
+    def platform_spec(self) -> 'pipeline_spec_pb2.PlatformSpec':
         """PlatformSpec for all tasks in the pipeline as task.

         Only for use on tasks created from GraphComponents.