Update sagemaker docs #701

Merged
merged 6 commits into from
Dec 11, 2023
22 changes: 17 additions & 5 deletions docs/pipeline.md
Original file line number Diff line number Diff line change
Expand Up @@ -144,15 +144,20 @@ pipeline.
```bash
fondant run local <pipeline_ref>
```
=== "Vertex Runner"
=== "Vertex"

```bash
fondant run vertex <pipeline_ref> \
--project-id $PROJECT_ID \
--project-region $PROJECT_REGION \
--service-account $SERVICE_ACCOUNT
```

=== "SageMaker"

```bash
fondant run sagemaker <pipeline_ref> \
--role-arn <sagemaker_role_arn>
```
=== "Kubeflow"

```bash
Expand All @@ -167,24 +172,31 @@ pipeline.
from fondant.pipeline.runner import DockerRunner

runner = DockerRunner()
runner.run(input_spec=<pipeline_ref>)
runner.run(input=<pipeline_ref>)
```
=== "Vertex"

```python
from fondant.pipeline.runner import VertexRunner

runner = VertexRunner()
runner.run(input_spec=<pipeline_ref>)
runner.run(input=<pipeline_ref>)
```
=== "SageMaker"

```python
from fondant.pipeline.runner import SageMakerRunner

runner = SageMakerRunner()
runner.run(input=<pipeline_ref>, pipeline_name=<pipeline-name>, role_arn=<sagemaker_role_arn>)
```
=== "KubeFlow"

```python
from fondant.pipeline.runner import KubeFlowRunner

runner = KubeFlowRunner(host=<kubeflow_host>)
runner.run(input_spec=<pipeline_ref>)
runner.run(input=<pipeline_ref>)
```

The pipeline ref can be a reference to the file containing your pipeline, a variable
2 changes: 1 addition & 1 deletion docs/runners/kfp.md
Expand Up @@ -58,7 +58,7 @@ needs to
have an available GPU.

```python
from fondant.pipeline.pipeline import Resources
from fondant.pipeline import Resources

dataset = dataset.apply(
"...",
2 changes: 1 addition & 1 deletion docs/runners/local.md
Expand Up @@ -134,7 +134,7 @@ The local runner uses the computation resources (RAM, CPU) of the host machine.
it needs to be assigned explicitly.

```python
from fondant.pipeline.pipeline import Resources
from fondant.pipeline import Resources

dataset = dataset.apply(
"...",
111 changes: 111 additions & 0 deletions docs/runners/sagemaker.md
@@ -0,0 +1,111 @@
### SageMaker Runner

Leverage [AWS SageMaker](https://aws.amazon.com/sagemaker/) to run your Fondant pipelines.

This makes it easy to scale up your pipelines in a serverless manner without worrying about infrastructure
deployment.

The Fondant SageMaker runner will compile your pipeline to a SageMaker pipeline spec and submit it to SageMaker.


!!! note "IMPORTANT"

Using the SageMaker runner will create a [pull through cache rule](https://docs.aws.amazon.com/AmazonECR/latest/userguide/pull-through-cache.html) on the private ECR registry of your account. This is required so that SageMaker can access the public [reusable images](../components/hub.md) used by Fondant components.

### Installing the SageMaker runner

Make sure to install Fondant with the SageMaker runner extra.

```bash
pip install fondant[sagemaker]
```

### Prerequisites
- You will need a SageMaker domain and user with the correct permissions. You can follow the instructions [here](https://docs.aws.amazon.com/sagemaker/latest/dg/onboard-quick-start.html) to set this up. Make sure to note down the role ARN (`arn:aws:iam::<account_id>:role/service-role/AmazonSageMaker-ExecutionRole-<creation_timestamp>`) of the user you are using, since you will need it to run the pipeline.
- You will need to have an AWS account and have the [AWS CLI](https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-getting-started.html) installed and configured.
- Fondant on SageMaker uses an S3 bucket to store the pipeline artifacts (manifests and data). You will need to create an S3 bucket that SageMaker can use for this. You can create a bucket using the AWS CLI:

```bash
aws s3 mb s3://<bucket-name>
```
!!! note "IMPORTANT"

Regarding [the bucket and SageMaker permissions](https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-ex-bucket.html):

- If you use the term 'sagemaker' in the name of the bucket, SageMaker will automatically have the correct permissions to access the bucket.
- If you use any other name, or an existing bucket, you will need to add a policy to the role that SageMaker uses so that it can access the bucket.


You can then set this bucket as the `base_path` of your pipeline with the syntax: `s3://<bucket_name>/<path>`.
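
The SageMaker compiler changed in this PR rejects a `base_path` that does not start with `s3://` or that ends with a trailing `/`. The following is a minimal standalone sketch of that check, modelled on `validate_base_path` in `src/fondant/pipeline/compiler.py`; the bucket name and path in the usage line are hypothetical.

```python
def validate_base_path(base_path: str) -> None:
    """Raise ValueError unless base_path looks like s3://<bucket_name>/<path>."""
    # Only S3 locations are accepted as a base path on SageMaker.
    if not base_path.startswith("s3://"):
        msg = "base_path must be a valid s3 path, starting with s3://"
        raise ValueError(msg)

    # A trailing slash is rejected.
    if base_path.endswith("/"):
        msg = "base_path must not end with a '/'"
        raise ValueError(msg)


# Hypothetical bucket and path; passes without raising.
validate_base_path("s3://my-sagemaker-bucket/fondant-artifacts")
```

Running a check like this before compiling can surface a malformed path early, before any call to AWS is made.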

### Running a pipeline with SageMaker


Since compiling a SageMaker spec requires access to the AWS SageMaker API, you will need to be logged in to
AWS with a role that has all the required permissions to launch a SageMaker pipeline.


=== "Console"

```bash
fondant run sagemaker <pipeline_ref> \
--role-arn $SAGEMAKER_ROLE_ARN
```


=== "Python"

```python
from fondant.pipeline.runner import SageMakerRunner

runner = SageMakerRunner()
runner.run(
input=<path_to_pipeline>,
role_arn=<role_arn>,
pipeline_name=<pipeline_name>
)
```


Once your pipeline is running, you can monitor it using [SageMaker Studio](https://aws.amazon.com/sagemaker/studio/).



#### Using custom Fondant components on SageMaker

SageMaker only supports images hosted on a private ECR registry. If you want to use custom Fondant components on SageMaker you will need to build and push them to your private ECR registry first. You can do this using the `fondant build` command.

But first you need to log in to Docker with valid ECR credentials (more info [here](https://docs.aws.amazon.com/AmazonECR/latest/userguide/docker-push-ecr-image.html)):
```bash
aws ecr get-login-password --region <region> | docker login --username AWS --password-stdin <aws_account_id>.dkr.ecr.<region>.amazonaws.com
```

You will also need to create a repository for your component (a one-time operation):
```bash
aws ecr create-repository --region <region> --repository-name <component_name>
```

Now you can use the `fondant build` [command](../components/publishing_components.md) (which uses Docker under the hood) to build and push your custom components to your private ECR registry:
```bash
fondant build <component dir> -t <aws_account_id>.dkr.ecr.<region>.amazonaws.com/<component_name>:<tag>
```
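
The commands above all use the same private ECR image reference: `<aws_account_id>.dkr.ecr.<region>.amazonaws.com/<component_name>:<tag>`. As a small illustration, the helper below assembles that string from its parts. The function name `ecr_image_uri` and the example values are hypothetical and not part of Fondant or the AWS SDK.

```python
def ecr_image_uri(account_id: str, region: str, component_name: str, tag: str) -> str:
    """Build a private ECR image reference in the format used by the commands above.

    Hypothetical helper for illustration only.
    """
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{component_name}:{tag}"


# Example values are placeholders, not a real account or repository.
print(ecr_image_uri("123456789012", "eu-west-1", "my_component", "latest"))
# prints 123456789012.dkr.ecr.eu-west-1.amazonaws.com/my_component:latest
```

The resulting string is what you would pass to `fondant build ... -t`.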


#### Assigning custom resources to the pipeline

The SageMaker runner supports assigning a specific `instance_type` to each component. This can be done by using the resources block when defining a component.

If not specified, the default `instance_type` is `ml.t3.medium`. The `instance_type` must be a valid SageMaker instance type; you can find more info [here](https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-available-instance-types.html).

```python
from fondant.pipeline import Resources

images = raw_data.apply(
"download_images",
arguments={
"input_partition_rows": 100,
"resize_mode": "no",
},
resources=Resources(instance_type="ml.t3.xlarge"),
)
```
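
The fallback to `ml.t3.medium` can be sketched as follows. This is a simplified, standalone approximation of the `_set_configuration` logic added to the SageMaker compiler in this PR. `ExampleResources` is a hypothetical stand-in for `fondant.pipeline.Resources`, and `resolve_instance_type` is an illustrative name rather than a Fondant function.

```python
import logging
import typing as t
from dataclasses import asdict, dataclass

logger = logging.getLogger(__name__)

DEFAULT_INSTANCE_TYPE = "ml.t3.medium"


@dataclass
class ExampleResources:
    """Hypothetical stand-in for fondant.pipeline.Resources."""

    instance_type: t.Optional[str] = None
    memory_limit: t.Optional[str] = None


def resolve_instance_type(resources: ExampleResources) -> str:
    """Return the configured instance type, or the default if none was set."""
    resources_dict = asdict(resources)
    instance_type = resources_dict.pop("instance_type")

    if not instance_type:
        logger.warning(
            "No instance type provided, using default `%s`.",
            DEFAULT_INSTANCE_TYPE,
        )
        instance_type = DEFAULT_INSTANCE_TYPE

    return instance_type


print(resolve_instance_type(ExampleResources()))  # prints ml.t3.medium
print(resolve_instance_type(ExampleResources(instance_type="ml.t3.xlarge")))  # prints ml.t3.xlarge
```

In the PR itself, the remaining resource fields are passed to `log_unused_configurations` instead of being applied; the sketch above leaves that step out.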
1 change: 0 additions & 1 deletion docs/runners/sagmaker.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/runners/vertex.md
Expand Up @@ -71,7 +71,7 @@ for a list of available GPU resources. Make sure to check that the chosen GPU is
region where the pipeline will be run.

```python
from fondant.pipeline.pipeline import Resources
from fondant.pipeline import Resources

dataset = dataset.apply(
"...",
1 change: 1 addition & 0 deletions mkdocs.yml
Expand Up @@ -55,6 +55,7 @@ nav:
- Local: runners/local.md
- Vertex: runners/vertex.md
- Kubeflow: runners/kfp.md
- SageMaker: runners/sagemaker.md
- Explorer: data_explorer.md
- Advanced:
- Architecture: architecture.md
15 changes: 0 additions & 15 deletions src/fondant/cli.py
Expand Up @@ -406,13 +406,6 @@ def register_compile(parent_parser):
help="Output path of compiled pipeline",
default=".fondant/sagemaker_pipeline.json",
)
sagemaker_parser.add_argument(
"--instance-type",
help="""the instance type to use for the processing steps
(see: https://aws.amazon.com/ec2/instance-types/ for options).""",
default="ml.m5.large",
)

sagemaker_parser.add_argument(
"--role-arn",
help="""the Amazon Resource Name role to use for the processing steps""",
Expand Down Expand Up @@ -471,7 +464,6 @@ def compile_sagemaker(args):
compiler.compile(
pipeline=pipeline,
output_path=args.output_path,
instance_type=args.instance_type,
role_arn=args.role_arn,
)

Expand Down Expand Up @@ -631,12 +623,6 @@ def register_run(parent_parser):
help="""the Amazon Resource Name role to use for the processing steps""",
default=None,
)
sagemaker_parser.add_argument(
"--instance-type",
help="""the instance type to use for the processing steps
(see: https://aws.amazon.com/ec2/instance-types/ for options).""",
default="ml.m5.large",
)

local_parser.set_defaults(func=run_local)
kubeflow_parser.set_defaults(func=run_kfp)
Expand Down Expand Up @@ -714,7 +700,6 @@ def run_sagemaker(args):
input=ref,
pipeline_name=args.pipeline_name,
role_arn=args.role_arn,
instance_type=args.instance_type,
)


38 changes: 28 additions & 10 deletions src/fondant/pipeline/compiler.py
Expand Up @@ -630,13 +630,11 @@ def _patch_uri(self, og_uri: str) -> str:
return uri

def validate_base_path(self, base_path: str) -> None:
file_prefix, storage_path = base_path.split("://")

if file_prefix != "s3":
if not base_path.startswith("s3://"):
msg = "base_path must be a valid s3 path, starting with s3://"
raise ValueError(msg)

if storage_path.endswith("/"):
if base_path.endswith("/"):
msg = "base_path must not end with a '/'"
raise ValueError(msg)

Expand All @@ -645,7 +643,6 @@ def compile(
pipeline: Pipeline,
output_path: str,
*,
instance_type: str = "ml.t3.medium",
role_arn: t.Optional[str] = None,
) -> None:
"""Compile a fondant pipeline to sagemaker pipeline spec and save it
Expand All @@ -654,8 +651,6 @@ def compile(
Args:
pipeline: the pipeline to compile
output_path: the path where to save the sagemaker pipeline spec.
instance_type: the instance type to use for the processing steps
(see: https://aws.amazon.com/ec2/instance-types/ for options).
role_arn: the Amazon Resource Name role to use for the processing steps,
if none provided the `sagemaker.get_execution_role()` role will be used.
"""
Expand Down Expand Up @@ -706,13 +701,15 @@ def compile(
# https://docs.aws.amazon.com/sagemaker/latest/dg/automatic-model-tuning-ex-role.html
role_arn = self.sagemaker.get_execution_role()

resources_dict = self._set_configuration(component_op)

processor = self.sagemaker.processing.ScriptProcessor(
image_uri=self._patch_uri(component_op.component_spec.image),
command=["bash"],
instance_type=instance_type,
instance_count=1,
base_job_name=component_name,
role=role_arn,
**resources_dict,
)

step = self.sagemaker.workflow.steps.ProcessingStep(
Expand All @@ -735,8 +732,29 @@ def compile(
indent=4,
)

def _set_configuration(self, *args, **kwargs) -> None:
raise NotImplementedError
def _set_configuration(
self,
fondant_component_operation,
*args,
**kwargs,
):
# Used configurations
resources_dict = fondant_component_operation.resources.to_dict()

instance_type = resources_dict.pop("instance_type")

if not instance_type:
logger.warning(
"""No instance type provided, using default `ml.t3.medium`. See:
https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-available-instance-types.html
for options""",
)
instance_type = "ml.t3.medium"

# Unused configurations
self.log_unused_configurations(**resources_dict)

return {"instance_type": instance_type}

def generate_component_script(
self,
2 changes: 2 additions & 0 deletions src/fondant/pipeline/pipeline.py
Expand Up @@ -56,6 +56,7 @@ class Resources:
memory_limit: t.Optional[str] = None
node_pool_label: t.Optional[str] = None
node_pool_name: t.Optional[str] = None
instance_type: t.Optional[str] = None

"""
Class representing the resources to assign to a Fondant Component operation in a Fondant
Expand All @@ -81,6 +82,7 @@ class Resources:
number followed by one of “E”, “P”, “T”, “G”, “M”, “K”.
memory_limit: the maximum memory that can be used by the component. The value can be a
number or a number followed by one of “E”, “P”, “T”, “G”, “M”, “K”.
instance_type: the instance type of the component.
"""

def __post_init__(self):
3 changes: 0 additions & 3 deletions src/fondant/pipeline/runner.py
Expand Up @@ -243,8 +243,6 @@ def run(
input: t.Union[Pipeline, str],
pipeline_name: str,
role_arn: str,
*,
instance_type: str = "ml.m5.xlarge",
):
"""Run a pipeline, either from a compiled sagemaker spec or from a fondant pipeline.

Expand All @@ -266,7 +264,6 @@ def run(
compiler.compile(
input,
output_path=output_path,
instance_type=instance_type,
role_arn=role_arn,
)
self._run(output_path, pipeline_name=pipeline_name, role_arn=role_arn)
1 change: 0 additions & 1 deletion tests/pipeline/test_runner.py
Expand Up @@ -236,7 +236,6 @@ def compile(
pipeline,
output_path,
*,
instance_type,
role_arn,
) -> None:
with open(output_path, "w") as f:
2 changes: 0 additions & 2 deletions tests/test_cli.py
Expand Up @@ -252,13 +252,11 @@ def test_sagemaker_compile(tmp_path_factory):
sagemaker=True,
output_path=str(fn / "sagemaker_pipeline.json"),
role_arn="some_role",
instance_type="some_instance_type",
)
compile_sagemaker(args)
mock_compiler.assert_called_once_with(
pipeline=TEST_PIPELINE,
output_path=str(fn / "sagemaker_pipeline.json"),
instance_type="some_instance_type",
role_arn="some_role",
)
