feat: adds k8s config options to Bytewax materialization engine (#3518)
feat: adds k8s config options

Signed-off-by: adamschmidt <aschmidt1978@gmail.com>
adamschmidt authored Mar 4, 2023
1 parent ec08a55 commit 1883f55
Showing 2 changed files with 41 additions and 2 deletions.
25 changes: 24 additions & 1 deletion docs/reference/batch-materialization/bytewax.md
@@ -23,6 +23,8 @@ To configure secrets, first create them using `kubectl`:
kubectl create secret generic -n bytewax aws-credentials --from-literal=aws-access-key-id='<access key id>' --from-literal=aws-secret-access-key='<secret access key>'
```

If your Docker registry requires authentication to push or pull images, you can use the same approach to store your registry access credentials and reference them when running the materialization engine.

Then configure them in the batch_engine section of `feature_store.yaml`:

``` yaml
@@ -40,6 +42,8 @@ batch_engine:
        secretKeyRef:
          name: aws-credentials
          key: aws-secret-access-key
  image_pull_secrets:
    - docker-repository-access-secret
```
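
A registry credential secret such as `docker-repository-access-secret` is normally a Secret of type `kubernetes.io/dockerconfigjson`, which `kubectl create secret docker-registry` can create for you. As a minimal sketch of what that secret contains, the Python below builds an equivalent manifest by hand. The helper name `docker_registry_secret` and the registry host `registry.example.com` are assumptions made for illustration and are not part of Feast or this commit.

```python
import base64
import json


def docker_registry_secret(name, namespace, server, username, password):
    """Build a Secret manifest of type kubernetes.io/dockerconfigjson."""
    # The "auth" field is base64("username:password").
    auth = base64.b64encode(f"{username}:{password}".encode()).decode()
    docker_config = {
        "auths": {server: {"username": username, "password": password, "auth": auth}}
    }
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name, "namespace": namespace},
        "type": "kubernetes.io/dockerconfigjson",
        # Values under "data" must be base64-encoded.
        "data": {
            ".dockerconfigjson": base64.b64encode(
                json.dumps(docker_config).encode()
            ).decode()
        },
    }


if __name__ == "__main__":
    manifest = docker_registry_secret(
        name="docker-repository-access-secret",
        namespace="bytewax",
        server="registry.example.com",  # hypothetical registry host
        username="<username>",
        password="<password>",
    )
    # The JSON output can be piped to `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
```

The secret has to live in the same namespace as the materialization job (here `bytewax`) for the pods to be able to use it.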
#### Configuration
@@ -51,9 +55,28 @@ batch_engine:
  type: bytewax
  namespace: bytewax
  image: bytewax/bytewax-feast:latest
  image_pull_secrets:
    - my-container-secret
  service_account_name: my-k8s-service-account
  annotations:
    # example annotation you might include if running on AWS EKS
    iam.amazonaws.com/role: arn:aws:iam::<account number>:role/MyBytewaxPlatformRole
  resources:
    limits:
      cpu: 1000m
      memory: 2048Mi
    requests:
      cpu: 500m
      memory: 1024Mi
```

**Notes:**

* The `namespace` configuration directive specifies the Kubernetes [namespace](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/) in which jobs, services and configuration maps will be created.
* The `image_pull_secrets` configuration directive lists the pre-configured secrets to use when pulling the container image from your registry.
* The `service_account_name` configuration directive specifies which Kubernetes service account the job runs under.
* The `annotations` configuration directive adds Kubernetes annotations to the pods created for the job. This is particularly useful, for example, for IAM roles that grant the running pod access to cloud platform resources.
* The `resources` configuration directive sets the standard Kubernetes [resource requests and limits](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) for the job containers to use when materializing data.
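
Kubernetes rejects a container whose request exceeds its limit, so it can help to sanity-check a `resources` block before submitting a job. The following is a minimal sketch that parses a subset of Kubernetes quantity suffixes (`m` for CPU; `Ki`, `Mi`, `Gi`, `Ti`, `k`, `M`, `G`, `T` for memory). The function names `parse_cpu`, `parse_memory` and `check_resources` are hypothetical helpers, not part of Feast or Kubernetes.

```python
from decimal import Decimal

# Binary and decimal suffixes used by Kubernetes memory quantities (subset).
_MEMORY_SUFFIXES = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_cpu(quantity):
    """Return CPU cores as a Decimal; '500m' means 0.5 cores."""
    q = str(quantity)
    if q.endswith("m"):
        return Decimal(q[:-1]) / 1000
    return Decimal(q)


def parse_memory(quantity):
    """Return bytes as a Decimal; '1024Mi' means 1024 * 2**20 bytes."""
    q = str(quantity)
    # Check two-character suffixes before one-character ones.
    for suffix in sorted(_MEMORY_SUFFIXES, key=len, reverse=True):
        if q.endswith(suffix):
            return Decimal(q[: -len(suffix)]) * _MEMORY_SUFFIXES[suffix]
    return Decimal(q)


def check_resources(resources):
    """Raise ValueError if any request exceeds its limit."""
    parsers = {"cpu": parse_cpu, "memory": parse_memory}
    requests = resources.get("requests", {})
    limits = resources.get("limits", {})
    for key, parse in parsers.items():
        if key in requests and key in limits:
            if parse(requests[key]) > parse(limits[key]):
                raise ValueError(f"{key} request exceeds limit")


if __name__ == "__main__":
    # Values taken from the example configuration above.
    check_resources({
        "limits": {"cpu": "1000m", "memory": "2048Mi"},
        "requests": {"cpu": "500m", "memory": "1024Mi"},
    })
    print("resources look consistent")
```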

#### Building a custom Bytewax Docker image

@@ -46,6 +46,17 @@ class BytewaxMaterializationEngineConfig(FeastConfigBaseModel):
    These environment variables can be used to reference Kubernetes secrets.
    """

    image_pull_secrets: List[str] = []
    """ (optional) The secrets to use when pulling the image to run for the materialization job """

    resources: dict = {}
    """ (optional) The resource requests and limits for the materialization containers """

    service_account_name: StrictStr = ""
    """ (optional) The service account name to use when running the job """

    annotations: dict = {}
    """ (optional) Annotations to apply to the job container. Useful for linking the service account to IAM roles, operational metadata, etc """

class BytewaxMaterializationEngine(BatchMaterializationEngine):
    def __init__(
@@ -248,9 +259,14 @@ def _create_job_definition(self, job_id, namespace, pods, env):
"parallelism": pods,
"completionMode": "Indexed",
"template": {
    "metadata": {
        "annotations": self.batch_engine_config.annotations,
    },
    "spec": {
        "restartPolicy": "Never",
        "subdomain": f"dataflow-{job_id}",
        "imagePullSecrets": self.batch_engine_config.image_pull_secrets,
        "serviceAccountName": self.batch_engine_config.service_account_name,
        "initContainers": [
            {
                "env": [
@@ -300,7 +316,7 @@ def _create_job_definition(self, job_id, namespace, pods, env):
        "protocol": "TCP",
    }
],
"resources": self.batch_engine_config.resources,
"securityContext": {
    "allowPrivilegeEscalation": False,
    "capabilities": {
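
In the Kubernetes Pod spec, each `imagePullSecrets` entry is an object with a `name` field rather than a bare string. Because `image_pull_secrets` is declared as `List[str]`, a conversion along the following lines may be needed before the values reach the API. This is a sketch under that assumption, not the code from this commit, and `pod_template_overrides` is a hypothetical helper name.

```python
def pod_template_overrides(image_pull_secrets, service_account_name, annotations, resources):
    """Return (metadata, spec, container_resources) fragments for a Job pod template."""
    metadata = {"annotations": dict(annotations)}
    spec = {
        # The Pod API models each entry as an object with a "name" field.
        "imagePullSecrets": [{"name": name} for name in image_pull_secrets],
    }
    # Leave serviceAccountName out when unset so the namespace default applies.
    if service_account_name:
        spec["serviceAccountName"] = service_account_name
    return metadata, spec, dict(resources)


if __name__ == "__main__":
    metadata, spec, resources = pod_template_overrides(
        image_pull_secrets=["my-container-secret"],
        service_account_name="my-k8s-service-account",
        annotations={},
        resources={"requests": {"cpu": "500m", "memory": "1024Mi"}},
    )
    print(metadata, spec, resources)
```

Copying the `annotations` and `resources` dictionaries keeps later changes to the manifest from mutating the configuration object.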
