[SPARK-24434][K8S] pod template files #22146
Changes from 55 commits
docs/running-on-kubernetes.md

@@ -186,6 +186,22 @@ To use a secret through an environment variable use the following options to the
--conf spark.kubernetes.executor.secretKeyRef.ENV_NAME=name:key
```

## Pod Template
Kubernetes allows defining pods from [template files](https://kubernetes.io/docs/concepts/workloads/pods/pod-overview/#pod-templates).
Spark users can similarly use template files to define driver or executor pod configurations that Spark configurations do not support.
To do so, specify the Spark properties `spark.kubernetes.driver.podTemplateFile` and `spark.kubernetes.executor.podTemplateFile`
to point to local files accessible to the `spark-submit` process. To allow the driver pod to access the executor pod template
file, the file is automatically mounted onto a volume in the driver pod when it is created.
Spark does not perform any validation after unmarshalling these template files and relies on the Kubernetes API server for validation.

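As a rough sketch (the file contents, label keys, and values below are hypothetical, not taken from the Spark docs), a driver pod template might look like this:

```yaml
# driver-pod-template.yaml -- hypothetical example.
# Fields Spark is opinionated about (pod name, container image, resources, ...)
# are overwritten at submission time; see "Pod template properties" below.
apiVersion: v1
kind: Pod
metadata:
  labels:
    team: data-platform              # kept, and merged with spark.kubernetes.driver.label.*
spec:
  nodeSelector:
    disktype: ssd                    # kept, and merged with spark.kubernetes.node.selector.*
  containers:
    - name: spark-kubernetes-driver  # the first container is treated as the driver container
      image: placeholder             # replaced by spark.kubernetes.driver.container.image
```

Such a file would then be referenced with `--conf spark.kubernetes.driver.podTemplateFile=/path/to/driver-pod-template.yaml` on the `spark-submit` command line.
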
It is important to note that Spark is opinionated about certain pod configurations, so there are values in the
pod template that will always be overwritten by Spark. Therefore, users of this feature should note that specifying
the pod template file only lets Spark start from a template pod instead of an empty pod during the pod-building process.
For details, see the [full list](#pod-template-properties) of pod template values that will be overwritten by Spark.

Pod template files can also define multiple containers. In such cases, Spark will always assume that the first container in
the list is the driver or executor container.
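For illustration (the container names, sidecar image, and volume are hypothetical), a template whose first container becomes the executor container, followed by a logging sidecar, might look like:

```yaml
# executor-pod-template.yaml -- hypothetical example with a sidecar.
apiVersion: v1
kind: Pod
spec:
  containers:
    - name: spark-kubernetes-executor  # first in the list: used as the executor container
      image: placeholder               # replaced by spark.kubernetes.executor.container.image
    - name: log-forwarder              # extra container, left untouched by Spark
      image: fluent/fluent-bit:1.3
      volumeMounts:
        - name: spark-logs
          mountPath: /var/log/spark
  volumes:
    - name: spark-logs
      emptyDir: {}
```
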
> **Review discussion:**
> - Is it possible to use only extra containers and not Spark specific with this approach? Could we have a naming convention or a less error prone convention?
> - This PR originally had an extra spark conf for container names, but we have decided to use the first container in the template instead. Users can have an empty first container in the pod spec template if they only want to add containers without changing Spark's executor or driver container.
> - Having a first container empty looks redundant to me.
> - @skonto True, but this prevents the addition of a new spark conf for the container name.

## Using Kubernetes Volumes

Starting with Spark 2.4.0, users can mount the following types of Kubernetes [volumes](https://kubernetes.io/docs/concepts/storage/volumes/) into the driver and executor pods:

@@ -863,4 +879,168 @@ specific to Spark on Kubernetes.
to provide any kerberos credentials for launching a job.
</td>
</tr>
<tr>
<td><code>spark.kubernetes.driver.podTemplateFile</code></td>
<td>(none)</td>
<td>
Specify the local file that contains the driver [pod template](#pod-template). For example,
<code>spark.kubernetes.driver.podTemplateFile=/path/to/driver-pod-template.yaml</code>
</td>
</tr>
<tr>
<td><code>spark.kubernetes.executor.podTemplateFile</code></td>
<td>(none)</td>
<td>
Specify the local file that contains the executor [pod template](#pod-template). For example,
<code>spark.kubernetes.executor.podTemplateFile=/path/to/executor-pod-template.yaml</code>
</td>
</tr>
</table>

### Pod template properties

See the tables below for the full list of pod specifications that will be overwritten by Spark.

#### Pod Metadata

<table class="table">
<tr><th>Pod metadata key</th><th>Modified value</th><th>Description</th></tr>
<tr>
<td>name</td>
<td>Value of <code>spark.kubernetes.driver.pod.name</code></td>
<td>
The driver pod name will be overwritten with either the configured or default value of
<code>spark.kubernetes.driver.pod.name</code>. The executor pod names will be unaffected.
</td>
</tr>
<tr>
<td>namespace</td>
<td>Value of <code>spark.kubernetes.namespace</code></td>
<td>
Spark makes strong assumptions about the driver and executor namespaces. Both driver and executor namespaces will
be replaced by either the configured or default Spark conf value.
</td>
</tr>

> **Review discussion:**
> - if the spark conf value for namespace isn't set, can spark use the template setting, or will spark's conf default also override the template?
> - It will also be replaced with the spark conf's default value. I'll update this description to match that of
> - It should be the default namespace.

<tr>
<td>labels</td>
<td>Adds the labels from <code>spark.kubernetes.{driver,executor}.label.*</code></td>
<td>
Spark will add additional labels specified by the Spark configuration.
</td>
</tr>
<tr>
<td>annotations</td>
<td>Adds the annotations from <code>spark.kubernetes.{driver,executor}.annotation.*</code></td>
<td>
Spark will add additional annotations specified by the Spark configuration.
</td>
</tr>
</table>

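As a hypothetical illustration of the additive behavior above (the label keys and values are invented for this sketch): a template declaring `team: data-platform` submitted with `--conf spark.kubernetes.driver.label.env=prod` would yield driver pod metadata carrying both labels:

```yaml
# Sketch of the resulting driver pod metadata:
metadata:
  name: <value of spark.kubernetes.driver.pod.name>  # always overwritten
  namespace: <value of spark.kubernetes.namespace>   # always overwritten
  labels:
    team: data-platform  # kept from the template
    env: prod            # added from spark.kubernetes.driver.label.env
```
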
#### Pod Spec

<table class="table">
<tr><th>Pod spec key</th><th>Modified value</th><th>Description</th></tr>
<tr>
<td>imagePullSecrets</td>
<td>Adds image pull secrets from <code>spark.kubernetes.container.image.pullSecrets</code></td>
<td>
Additional pull secrets will be added from the Spark configuration to both driver and executor pods.
</td>
</tr>
<tr>
<td>nodeSelector</td>
<td>Adds node selectors from <code>spark.kubernetes.node.selector.*</code></td>
<td>
Additional node selectors will be added from the Spark configuration to both driver and executor pods.
</td>
</tr>
<tr>
<td>restartPolicy</td>
<td><code>"Never"</code></td>
<td>
Spark assumes that both drivers and executors never restart.
</td>
</tr>
<tr>
<td>serviceAccount</td>
<td>Value of <code>spark.kubernetes.authenticate.driver.serviceAccountName</code></td>
<td>
Spark will override <code>serviceAccount</code> with the value of the Spark configuration for only
driver pods, and only if the Spark configuration is specified. Executor pods will remain unaffected.
</td>
</tr>

> **Review discussion:**
> - similar Q to namespace: if no spark-conf is set, will spark's conf default override here as well?
> - I'm pretty sure this does not: https://github.com/apache/spark/blob/master/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/features/DriverKubernetesCredentialsFeatureStep.scala#L74. I'll update the docs to clarify!

<tr>
<td>serviceAccountName</td>
<td>Value of <code>spark.kubernetes.authenticate.driver.serviceAccountName</code></td>
<td>
Spark will override <code>serviceAccountName</code> with the value of the Spark configuration for only
driver pods, and only if the Spark configuration is specified. Executor pods will remain unaffected.
</td>
</tr>
<tr>
<td>volumes</td>
<td>Adds volumes from <code>spark.kubernetes.{driver,executor}.volumes.[VolumeType].[VolumeName].mount.path</code></td>
<td>
Spark will add volumes as specified by the Spark configuration, as well as additional volumes necessary for passing
Spark conf and pod template files.
</td>
</tr>
</table>

> **Review discussion:**
> - I think we should also document the volume names (and optionally mount points) for all the "internal" spark volumes (pod download, config map, etc).
> - +1 As I've commented elsewhere it is easily possible to create invalid specs with this feature because Spark will create certain volumes, config maps etc with known name patterns that users need to avoid.
> - I have done exactly that even w/o this feature - I was trying to use the spark operator to mount a config map and accidentally hit upon the spark config volume on my first try :) (see spark/resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Constants.scala, line 64 at ba84bcb)
> - +1. As I mentioned before we need to know the implications of whatever property we expose.
> - One way we can avoid conflicting volumes entirely is by randomizing the name of the volumes added by features, e.g. appending some UUID or at least some large integer. I think keeping running documentation on all volumes we add from features is too much overhead. If we run into these conflicts often then we can do this, but I think it's fine not to block merging on that. Either way though I think again, the validation piece can be done separately from this PR. I wouldn't consider that documentation as blocking on this merging. Thoughts?

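Following up on the naming concern in the discussion above, here is a hypothetical sketch of a user-declared volume in a template; the volume and ConfigMap names are invented, and the internal names to avoid (`podspec-volume`, `podspec-configmap`) come from the Constants.scala change later in this PR:

```yaml
# Sketch: a user-supplied volume in the template survives alongside the
# volumes Spark injects for its conf and for the pod template itself.
spec:
  volumes:
    - name: my-ref-data            # user-chosen; must not clash with Spark's
      configMap:                   # internal names such as "podspec-volume"
        name: ref-data-configmap   # or "podspec-configmap"
```
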
#### Container Spec

The following settings affect the driver and executor containers. All other containers in the pod spec will be unaffected.

<table class="table">
<tr><th>Container spec key</th><th>Modified value</th><th>Description</th></tr>
<tr>
<td>env</td>
<td>Adds env variables from <code>spark.kubernetes.driverEnv.[EnvironmentVariableName]</code></td>
<td>
Spark will add driver env variables from <code>spark.kubernetes.driverEnv.[EnvironmentVariableName]</code>, and
executor env variables from <code>spark.executorEnv.[EnvironmentVariableName]</code>.
</td>
</tr>
<tr>
<td>image</td>
<td>Value of <code>spark.kubernetes.{driver,executor}.container.image</code></td>
<td>
The image will be defined by the Spark configuration.
</td>
</tr>
<tr>
<td>imagePullPolicy</td>
<td>Value of <code>spark.kubernetes.container.image.pullPolicy</code></td>
<td>
Spark will override the pull policy for both driver and executors.
</td>
</tr>
<tr>
<td>name</td>
<td>See description</td>
<td>
The container name will be assigned by Spark ("spark-kubernetes-driver" for the driver container, and
"spark-kubernetes-executor" for each executor container) if not defined by the pod template. If the container is defined by the
template, the template's name will be used.
</td>
</tr>
<tr>
<td>resources</td>
<td>See description</td>
<td>
The CPU limits are set by <code>spark.kubernetes.{driver,executor}.limit.cores</code>. The CPU request is set by
<code>spark.{driver,executor}.cores</code>. The memory request and limit are set by summing the values of
<code>spark.{driver,executor}.memory</code> and <code>spark.{driver,executor}.memoryOverhead</code>.
</td>
</tr>
<tr>
<td>volumeMounts</td>
<td>Adds volume mounts from <code>spark.kubernetes.{driver,executor}.volumes.[VolumeType].[VolumeName].mount.{path,readOnly}</code></td>
<td>
Spark will add volume mounts as specified by the Spark configuration, as well as additional volume mounts necessary for passing
Spark conf and pod template files.
</td>
</tr>
</table>

> **Review discussion:**
> - Just checking, is it add or replace? I'm hoping one could use this to mount unsupported volume types, like config maps or secrets, in addition to those managed by spark.
> - @aditanase that was the purpose of the design doc: https://docs.google.com/document/d/1pcyH5f610X2jyJW9WbWHnj8jktQPLlbbmmUwdeK4fJk: capture what behavior we want for each case.
> - Volumes should be additive. If names are duplicated though I'd expect K8s to throw an error.

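As a worked sketch of the resources arithmetic in the table above: assuming `spark.driver.memory=2g` with no explicit `spark.driver.memoryOverhead`, and assuming the default overhead of max(10% of memory, 384MiB) (an assumption based on `MEMORY_OVERHEAD_MIN_MIB` in the Constants.scala change below, not stated in the table itself), the driver container would get roughly:

```yaml
# Hypothetical resulting driver container resources:
#   memory request/limit = 2048Mi + max(0.1 * 2048Mi, 384Mi) = 2432Mi
resources:
  requests:
    cpu: "1"         # from spark.driver.cores (default 1)
    memory: 2432Mi
  limits:
    memory: 2432Mi   # a cpu limit appears only if spark.kubernetes.driver.limit.cores is set
```
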
resource-managers/kubernetes/core/src/main/scala/org/apache/spark/deploy/k8s/Constants.scala

@@ -76,9 +76,17 @@ private[spark] object Constants {
   val ENV_R_PRIMARY = "R_PRIMARY"
   val ENV_R_ARGS = "R_APP_ARGS"
 
+  // Pod spec templates
+  val EXECUTOR_POD_SPEC_TEMPLATE_FILE_NAME = "pod-spec-template.yml"
+  val EXECUTOR_POD_SPEC_TEMPLATE_MOUNTPATH = "/opt/spark/pod-template"
+  val POD_TEMPLATE_VOLUME = "podspec-volume"
+  val POD_TEMPLATE_CONFIGMAP = "podspec-configmap"
+  val POD_TEMPLATE_KEY = "podspec-configmap-key"
+
   // Miscellaneous
   val KUBERNETES_MASTER_INTERNAL_URL = "https://kubernetes.default.svc"
-  val DRIVER_CONTAINER_NAME = "spark-kubernetes-driver"
+  val DEFAULT_DRIVER_CONTAINER_NAME = "spark-kubernetes-driver"
+  val DEFAULT_EXECUTOR_CONTAINER_NAME = "spark-kubernetes-executor"
   val MEMORY_OVERHEAD_MIN_MIB = 384L
 
   // Hadoop Configuration

> **Review discussion** (on <code>POD_TEMPLATE_VOLUME</code>):
> - nit: s/podspec-volume/pod-template-volume
> - Ping here

> **Review discussion:**
> - Have you considered a single config map created at submission time, from which both driver and executors pull their appropriate templates?
> - Don't think we considered this, but, an interesting proposal. I think that can be a follow up feature if requested.