Commit
[SPARK-25815][K8S] Support kerberos in client mode, keytab-based token renewal.

This change hooks up the k8s backend to the updated HadoopDelegationTokenManager, so that delegation tokens are also available in client mode, and keytab-based token renewal is enabled.

The change re-works the k8s feature steps related to kerberos so that the driver does all the credential management and provides all the needed information to executors, so nothing needs to be added to executor pods. This also makes cluster mode behave a lot more similarly to client mode, since no driver-related config steps are run in the latter case.

The main two things that don't need to happen in executors anymore are:

- adding the Hadoop config to the executor pods: this is not needed since the Spark driver will serialize the Hadoop config and send it to executors when running tasks.
- mounting the kerberos config file in the executor pods: this is not needed once you remove the above. The Hadoop conf sent by the driver with the tasks is already resolved (i.e. has all the kerberos names properly defined), so executors do not need access to the kerberos realm information anymore.

The change also avoids creating delegation tokens unnecessarily. This means that they'll only be created if a secret with tokens was not provided, and if a keytab is not provided. In either of those cases, the driver code will handle delegation tokens: in cluster mode by creating a secret and stashing them, in client mode by using existing mechanisms to send DTs to executors.

One last feature: the change also allows defining a keytab with a "local:" URI. This is supported in client mode (although that's the same as not saying "local:") and in k8s cluster mode. This allows the keytab to be mounted onto the image from a pre-existing secret, for example.

Finally, the new code always sets SPARK_USER in the driver and executor pods. This is in line with how other resource managers behave: the submitting user reflects which user will access Hadoop services in the app. (With kerberos, that's overridden by the logged-in user.) That user is unrelated to the OS user the app is running as inside the containers.

Tested:
- client and cluster mode with kinit
- cluster mode with keytab
- cluster mode with local: keytab
- YARN cluster with keytab (to make sure it isn't broken)

Closes #22911 from vanzin/SPARK-25815.

Authored-by: Marcelo Vanzin <vanzin@cloudera.com>
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
Marcelo Vanzin committed Dec 18, 2018
1 parent 428eb2a
commit 4b3fe3a
Showing 25 changed files with 649 additions and 621 deletions.
124 changes: 124 additions & 0 deletions
...ore/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfDriverFeatureStep.scala
@@ -0,0 +1,124 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *    http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.spark.deploy.k8s.features

import java.io.File
import java.nio.charset.StandardCharsets

import scala.collection.JavaConverters._

import com.google.common.io.Files
import io.fabric8.kubernetes.api.model._

import org.apache.spark.deploy.k8s.{KubernetesConf, KubernetesUtils, SparkPod}
import org.apache.spark.deploy.k8s.Config._
import org.apache.spark.deploy.k8s.Constants._
/**
 * Mounts the Hadoop configuration - either a pre-defined config map, or a local configuration
 * directory - on the driver pod.
 */
private[spark] class HadoopConfDriverFeatureStep(conf: KubernetesConf)
  extends KubernetesFeatureConfigStep {

  private val confDir = Option(conf.sparkConf.getenv(ENV_HADOOP_CONF_DIR))
  private val existingConfMap = conf.get(KUBERNETES_HADOOP_CONF_CONFIG_MAP)

  KubernetesUtils.requireNandDefined(
    confDir,
    existingConfMap,
    "Do not specify both the `HADOOP_CONF_DIR` in your ENV and the ConfigMap " +
    "as the creation of an additional ConfigMap, when one is already specified is extraneous")

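  // Files under the local Hadoop conf dir, if one was set; these feed both the
  // generated ConfigMap's contents and the volume's key-to-path mappings below.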
  private lazy val confFiles: Seq[File] = {
    val dir = new File(confDir.get)
    if (dir.isDirectory) {
      dir.listFiles.filter(_.isFile).toSeq
    } else {
      Nil
    }
  }

  private def newConfigMapName: String = s"${conf.resourceNamePrefix}-hadoop-config"

  private def hasHadoopConf: Boolean = confDir.isDefined || existingConfMap.isDefined

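  // Mount the Hadoop conf volume on the driver pod and point HADOOP_CONF_DIR at it;
  // this is a no-op when neither a conf dir nor an existing ConfigMap was provided.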
  override def configurePod(original: SparkPod): SparkPod = {
    original.transform { case pod if hasHadoopConf =>
      val confVolume = if (confDir.isDefined) {
        val keyPaths = confFiles.map { file =>
          new KeyToPathBuilder()
            .withKey(file.getName())
            .withPath(file.getName())
            .build()
        }
        new VolumeBuilder()
          .withName(HADOOP_CONF_VOLUME)
          .withNewConfigMap()
            .withName(newConfigMapName)
            .withItems(keyPaths.asJava)
            .endConfigMap()
          .build()
      } else {
        new VolumeBuilder()
          .withName(HADOOP_CONF_VOLUME)
          .withNewConfigMap()
            .withName(existingConfMap.get)
            .endConfigMap()
          .build()
      }

      val podWithConf = new PodBuilder(pod.pod)
        .editSpec()
          .addNewVolumeLike(confVolume)
            .endVolume()
          .endSpec()
        .build()

      val containerWithMount = new ContainerBuilder(pod.container)
        .addNewVolumeMount()
          .withName(HADOOP_CONF_VOLUME)
          .withMountPath(HADOOP_CONF_DIR_PATH)
          .endVolumeMount()
        .addNewEnv()
          .withName(ENV_HADOOP_CONF_DIR)
          .withValue(HADOOP_CONF_DIR_PATH)
          .endEnv()
        .build()

      SparkPod(podWithConf, containerWithMount)
    }
  }

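  // When a local conf dir was given, ship its files to the cluster as a new ConfigMap;
  // with a pre-existing ConfigMap, the volume above mounts it directly and no extra
  // resources need to be created.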
  override def getAdditionalKubernetesResources(): Seq[HasMetadata] = {
    if (confDir.isDefined) {
      val fileMap = confFiles.map { file =>
        (file.getName(), Files.toString(file, StandardCharsets.UTF_8))
      }.toMap.asJava

      Seq(new ConfigMapBuilder()
        .withNewMetadata()
          .withName(newConfigMapName)
          .endMetadata()
        .addToData(fileMap)
        .build())
    } else {
      Nil
    }
  }

}
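As a side note on the fabric8 pattern used above: the step turns a local directory into a ConfigMap by reading each file into one map entry. Below is a self-contained sketch of that same pattern; the directory path and map name are invented for illustration, and only builder calls already seen in this diff are used.

```scala
import java.io.File
import java.nio.charset.StandardCharsets

import scala.collection.JavaConverters._

import com.google.common.io.Files
import io.fabric8.kubernetes.api.model.{ConfigMap, ConfigMapBuilder}

object HadoopConfMapSketch {
  // Build a ConfigMap whose entries are file-name -> file-contents, matching
  // what the feature step generates for a local HADOOP_CONF_DIR.
  def fromDirectory(dir: File, name: String): ConfigMap = {
    val files = if (dir.isDirectory) dir.listFiles.filter(_.isFile).toSeq else Nil
    val data = files.map { f =>
      (f.getName, Files.toString(f, StandardCharsets.UTF_8))
    }.toMap.asJava

    new ConfigMapBuilder()
      .withNewMetadata()
        .withName(name)
        .endMetadata()
      .addToData(data)
      .build()
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical conf dir; the "-hadoop-config" suffix mirrors the step's naming.
    val cm = fromDirectory(new File("/etc/hadoop/conf"), "my-app-hadoop-config")
    println(cm.getData.keySet())
  }
}
```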
40 changes: 0 additions & 40 deletions
...e/src/main/scala/org/apache/spark/deploy/k8s/features/HadoopConfExecutorFeatureStep.scala
This file was deleted.
35 changes: 0 additions & 35 deletions
.../main/scala/org/apache/spark/deploy/k8s/features/HadoopSparkUserExecutorFeatureStep.scala
This file was deleted.