[SPARK-43196][YARN][FOLLOWUP] Remove unnecessary Hadoop version check
### What changes were proposed in this pull request?

It's not necessary to check Hadoop version 2.9+ or 3.0+ now.

### Why are the changes needed?

Simplify code and docs.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass GA.

Closes apache#40900 from pan3793/SPARK-43196-followup.

Authored-by: Cheng Pan <chengpan@apache.org>
Signed-off-by: Chao Sun <sunchao@apple.com>
pan3793 authored and sunchao committed Apr 21, 2023
1 parent 958a7d5 commit 2cbe049
Showing 3 changed files with 7 additions and 17 deletions.
1 change: 0 additions & 1 deletion docs/running-on-yarn.md
@@ -490,7 +490,6 @@ To use a custom metrics.properties for the application master and executors, upd
<td><code>spark.yarn.am.tokenConfRegex</code></td>
<td>(none)</td>
<td>
-This config is only supported when Hadoop version is 2.9+ or 3.x (e.g., when using the Hadoop 3.x profile).
The value of this config is a regex expression used to grep a list of config entries from the job's configuration file (e.g., hdfs-site.xml)
and send to RM, which uses them when renewing delegation tokens. A typical use case of this feature is to support delegation
tokens in an environment where a YARN cluster needs to talk to multiple downstream HDFS clusters, where the YARN RM may not have configs
@@ -38,7 +38,6 @@ import org.apache.hadoop.io.{DataOutputBuffer, Text}
import org.apache.hadoop.mapreduce.MRJobConfig
import org.apache.hadoop.security.UserGroupInformation
import org.apache.hadoop.util.StringUtils
-import org.apache.hadoop.util.VersionInfo
import org.apache.hadoop.yarn.api._
import org.apache.hadoop.yarn.api.ApplicationConstants.Environment
import org.apache.hadoop.yarn.api.protocolrecords._
@@ -62,7 +61,7 @@ import org.apache.spark.internal.config.Python._
import org.apache.spark.launcher.{JavaModuleOptions, LauncherBackend, SparkAppHandle, YarnCommandBuilderUtils}
import org.apache.spark.resource.ResourceProfile
import org.apache.spark.rpc.RpcEnv
-import org.apache.spark.util.{CallerContext, Utils, VersionUtils, YarnContainerInfoHelper}
+import org.apache.spark.util.{CallerContext, Utils, YarnContainerInfoHelper}

private[spark] class Client(
val args: ClientArguments,
@@ -358,20 +357,13 @@ private[spark] class Client(
private def setTokenConf(amContainer: ContainerLaunchContext): Unit = {
// SPARK-37205: this regex is used to grep a list of configurations and send them to YARN RM
// for fetching delegation tokens. See YARN-5910 for more details.
-val regex = sparkConf.get(config.AM_TOKEN_CONF_REGEX)
-// The feature is only supported in Hadoop 2.9+ and 3.x, hence the check below.
-val isSupported = VersionUtils.majorMinorVersion(VersionInfo.getVersion) match {
-case (2, n) if n >= 9 => true
-case (3, _) => true
-case _ => false
-}
-if (regex.nonEmpty && isSupported) {
+sparkConf.get(config.AM_TOKEN_CONF_REGEX).foreach { regex =>
logInfo(s"Processing token conf (spark.yarn.am.tokenConfRegex) with regex $regex")
-val dob = new DataOutputBuffer();
-val copy = new Configuration(false);
-copy.clear();
+val dob = new DataOutputBuffer()
+val copy = new Configuration(false)
+copy.clear()
hadoopConf.asScala.foreach { entry =>
-if (entry.getKey.matches(regex.get)) {
+if (entry.getKey.matches(regex)) {
copy.set(entry.getKey, entry.getValue)
logInfo(s"Captured key: ${entry.getKey} -> value: ${entry.getValue}")
}
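
Note on the hunk above: the change replaces an explicit `nonEmpty` check plus `regex.get` with `Option.foreach`, so the body runs only when the regex is set and receives the unwrapped string directly. The following is a minimal standalone sketch of that pattern, not the code in `Client.scala`. It assumes a plain `Map[String, String]` in place of the Hadoop `Configuration`, and `TokenConfSketch`, `captureMatching`, and the sample keys and values are hypothetical names introduced only for illustration.

```scala
object TokenConfSketch {
  // Hypothetical helper (not part of Spark): returns the entries whose key
  // fully matches the regex, or an empty map when no regex is configured.
  def captureMatching(
      conf: Map[String, String],
      regexOpt: Option[String]): Map[String, String] = {
    var captured = Map.empty[String, String]
    // Option.foreach runs the body only when a value is present,
    // so no separate nonEmpty check or .get call is needed.
    regexOpt.foreach { regex =>
      conf.foreach { case (key, value) =>
        // String.matches requires the whole key to match the pattern.
        if (key.matches(regex)) {
          captured += (key -> value)
        }
      }
    }
    captured
  }

  def main(args: Array[String]): Unit = {
    // Sample entries are made up for this sketch.
    val sampleConf = Map(
      "dfs.nameservices" -> "ns1,ns2",
      "dfs.ha.namenodes.ns1" -> "nn1,nn2",
      "yarn.resourcemanager.address" -> "rm-host:8032")
    val regex = Some("^dfs\\.nameservices$|^dfs\\.ha\\.namenodes\\..*$")
    println(captureMatching(sampleConf, regex).keys.toList.sorted)
    println(captureMatching(sampleConf, None))
  }
}
```

With the regex unset (`None`), the loop body never runs and nothing is captured, which mirrors how the rewritten `setTokenConf` skips all work when `spark.yarn.am.tokenConfRegex` is not configured.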
@@ -80,8 +80,7 @@ package object config extends Logging {

private[spark] val AM_TOKEN_CONF_REGEX =
ConfigBuilder("spark.yarn.am.tokenConfRegex")
-.doc("This config is only supported when Hadoop version is 2.9+ or 3.x (e.g., when using " +
-"the Hadoop 3.x profile). The value of this config is a regex expression used to grep a " +
+.doc("The value of this config is a regex expression used to grep a " +
"list of config entries from the job's configuration file (e.g., hdfs-site.xml) and send " +
"to RM, which uses them when renewing delegation tokens. A typical use case of this " +
"feature is to support delegation tokens in an environment where a YARN cluster needs to " +