This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

[WIP] Submission client redesign to use a step-based builder pattern #363

Merged
merged 2 commits into pyspark-integration on Jun 30, 2017

Conversation

mccheah

@mccheah mccheah commented Jun 30, 2017

Applies changes assuming the PySpark support from #364 is present.

This change overhauls the underlying architecture of the submission client, but it is intended to entirely preserve existing behavior of Spark applications. Therefore users will find this to be an invisible change.

The philosophy behind this design is to reconsider the breakdown of the submission process. It operates off the abstraction of "submission steps", which are transformation functions that take the previous state of the driver and return the new state of the driver. The driver's state includes its Spark configurations and the Kubernetes resources that will be used to deploy it.
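
In code, such a step can be modeled as a single-method trait over an immutable driver specification. A minimal sketch, assuming a spec shape roughly like the following (the method and field names here are illustrative, not necessarily the exact ones in this change):

import io.fabric8.kubernetes.api.model.{Container, HasMetadata, Pod}
import org.apache.spark.SparkConf

// The full state of the driver as it passes through the step chain.
private[spark] case class KubernetesDriverSpec(
    driverPod: Pod,
    driverContainer: Container,
    otherKubernetesResources: Seq[HasMetadata],
    driverSparkConf: SparkConf)

// A submission step: takes the previous driver state and returns the new one.
private[spark] trait KubernetesSubmissionStep {
  def prepareSubmission(previousSpec: KubernetesDriverSpec): KubernetesDriverSpec
}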

Such a refactor moves away from a features-first API design, in which different components each serve a set of features. The previous design, for example, had a container-files-resolver API object that returned different resolutions of the dependencies added by the user. However, it was up to the main Client to know how to invoke all of those APIs intelligently. As a result, the API surface of the file resolver became untenably large, and it was not intuitive how it should be used or extended.

This design changes the encapsulation layout; every module is now responsible for changing the driver specification directly. An orchestrator builds the correct chain of steps and hands it to the client, which then calls it verbatim. The main client then makes any final modifications that put the different pieces of the driver together, particularly to attach the driver container itself to the pod and to apply the Spark configuration as command-line arguments.
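
Concretely, the hand-off between orchestrator and client reduces to a fold over the step chain; roughly as follows (a sketch built on the trait above; the orchestrator accessor name is hypothetical):

// Orchestrator: decide which steps apply and in what order.
val steps: Seq[KubernetesSubmissionStep] = orchestrator.getAllSubmissionSteps()

// Client: apply each step verbatim, threading the evolving driver spec through.
val resolvedSpec = steps.foldLeft(initialDriverSpec) { (currentSpec, step) =>
  step.prepareSubmission(currentSpec)
}

// The client then attaches the driver container to the pod and turns the
// resolved SparkConf into command-line arguments for the driver.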

The current steps are listed below; a rough sketch of how the chain might be assembled follows the list:

  1. BaseSubmissionStep: Baseline configurations such as the docker image and resource requests.
  2. DriverKubernetesCredentialsStep: Resolves Kubernetes credentials configuration in the driver pod. Mounts a secret if necessary.
  3. InitContainerBootstrapStep: Attaches the init-container, if necessary, to the driver pod. This step is optional and won't be loaded if all URIs are "local" or there are no URIs at all.
  4. DependencyResolutionStep: Sets the classpath, spark.jars, and spark.files properties. This step is not fully isolated, since it assumes that remote or locally submitted files will be downloaded to a given location. Unit tests should verify that this contract holds.
  5. PythonStep: Configures Python environment variables if using PySpark.
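
As a rough illustration of how such a chain might be assembled, with the optional steps contributed as Options (the variable names here are hypothetical; the PythonStep match mirrors the code reviewed below):

val maybeInitContainerStep: Option[KubernetesSubmissionStep] =
  if (hasOnlyLocalUris) None else Some(initContainerBootstrapStep)

val maybePythonStep: Option[KubernetesSubmissionStep] = mainAppResource match {
  case PythonMainAppResource(mainPyResource) =>
    Some(new PythonStep(mainPyResource, additionalPythonFiles, filesDownloadPath))
  case _ => None
}

val submissionSteps: Seq[KubernetesSubmissionStep] =
  Seq(baseSubmissionStep, driverKubernetesCredentialsStep) ++
    maybeInitContainerStep.toSeq ++
    Seq(dependencyResolutionStep) ++
    maybePythonStep.toSeq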

@mccheah
Author

mccheah commented Jun 30, 2017

Unit tests are being rewritten. This passes integration tests in my development environment.

@mccheah mccheah changed the title [WIP] Submission steps refactor [WIP] Submission client redesign to use a step-based builder pattern Jun 30, 2017
@ifilonenko
Member

ifilonenko commented Jun 30, 2017

Thanks for this! (y) @ash211 @foxish PTAL

@ifilonenko ifilonenko self-requested a review June 30, 2017 01:51
.addNewEnv()
.withName(ENV_PYSPARK_FILES)
.withValue(
KubernetesFileUtils.resolveFilePaths(otherPyFiles, filesDownloadPath).mkString(","))
Author

If otherPyFiles is empty, then this needs to have a value that isn't the empty string ("null", for example). Otherwise the PythonRunner will shift all of the driver's arguments to the left by one.

Member

Correct. It would be wise to pass the driver arguments down to this class and modify the driver container's ENV variables. I will send over a patch for this.

Author

We shouldn't need to pass down the arguments to get this. Just:

.withValue(if (otherPyFiles.isEmpty) "null" else KubernetesFileUtils....
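
Spelled out fully, reusing the resolveFilePaths call from the snippet above, the guarded expression might look like:

.withValue(
  if (otherPyFiles.isEmpty) {
    "null"  // sentinel so PythonRunner's positional arguments don't shift left
  } else {
    KubernetesFileUtils.resolveFilePaths(otherPyFiles, filesDownloadPath).mkString(",")
  })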

@@ -621,14 +621,22 @@ object SparkSubmit {
if (isKubernetesCluster) {
childMainClass = "org.apache.spark.deploy.kubernetes.submit.Client"
if (args.isPython) {
childArgs += "--py-file"
Member

For the sake of naming consistency, this should be --primary-py-file.

childArgs += args.pyFiles
} else {
childArgs += "--primary-java-resource"
Member

Java and Python resources can be resolved with just a check for whether the resource ends with .py, so why is it necessary to pass down extra information from SparkSubmit that isn't needed, per se? IMO SparkSubmit should pass minimal things into our childMainClass.

Author

In general I'm against having logic that requires file names to have a certain extension.

Author

It makes the code brittle; if upstream spark-submit starts giving us Python files with extensions that aren't .py then we want to be resilient against that.

}
val sparkJars = submissionSparkConf.getOption("spark.jars")
.map(_.split(","))
.getOrElse(Array.empty[String]) ++ additionalMainAppJar.toSeq
Member
@ifilonenko ifilonenko Jun 30, 2017

.toSeq() vs. +: additionalMainAppJar.getOrElse("")

Why the decision to make this into a Seq?

Author

There's no point in having an empty string in the sequence if just not having the element at all would achieve the same effect.
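
For reference, the behavioral difference between the two approaches (a small illustrative example, not code from this change):

val sparkJars = Array("a.jar", "b.jar")
val additionalMainAppJar: Option[String] = None

sparkJars ++ additionalMainAppJar.toSeq          // contains "a.jar", "b.jar" only; nothing appended
sparkJars :+ additionalMainAppJar.getOrElse("")  // contains "a.jar", "b.jar", ""; a stray empty entry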

.addNewEnv()
.withName(ENV_DRIVER_ARGS)
.withValue(appArgs.mkString(" "))
.endEnv()
Member

Thoughts on creating a submission step that resolves the arguments here, instead of overwriting this created ENV variable in PythonStep?

Member
@ifilonenko ifilonenko Jun 30, 2017

Check the most recent push I made to this branch, which resolves this and the previous comments about argument issues.

Author

It seems more consistent to just load everything as environment variables correctly. Both Python and Java expect SPARK_DRIVER_ARGS to be the same; Python just expects the additional environment variables PRIMARY_PY_FILE and PY_FILES. I don't think we need any special argument parsing here; we just need to load the correct environment variables with the correct values, and the Docker image will take care of formatting them properly.

val pythonStep = mainAppResource match {
case PythonMainAppResource(mainPyResource) =>
Option(new PythonStep(mainPyResource, additionalPythonFiles, filesDownloadPath))
case _ => Option.empty[KubernetesSubmissionStep]
Member

So here, instead of an empty step, you would resolve the arguments to just be appArgs, as you would have before in the BaseSubmissionStep; otherwise you use the PythonStep. I believe SparkR has similar arguments to Java/Scala, so this generalization could just be called NonPythonArgumentResolver.
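
A rough sketch of that suggestion (NonPythonArgumentResolver is hypothetical here and only illustrates the shape of the idea):

val argumentResolutionStep: KubernetesSubmissionStep = mainAppResource match {
  case PythonMainAppResource(mainPyResource) =>
    new PythonStep(mainPyResource, additionalPythonFiles, filesDownloadPath)
  case _ =>
    // Java/Scala (and presumably SparkR) would just pass appArgs through unchanged.
    new NonPythonArgumentResolver(appArgs)
}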

import org.apache.spark.deploy.kubernetes.config._
import org.apache.spark.deploy.kubernetes.constants._

private[spark] class DriverKubernetesCredentialsStep(
Member

Should this be generalized into a credentials step as a whole, covering secure HDFS secrets as well? Or should HDFS mounting be considered a separate step? @kimoonkim @foxish

Author

HDFS should be separate. This class is already a little large as it is, given that it's resolving various configuration options. We can extend it to allow users to provide their certificates through a JKS trustStore.

@ifilonenko
Member

I think this refactor promotes a reasonably structured architecture that we can easily adapt and introduce to future developers. The ability to just write a KubernetesSubmissionStep to apply a new modification to the driver pod, container, resources, etc. is seemingly painless. This is especially good timing with the integration of secure HDFS, where pod modifications will be important; having that offloaded to a step that is wrapped in an Option() [depending on whether it is necessary] seems quite organized.

@ifilonenko ifilonenko merged commit e103225 into pyspark-integration Jun 30, 2017
@mccheah
Author

mccheah commented Jun 30, 2017

This was not supposed to merge yet, unfortunately - we'll be working on a revert on the base branch and re-opening this in a new push.

ifilonenko pushed a commit to ifilonenko/spark that referenced this pull request Feb 26, 2019
…-jars

Sanitize empty strings for jars and files