This repository has been archived by the owner on Jan 9, 2020. It is now read-only.

Introduce blocking submit to kubernetes by default #53

Merged: mccheah merged 15 commits into k8s-support-alternate-incremental from blocking-submit on Feb 3, 2017

Conversation

@ash211 commented Jan 26, 2017

Fixes #46

Two new configuration settings, modeled off the equivalent spark.yarn.* settings (a usage sketch follows the list):

  • spark.kubernetes.submit.waitAppCompletion
  • spark.kubernetes.report.interval
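
For reference, a rough sketch of setting these programmatically (the keys and defaults come from this PR; they can equally be passed to spark-submit via --conf):

import org.apache.spark.SparkConf

// Sketch only: spark.kubernetes.submit.waitAppCompletion defaults to true (block until the
// driver pod finishes); spark.kubernetes.report.interval defaults to 1s (how often the
// status line is logged while waiting).
val conf = new SparkConf()
  .set("spark.kubernetes.submit.waitAppCompletion", "false") // fire-and-forget submission
  .set("spark.kubernetes.report.interval", "5s")             // report status every 5 seconds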

@@ -63,6 +64,20 @@ private[spark] class Client(
private val driverLaunchTimeoutSecs = sparkConf.getTimeAsSeconds(
"spark.kubernetes.driverLaunchTimeout", s"${DEFAULT_LAUNCH_TIMEOUT_SECONDS}s")

private[spark] val WAIT_FOR_APP_COMPLETION = ConfigBuilder(
"spark.kubernetes.submit.waitAppCompletion")
.doc("In cluster mode, whether to wait for the application to finish before exiting the " +

nit: "In cluster mode" => "In kubernetes cluster node"

@@ -24,7 +24,7 @@ import javax.net.ssl.X509TrustManager
import com.google.common.io.Files
import com.google.common.util.concurrent.{SettableFuture, ThreadFactoryBuilder}
import io.fabric8.kubernetes.api.model._
-import io.fabric8.kubernetes.client.{Config, ConfigBuilder, DefaultKubernetesClient, KubernetesClientException, Watch, Watcher}
+import io.fabric8.kubernetes.client.{Config, ConfigBuilder => KConfigBuilder, DefaultKubernetesClient, KubernetesClientException, Watch, Watcher}

KConfigBuilder is not intuitive, maybe K8SConfigBuilder?

.timeConf(TimeUnit.MILLISECONDS)
.createWithDefaultString("1s")

private val fireAndForget: Boolean = !sparkConf.get(WAIT_FOR_APP_COMPLETION);

nit: remove the trailing semicolon

@lins05 commented Jan 26, 2017

The string representation of podState is not well formatted; it looks something like this:

2017-01-26 06:59:46 INFO  PodStateMonitor:54 - Phase changed, new state: Pod(apiVersion=v1, kind=Pod, metadata=ObjectMeta(annotations=null, creationTimestamp=2017-01-26T06:59:20Z, deletionGracePeriodSeconds=null, deletionTimestamp=null, finalizers=[], generateName=null, generation=null, labels={driver-launcher-selector=driver-launcher-1485413959323, spark-app-name=spark-pi}, name=spark-pi-1485413959323, namespace=default, ownerReferences=[], resourceVersion=243127, selfLink=/api/v1/namespaces/default/pods/spark-pi-1485413959323, uid=f6002452-e394-11e6-b83d-080027acfaf0, additionalProperties={})
... (tens of lines omitted) ...

We should extract the important details like namespace/account, start/running time, service cluster IP, etc., and format them properly.

For reference, YARN cluster mode formats its reports like this:

15/06/10 17:29:35 INFO Client:
        client token: N/A
        diagnostics: N/A
        ApplicationMaster host: N/A
        ApplicationMaster RPC port: 0
        queue: default
        start time: 1428686924325
        final status: SUCCEEDED
        tracking URL: http://blue1:8088/proxy/application_1428670545834_0009/
        user: hdfs
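
A minimal sketch of the kind of extraction being suggested, written against the fabric8 Pod model (illustrative only, not the PR's actual implementation; it assumes metadata/spec/status are populated and renders missing scalar fields as N/A, mirroring the reformatted output shown later in the thread):

import scala.collection.JavaConverters._

import io.fabric8.kubernetes.api.model.Pod

// Sketch: pull the interesting fields out of a Pod and render them one per line.
def formatPodState(pod: Pod): String = {
  val details = Seq[(String, String)](
    ("pod name", pod.getMetadata.getName),
    ("namespace", pod.getMetadata.getNamespace),
    ("labels", pod.getMetadata.getLabels.asScala.map { case (k, v) => s"$k -> $v" }.mkString(", ")),
    ("pod uid", pod.getMetadata.getUid),
    ("creation time", pod.getMetadata.getCreationTimestamp),
    ("service account name", pod.getSpec.getServiceAccountName),
    ("volumes", pod.getSpec.getVolumes.asScala.map(_.getName).mkString(", ")),
    ("node name", pod.getSpec.getNodeName),
    ("start time", pod.getStatus.getStartTime),
    ("container images", pod.getSpec.getContainers.asScala.map(_.getImage).mkString(", ")),
    ("phase", pod.getStatus.getPhase))
  // Fields that are not populated yet (e.g. while the pod is Pending) fall back to N/A.
  details.map { case (key, value) => s"\t $key: ${Option(value).getOrElse("N/A")}" }.mkString("\n")
}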

@ash211 (Author) commented Jan 26, 2017

Agreed that the podState representation is not all that helpful -- tomorrow I'll modify it to pull out some of the more useful pieces of information and format them similarly to the YARN reports.

@ash211 (Author) commented Jan 26, 2017

New formatting:

2017-01-26 10:34:32 INFO  PodStateMonitor:54 - Application status for org-apache-spark-examples-sparkpi-1485455619977 (phase: Running)
2017-01-26 10:34:33 INFO  PodStateMonitor:54 - Application status for org-apache-spark-examples-sparkpi-1485455619977 (phase: Running)
2017-01-26 10:34:34 INFO  PodStateMonitor:54 - Application status for org-apache-spark-examples-sparkpi-1485455619977 (phase: Running)
2017-01-26 10:34:35 INFO  PodStateMonitor:54 - Application status for org-apache-spark-examples-sparkpi-1485455619977 (phase: Running)
2017-01-26 10:34:36 INFO  PodStateMonitor:54 - Application status for org-apache-spark-examples-sparkpi-1485455619977 (phase: Succeeded)
2017-01-26 10:34:36 INFO  PodStateMonitor:54 - Phase changed, new state:
	 pod name: org-apache-spark-examples-sparkpi-1485455619977
	 namespace: default
	 labels: driver-launcher-selector -> driver-launcher-1485455619977,spark-app-name -> org.apache.spark.examples.SparkPi
	 pod uid: f602f708-e3f5-11e6-94b6-0e0cecaa76e2
	 creation time: 2017-01-26T18:33:41Z
	 service account name: default
	 volumes: spark-submission-secret-volume,default-token-fvqq7
	 node name: REDACTED
	 start time: 2017-01-26T18:33:41Z
	 container images: ash211/testrepo:driver-latest2
	 phase: Succeeded
2017-01-26 10:34:36 INFO  Client:54 - Application org-apache-spark-examples-sparkpi-1485455619977 ended with final phase Succeeded

@mccheah commented Jan 26, 2017

We should look into using a Watch here, which is provided by the fabric8 API. We use it in a few other places in Client, but I'm not certain whether the thread pool it uses is blocking or not. (Edit: by "blocking" I mean non-daemon.)

One way to use a Watch effectively here would be to create a CountDownLatch and block the main thread on it, and create a Watch which monitors the state of the pod. When the pod runs to completion, count down the latch to unblock the main thread and close the watch.

Edit: using the Watch in the above way removes any dependency on the Watch's thread pool being non-daemon.
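
A rough sketch of that latch-plus-watch pattern against the fabric8 client (illustrative only, not the code that eventually landed; the method name and terminal-phase strings are assumptions):

import java.util.concurrent.CountDownLatch

import io.fabric8.kubernetes.api.model.Pod
import io.fabric8.kubernetes.client.{DefaultKubernetesClient, KubernetesClientException, Watcher}
import io.fabric8.kubernetes.client.Watcher.Action

// Sketch: block the submitter until the driver pod reaches a terminal phase.
def awaitDriverCompletion(client: DefaultKubernetesClient, driverPodName: String): Unit = {
  val podCompleted = new CountDownLatch(1)
  val watch = client.pods().withName(driverPodName).watch(new Watcher[Pod] {
    override def eventReceived(action: Action, pod: Pod): Unit = {
      val phase = Option(pod).flatMap(p => Option(p.getStatus)).map(_.getPhase).getOrElse("unknown")
      if (phase == "Succeeded" || phase == "Failed") {
        podCompleted.countDown() // unblock the main thread
      }
    }
    override def onClose(cause: KubernetesClientException): Unit = {
      podCompleted.countDown() // if the watch dies unexpectedly, don't hang forever
    }
  })
  try {
    podCompleted.await() // the main thread blocks here, independent of the watch's thread pool
  } finally {
    watch.close()
  }
}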


private def formatPodState(pod: Pod): String = {

val details = Seq[(String, String)](

Fabric8's objects are Jackson-annotated, so I wonder if we could just use a Jackson ObjectMapper and pass the object through it. I've been meaning to try this for error messaging as well.
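
For example, something along these lines (a sketch assuming jackson-databind is on the classpath; it still dumps every field, just pretty-printed):

import com.fasterxml.jackson.databind.ObjectMapper

import io.fabric8.kubernetes.api.model.Pod

// Sketch: serialize the Jackson-annotated Pod instead of hand-picking fields.
def podAsJson(pod: Pod): String =
  new ObjectMapper().writerWithDefaultPrettyPrinter().writeValueAsString(pod)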

@ash211 (Author) commented Jan 26, 2017

I considered the watch, but I want to make sure we keep the periodic log message as well. Are you suggesting keeping a "last received state" through the watch, and having a separate timer that just logs that state every n seconds until the watch receives a terminal state?

That would reduce load on the apiserver, which would be nice.
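
A sketch of that idea: the watch caches the last phase it saw, and a separate scheduler logs it on the report interval (class and method names here are illustrative, not the PR's):

import java.util.concurrent.{Executors, ScheduledExecutorService, TimeUnit}

// Sketch: the watch callback updates lastPhase; a scheduler logs it every reportIntervalMillis.
class CachedStatusLogger(appId: String, reportIntervalMillis: Long) {
  @volatile private var lastPhase: String = "unknown"
  private val scheduler: ScheduledExecutorService = Executors.newSingleThreadScheduledExecutor()

  def start(): Unit = scheduler.scheduleAtFixedRate(new Runnable {
    override def run(): Unit = println(s"Application status for $appId (phase: $lastPhase)")
  }, 0, reportIntervalMillis, TimeUnit.MILLISECONDS)

  // Called from the watch callback with the pod's current phase.
  def update(phase: String): Unit = { lastPhase = phase }

  // Called once a terminal phase (Succeeded/Failed) is seen.
  def stop(): Unit = scheduler.shutdown()
}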

@mccheah commented Jan 26, 2017

That could work to support the periodic reporting. The main advantage I see is staying away from busy-waiting loops in the code; we should abstract the busy-waiting loop away with constructs like timers and watches.

@foxish (Member) commented Jan 27, 2017

I like the idea of using a watch and caching the results. Like you said, we get the entire set of mutations and don't overload the apiserver. The change LGTM otherwise.

@foxish (Member) commented Jan 27, 2017

On the other hand, I'm also wondering whether the watch implementation might cause us to miss events during the establishment/teardown of the persistent connection, since we are getting events over an HTTP streaming interface. This could be a very rare case, but if it did happen, it would certainly give us unexpected results.

@ash211 (Author) commented Jan 27, 2017

@foxish do you know of other places in the community where a service polls/watches the status of another pod and reacts to state changes? We're basically using the watch as a form of message bus to receive events, and you're suggesting that state-change events are not delivered with exactly-once semantics across the pod's entire lifetime. So for this code to use watches, we're going to need to tighten our understanding of the event delivery guarantees. Otherwise, if the "pod completed" event gets missed, our blocking submit will block forever (until the user's Ctrl+C).

In the meantime, maybe polling the apiserver isn't that bad?

@mccheah commented Jan 27, 2017

We would want to create the watch before creating the pod: you can create watches for things that don't exist yet, and the CREATED event is then propagated to the watch as well. I'm not as certain about teardown.

@mccheah commented Jan 27, 2017

It turns out that polling would run into a similar problem. Suppose we poll every 10 seconds, since polling too frequently would certainly cause overload; if two events occur between T=0s and T=10s, we drop one of them. With a watch, we can build a buffer of the events we've seen so far and report them all at the given interval. We could also report events immediately and treat the regular poll as a heartbeat of sorts to inform the user that the job is still working.
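
A sketch of the buffering idea (names are illustrative): the watch callback records every event it sees, and the interval timer drains the buffer, falling back to a heartbeat line when nothing arrived.

import java.util.concurrent.ConcurrentLinkedQueue

import scala.collection.mutable.ArrayBuffer

// Sketch: buffer watch events and flush them on each reporting tick.
class BufferedEventReporter {
  private val buffer = new ConcurrentLinkedQueue[String]()

  // Called from the watch callback for every event received.
  def record(event: String): Unit = buffer.add(event)

  // Called by the interval timer: drain and report everything seen since the last tick.
  def flush(): Unit = {
    val drained = ArrayBuffer.empty[String]
    var next = buffer.poll()
    while (next != null) {
      drained += next
      next = buffer.poll()
    }
    if (drained.isEmpty) println("Application still running (heartbeat)")
    else drained.foreach(event => println(s"Pod event: $event"))
  }
}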

@iyanuobidele

+1. Reacting to events is always more efficient than polling.

It seems to me like we're doing too much with the by-interval reporting and events together. Is there any reason we can't just stick to reporting the SSEs to the client and get rid of the by-interval reporting entirely?

Unless it's customary to always report status at some interval; in that case, I think @mccheah's suggestion is an ideal approach.

@iyanuobidele

Anything else stopping this PR from being merged?

@mccheah commented Jan 27, 2017

Are we in consensus that this should use a watch instead?

@foxish (Member) commented Jan 27, 2017

We're not guaranteed to see every intermediate state even with a watch, because there is a fixed window of buffered events, and in the case of a network partition we may miss some. I don't think there is even a requirement here to see every intermediate state, as long as we can guarantee that we eventually show the user the final state their job ends up in, or display an error if the apiserver is unreachable.

Yes, a watch plus periodically printing out the state of the pod SGTM. If the apiserver becomes unreachable, we should show that too, as an unknown state.


var previousPhase: String = null

while (true) {

@mccheah commented Jan 27, 2017:

If we're going with polling then we can use a TimerTask or ScheduledExecutorService to fetch the status periodically, and block the main thread with a CountDownLatch. When the Timer thread detects the pod is finished, count down the latch to unblock the main thread.
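
For comparison, a sketch of that polling variant (ScheduledExecutorService plus CountDownLatch; the method name and terminal-phase checks are assumptions, not the PR's code):

import java.util.concurrent.{CountDownLatch, Executors, TimeUnit}

import io.fabric8.kubernetes.client.DefaultKubernetesClient

// Sketch: poll the apiserver on a schedule instead of busy-waiting in a while (true) loop.
def pollUntilCompletion(
    client: DefaultKubernetesClient,
    namespace: String,
    driverPodName: String,
    intervalMillis: Long): Unit = {
  val finished = new CountDownLatch(1)
  val scheduler = Executors.newSingleThreadScheduledExecutor()
  scheduler.scheduleWithFixedDelay(new Runnable {
    override def run(): Unit = {
      val phase = client.pods()
        .inNamespace(namespace)
        .withName(driverPodName)
        .get()
        .getStatus
        .getPhase
      println(s"Driver pod $driverPodName is in phase $phase")
      if (phase == "Succeeded" || phase == "Failed") {
        finished.countDown() // unblock the main thread
      }
    }
  }, 0, intervalMillis, TimeUnit.MILLISECONDS)
  try finished.await() finally scheduler.shutdownNow()
}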

@ash211 (Author) commented Jan 31, 2017

Finishing up the last touches on switching from polling to a Watch, and on accommodating the large Config change in a recently merged PR. I should have revised code up for review later today.

@ash211 (Author) commented Feb 1, 2017

@iyanuobidele @foxish @lins05 @mccheah ready for re-review now that I've moved to Watch-based monitoring.

Logging now looks like this:

$ ./bin/spark-submit --class org.apache.spark.examples.SparkPi --deploy-mode cluster --master k8s://10.0.20.108:6443 --conf spark.executor.instances=5 --conf spark.kubernetes.driver.docker.image=ash211/testrepo:driver-latest3 --conf spark.kubernetes.executor.docker.image=ash211/testrepo:executor-latest3 --conf spark.kubernetes.submit.caCertFile=mycert.cert ./examples/jars/spark-examples_2.11-2.2.0-SNAPSHOT.jar 10000
2017-01-31 14:27:07 INFO  Client:54 - Starting application org-apache-spark-examples-sparkpi-1485901627225 in Kubernetes...
2017-01-31 14:27:07 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-01-31 14:27:07 INFO  SecurityManager:54 - Changing view acls to: aash
2017-01-31 14:27:07 INFO  SecurityManager:54 - Changing modify acls to: aash
2017-01-31 14:27:07 INFO  SecurityManager:54 - Changing view acls groups to:
2017-01-31 14:27:07 INFO  SecurityManager:54 - Changing modify acls groups to:
2017-01-31 14:27:07 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(aash); groups with view permissions: Set(); users  with modify permissions: Set(aash); groups with modify permissions: Set()
2017-01-31 14:27:09 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: unknown)
2017-01-31 14:27:09 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Pending)
2017-01-31 14:27:09 INFO  LoggingPodStatusWatcher:54 - Phase changed, new state:
	 pod name: org-apache-spark-examples-sparkpi-1485901627225
	 namespace: default
	 labels: spark-app-id -> org-apache-spark-examples-sparkpi-1485901627225, spark-app-name -> org.apache.spark.examples.SparkPi, spark-driver -> org-apache-spark-examples-sparkpi-1485901627225
	 pod uid: 67a5b3b1-e804-11e6-b3d3-0e0cecaa76e2
	 creation time: 2017-01-31T22:27:09Z
	 service account name: default
	 volumes: spark-submission-secret-volume, default-token-fvqq7
	 node name: N/A
	 start time: N/A
	 container images: N/A
	 phase: Pending
2017-01-31 14:27:09 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Pending)
2017-01-31 14:27:09 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Pending)
2017-01-31 14:27:10 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Pending)
2017-01-31 14:27:11 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Running)
2017-01-31 14:27:11 INFO  LoggingPodStatusWatcher:54 - Phase changed, new state:
	 pod name: org-apache-spark-examples-sparkpi-1485901627225
	 namespace: default
	 labels: spark-app-id -> org-apache-spark-examples-sparkpi-1485901627225, spark-app-name -> org.apache.spark.examples.SparkPi, spark-driver -> org-apache-spark-examples-sparkpi-1485901627225
	 pod uid: 67a5b3b1-e804-11e6-b3d3-0e0cecaa76e2
	 creation time: 2017-01-31T22:27:09Z
	 service account name: default
	 volumes: spark-submission-secret-volume, default-token-fvqq7
	 node name: REDACTED
	 start time: 2017-01-31T22:27:09Z
	 container images: ash211/testrepo:driver-latest3
	 phase: Running
2017-01-31 14:27:11 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Running)
2017-01-31 14:27:11 WARN  Client:66 - Submitting application details, application secret, and local jars to the cluster over an insecure connection. You should configure SSL to secure this step.
2017-01-31 14:27:12 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Running)
2017-01-31 14:27:13 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Running)
2017-01-31 14:27:14 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Running)
2017-01-31 14:27:15 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Running)
2017-01-31 14:27:16 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Running)
2017-01-31 14:27:17 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Running)
2017-01-31 14:27:18 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Running)
2017-01-31 14:27:18 INFO  Client:54 - Waiting for application org-apache-spark-examples-sparkpi-1485901627225 to finish...
2017-01-31 14:27:19 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Running)
2017-01-31 14:27:20 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Running)
<snip>
2017-01-31 14:29:11 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Running)
2017-01-31 14:29:11 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1485901627225 (phase: Succeeded)
2017-01-31 14:29:11 INFO  LoggingPodStatusWatcher:54 - Phase changed, new state:
	 pod name: org-apache-spark-examples-sparkpi-1485901627225
	 namespace: default
	 labels: spark-app-id -> org-apache-spark-examples-sparkpi-1485901627225, spark-app-name -> org.apache.spark.examples.SparkPi, spark-driver -> org-apache-spark-examples-sparkpi-1485901627225
	 pod uid: 67a5b3b1-e804-11e6-b3d3-0e0cecaa76e2
	 creation time: 2017-01-31T22:27:09Z
	 service account name: default
	 volumes: spark-submission-secret-volume, default-token-fvqq7
	 node name: REDACTED
	 start time: 2017-01-31T22:27:09Z
	 container images: ash211/testrepo:driver-latest3
	 phase: Succeeded
2017-01-31 14:29:11 INFO  Client:54 - Application org-apache-spark-examples-sparkpi-1485901627225 finished.
2017-01-31 14:29:11 INFO  LoggingPodStatusWatcher:54 - Application org-apache-spark-examples-sparkpi-1485901627225 ended with final phase Succeeded

@mccheah commented Feb 2, 2017

The continuous logging might be cut off awkwardly in fire-and-forget mode. We could disable the periodic logging in fire-and-forget mode and just report the changes in pod status.

@ash211 (Author) commented Feb 2, 2017

I'll make that change today: turn off the periodic logging in fire-and-forget mode, since it cuts off as soon as the submitter is able to forget (i.e., once the local jars are uploaded).
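
Roughly, the change amounts to something like this sketch (the config keys come from this PR; the Option-based wiring is an assumption about the shape, not the exact code):

import org.apache.spark.SparkConf

// Sketch: only schedule the periodic "Application status ..." heartbeat when we will wait
// for completion; phase-change lines are still logged either way.
val sparkConf = new SparkConf()
val waitForCompletion = sparkConf.getBoolean("spark.kubernetes.submit.waitAppCompletion", true)
val maybeReportInterval: Option[Long] =
  if (waitForCompletion) Some(sparkConf.getTimeAsMs("spark.kubernetes.report.interval", "1s"))
  else None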

@ash211 (Author) commented Feb 2, 2017

Logs when running with --conf spark.kubernetes.submit.waitAppCompletion=false:

2017-02-02 15:54:51 INFO  Client:54 - Starting application org-apache-spark-examples-sparkpi-1486079691728 in Kubernetes...
2017-02-02 15:54:52 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-02-02 15:54:52 INFO  SecurityManager:54 - Changing view acls to: palantir
2017-02-02 15:54:52 INFO  SecurityManager:54 - Changing modify acls to: palantir
2017-02-02 15:54:52 INFO  SecurityManager:54 - Changing view acls groups to:
2017-02-02 15:54:52 INFO  SecurityManager:54 - Changing modify acls groups to:
2017-02-02 15:54:52 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(palantir); groups with view permissions: Set(); users  with modify permissions: Set(palantir); groups with modify permissions: Set()
2017-02-02 15:54:53 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079691728 (phase: Pending)
2017-02-02 15:54:53 INFO  LoggingPodStatusWatcher:54 - Phase changed, new state:
	 pod name: org-apache-spark-examples-sparkpi-1486079691728
	 namespace: default
	 labels: spark-app-id -> org-apache-spark-examples-sparkpi-1486079691728, spark-app-name -> org.apache.spark.examples.SparkPi, spark-driver -> org-apache-spark-examples-sparkpi-1486079691728
	 pod uid: fdee324a-e9a2-11e6-b3d3-0e0cecaa76e2
	 creation time: 2017-02-02T23:54:53Z
	 service account name: default
	 volumes: spark-submission-secret-volume, default-token-fvqq7
	 node name: N/A
	 start time: N/A
	 container images: N/A
	 phase: Pending
2017-02-02 15:54:53 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079691728 (phase: Pending)
2017-02-02 15:54:53 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079691728 (phase: Pending)
2017-02-02 15:54:55 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079691728 (phase: Running)
2017-02-02 15:54:55 INFO  LoggingPodStatusWatcher:54 - Phase changed, new state:
	 pod name: org-apache-spark-examples-sparkpi-1486079691728
	 namespace: default
	 labels: spark-app-id -> org-apache-spark-examples-sparkpi-1486079691728, spark-app-name -> org.apache.spark.examples.SparkPi, spark-driver -> org-apache-spark-examples-sparkpi-1486079691728
	 pod uid: fdee324a-e9a2-11e6-b3d3-0e0cecaa76e2
	 creation time: 2017-02-02T23:54:53Z
	 service account name: default
	 volumes: spark-submission-secret-volume, default-token-fvqq7
	 node name: REDACTED
	 start time: 2017-02-02T23:54:53Z
	 container images: ash211/testrepo:driver-latest3
	 phase: Running
2017-02-02 15:54:55 WARN  Client:66 - Submitting application details, application secret, and local jars to the cluster over an insecure connection. You should configure SSL to secure this step.
2017-02-02 15:55:20 INFO  Client:54 - Submitting local resources to driver pod for application org-apache-spark-examples-sparkpi-1486079691728 ...
2017-02-02 15:55:20 INFO  Client:54 - Finished launching local resources to application org-apache-spark-examples-sparkpi-1486079691728
2017-02-02 15:55:20 INFO  Client:54 - Application org-apache-spark-examples-sparkpi-1486079691728 successfully launched.

Logs when running without that setting (it defaults to true):

2017-02-02 15:56:29 INFO  Client:54 - Starting application org-apache-spark-examples-sparkpi-1486079789535 in Kubernetes...
2017-02-02 15:56:29 WARN  NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2017-02-02 15:56:30 INFO  SecurityManager:54 - Changing view acls to: palantir
2017-02-02 15:56:30 INFO  SecurityManager:54 - Changing modify acls to: palantir
2017-02-02 15:56:30 INFO  SecurityManager:54 - Changing view acls groups to:
2017-02-02 15:56:30 INFO  SecurityManager:54 - Changing modify acls groups to:
2017-02-02 15:56:30 INFO  SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(palantir); groups with view permissions: Set(); users  with modify permissions: Set(palantir); groups with modify permissions: Set()
2017-02-02 15:56:30 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: unknown)
2017-02-02 15:56:31 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Pending)
2017-02-02 15:56:31 INFO  LoggingPodStatusWatcher:54 - Phase changed, new state:
	 pod name: org-apache-spark-examples-sparkpi-1486079789535
	 namespace: default
	 labels: spark-app-id -> org-apache-spark-examples-sparkpi-1486079789535, spark-app-name -> org.apache.spark.examples.SparkPi, spark-driver -> org-apache-spark-examples-sparkpi-1486079789535
	 pod uid: 38470591-e9a3-11e6-b3d3-0e0cecaa76e2
	 creation time: 2017-02-02T23:56:31Z
	 service account name: default
	 volumes: spark-submission-secret-volume, default-token-fvqq7
	 node name: N/A
	 start time: N/A
	 container images: N/A
	 phase: Pending
2017-02-02 15:56:31 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Pending)
2017-02-02 15:56:31 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Pending)
2017-02-02 15:56:31 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Pending)
2017-02-02 15:56:32 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Pending)
2017-02-02 15:56:33 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:33 INFO  LoggingPodStatusWatcher:54 - Phase changed, new state:
	 pod name: org-apache-spark-examples-sparkpi-1486079789535
	 namespace: default
	 labels: spark-app-id -> org-apache-spark-examples-sparkpi-1486079789535, spark-app-name -> org.apache.spark.examples.SparkPi, spark-driver -> org-apache-spark-examples-sparkpi-1486079789535
	 pod uid: 38470591-e9a3-11e6-b3d3-0e0cecaa76e2
	 creation time: 2017-02-02T23:56:31Z
	 service account name: default
	 volumes: spark-submission-secret-volume, default-token-fvqq7
	 node name: REDACTED
	 start time: 2017-02-02T23:56:31Z
	 container images: ash211/testrepo:driver-latest3
	 phase: Running
2017-02-02 15:56:33 WARN  Client:66 - Submitting application details, application secret, and local jars to the cluster over an insecure connection. You should configure SSL to secure this step.
2017-02-02 15:56:33 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:34 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:35 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:36 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:37 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:38 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:39 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:40 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:41 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:42 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:43 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:44 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:45 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:46 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:47 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:48 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:49 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:50 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:51 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:52 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:53 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:54 INFO  Client:54 - Submitting local resources to driver pod for application org-apache-spark-examples-sparkpi-1486079789535 ...
2017-02-02 15:56:54 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:55 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:56 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:57 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:58 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:56:59 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:00 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:01 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:02 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:03 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:04 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:05 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:06 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:07 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:08 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:09 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:10 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:11 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:12 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:13 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:14 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:14 INFO  Client:54 - Finished launching local resources to application org-apache-spark-examples-sparkpi-1486079789535
2017-02-02 15:57:14 INFO  Client:54 - Waiting for application org-apache-spark-examples-sparkpi-1486079789535 to finish...
2017-02-02 15:57:15 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:16 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:17 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:18 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:19 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:20 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:21 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:22 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:23 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:24 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:25 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:26 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:27 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:28 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:29 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:30 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:31 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:32 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:33 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:34 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:35 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:36 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:37 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:38 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:39 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:40 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:41 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:42 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:43 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:44 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:45 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:46 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:47 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:48 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:49 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:50 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:51 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Running)
2017-02-02 15:57:52 INFO  LoggingPodStatusWatcher:54 - Application status for org-apache-spark-examples-sparkpi-1486079789535 (phase: Succeeded)
2017-02-02 15:57:52 INFO  LoggingPodStatusWatcher:54 - Phase changed, new state:
	 pod name: org-apache-spark-examples-sparkpi-1486079789535
	 namespace: default
	 labels: spark-app-id -> org-apache-spark-examples-sparkpi-1486079789535, spark-app-name -> org.apache.spark.examples.SparkPi, spark-driver -> org-apache-spark-examples-sparkpi-1486079789535
	 pod uid: 38470591-e9a3-11e6-b3d3-0e0cecaa76e2
	 creation time: 2017-02-02T23:56:31Z
	 service account name: default
	 volumes: spark-submission-secret-volume, default-token-fvqq7
	 node name: REDACTED
	 start time: 2017-02-02T23:56:31Z
	 container images: ash211/testrepo:driver-latest3
	 phase: Succeeded
2017-02-02 15:57:52 INFO  Client:54 - Application org-apache-spark-examples-sparkpi-1486079789535 finished.

@ash211 (Author) commented Feb 3, 2017

@mccheah anything else? This PR has been open for about a week, so I'd like to merge pretty soon and only hold it up for things that truly block the merge, not for things that can be added post-merge.

* logging will be disabled.
*/
private[kubernetes] class LoggingPodStatusWatcher(podCompletedFuture: CountDownLatch,
appId: String,

nit: four-space indentation from the leftmost margin here

@mccheah commented Feb 3, 2017

Going to merge this -- I was going to refactor Client soon anyway, and I'll catch the style change there.

@mccheah mccheah merged commit 5ffbac2 into k8s-support-alternate-incremental Feb 3, 2017
@mccheah mccheah deleted the blocking-submit branch February 3, 2017 01:34
ash211 added a commit that referenced this pull request Feb 8, 2017
* Introduce blocking submit to kubernetes by default

Two new configuration settings:
- spark.kubernetes.submit.waitAppCompletion
- spark.kubernetes.report.interval

* Minor touchups

* More succinct logging for pod state

* Fix import order

* Switch to watch-based logging

* Spaces in comma-joined volumes, labels, and containers

* Use CountDownLatch instead of SettableFuture

* Match parallel ConfigBuilder style

* Disable logging in fire-and-forget mode

Which is enabled with spark.kubernetes.submit.waitAppCompletion=false
(default: true)

* Additional log line for when application is launched

* Minor wording changes

* More logging

* Drop log to DEBUG
ash211 added a commit that referenced this pull request Feb 24, 2017
#53 added these config but didn't document them
ash211 added a commit that referenced this pull request Feb 25, 2017
* Document blocking submit calls

#53 added these config but didn't document them

* Update running-on-kubernetes.md
ash211 added a commit to palantir/spark that referenced this pull request Mar 3, 2017
* Document blocking submit calls

apache-spark-on-k8s#53 added these config but didn't document them

* Update running-on-kubernetes.md

(cherry picked from commit 82275ae)
ash211 added a commit that referenced this pull request Mar 8, 2017
* Introduce blocking submit to kubernetes by default

(commit message identical to the Feb 8, 2017 commit above)
ash211 added a commit that referenced this pull request Mar 8, 2017
* Document blocking submit calls

(commit message identical to the Feb 25, 2017 commit above)
foxish pushed a commit that referenced this pull request Jul 24, 2017
* Introduce blocking submit to kubernetes by default

(commit message identical to the Feb 8, 2017 commit above)
foxish pushed a commit that referenced this pull request Jul 24, 2017
* Document blocking submit calls

(commit message identical to the Feb 25, 2017 commit above)
ifilonenko pushed a commit to ifilonenko/spark that referenced this pull request Feb 25, 2019
…8s#53)

* Introduce blocking submit to kubernetes by default

(commit message identical to the Feb 8, 2017 commit above)
puneetloya pushed a commit to puneetloya/spark that referenced this pull request Mar 11, 2019
…8s#53)

* Introduce blocking submit to kubernetes by default

(commit message identical to the Feb 8, 2017 commit above)
puneetloya pushed a commit to puneetloya/spark that referenced this pull request Mar 11, 2019
* Document blocking submit calls

(same commit message as the Feb 25, 2017 commit above)