This repository has been archived by the owner on Jan 9, 2020. It is now read-only.
forked from apache/spark
Python Bindings for launching PySpark Jobs from the JVM #364
Merged
Changes from 24 commits
29 commits
d3cf58f Adding PySpark Submit functionality. Launching Python from JVM (ifilonenko)
bafc13c Addressing scala idioms related to PR351 (ifilonenko)
59d9f0a Removing extends Logging which was necessary for LogInfo (ifilonenko)
4daf634 Refactored code to leverage the ContainerLocalizedFileResolver (ifilonenko)
51105ca Modified Unit tests so that they would pass (ifilonenko)
bd30f40 Modified Unit Test input to pass Unit Tests (ifilonenko)
720776e Setup working environent for integration tests for PySpark (ifilonenko)
4b5f470 Comment out Python thread logic until Jenkins has python in Python (ifilonenko)
1361a26 Modifying PythonExec to pass on Jenkins (ifilonenko)
0abc3b1 Modifying python exec (ifilonenko)
0869b07 Added unit tests to ClientV2 and refactored to include pyspark submis… (ifilonenko)
38d48ce Merge branch 'branch-2.1-kubernetes' of https://github.com/apache-spa… (ifilonenko)
9bf7b9d Modified unit test check (ifilonenko)
4561194 Scalastyle (ifilonenko)
2cf96cc Merged with PR 348 and added further tests and minor documentation (ifilonenko)
eb1079a PR 348 file conflicts (ifilonenko)
4a6b779 Refactored unit tests and styles (ifilonenko)
363919a further scala stylzing and logic (ifilonenko)
9c7adb1 Modified unit tests to be more specific towards Class in question (ifilonenko)
0388aa4 Removed space delimiting for methods (ifilonenko)
6acab03 Merge branch 'branch-2.1-kubernetes' of https://github.com/apache-spa… (ifilonenko)
5499f6d Submission client redesign to use a step-based builder pattern. (mccheah)
e103225 Don't add the init-container step if all URIs are local. (mccheah)
4533df2 Python arguments patch + tests + docs (ifilonenko)
cc289f1 Revert "Python arguments patch + tests + docs" (mccheah)
c267286 Revert "Don't add the init-container step if all URIs are local." (mccheah)
8045c94 Revert "Submission client redesign to use a step-based builder pattern." (mccheah)
41b6b8c style changes (ifilonenko)
923f956 space for styling (ifilonenko)
@@ -180,6 +180,32 @@ The above mechanism using `kubectl proxy` can be used when we have authenticatio
kubernetes-client library does not support. Authentication using X509 Client Certs and OAuth tokens
is currently supported.

### Running PySpark

Running PySpark on Kubernetes leverages the same spark-submit logic when launching on Yarn and Mesos.
Python files can be distributed by including, in the conf, `--py-files`

Below is an example submission:

```
bin/spark-submit \
  --deploy-mode cluster \
  --master k8s://http://127.0.0.1:8001 \
  --kubernetes-namespace default \
  --conf spark.executor.memory=500m \
  --conf spark.driver.memory=1G \
  --conf spark.driver.cores=1 \
  --conf spark.executor.cores=1 \
  --conf spark.executor.instances=1 \
  --conf spark.app.name=spark-pi \
  --conf spark.kubernetes.driver.docker.image=spark-driver-py:latest \
  --conf spark.kubernetes.executor.docker.image=spark-executor-py:latest \
  --conf spark.kubernetes.initcontainer.docker.image=spark-init:latest \
  --py-files local:///opt/spark/examples/src/main/python/sort.py \
  local:///opt/spark/examples/src/main/python/pi.py 100
```

## Dynamic Executor Scaling

Spark on Kubernetes supports Dynamic Allocation with cluster mode. This mode requires running

Review comment on the `--py-files local:///opt/spark/examples/src/main/python/sort.py` line: I know this is unrelated, but so happy to have local support :)
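The example submission passes a single file to `--py-files`. Upstream `spark-submit` takes that option as one comma-separated value, so a submission with several Python dependencies could plausibly be assembled as in the sketch below. This is a dry run that only prints the command. The `helpers.py` and `libs.zip` paths and the `join_by_comma` helper are hypothetical and not part of this PR, and whether this branch handles multiple entries the same way is an assumption.

```shell
#!/bin/sh
# Sketch only: build a comma-separated --py-files value and print the command.
# The helpers.py and libs.zip paths are hypothetical placeholders.
set -eu

# Join all arguments with commas; --py-files takes one comma-separated value.
join_by_comma() {
  (IFS=','; printf '%s' "$*")
}

py_files=$(join_by_comma \
  "local:///opt/spark/examples/src/main/python/sort.py" \
  "local:///opt/app/deps/helpers.py" \
  "local:///opt/app/deps/libs.zip")

# Dry run: echo the command instead of executing it.
echo bin/spark-submit \
  --deploy-mode cluster \
  --master k8s://http://127.0.0.1:8001 \
  --py-files "$py_files" \
  local:///opt/spark/examples/src/main/python/pi.py 100
```

Replacing `echo` with the real invocation, and adding the `--conf` image settings from the example submission, would turn the dry run into an actual submission.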
24 changes: 24 additions & 0 deletions
...core/src/main/scala/org/apache/spark/deploy/kubernetes/PodWithDetachedInitContainer.scala
@@ -0,0 +1,24 @@
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements. See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License. You may obtain a copy of the License at
 *
 * http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
package org.apache.spark.deploy.kubernetes

import io.fabric8.kubernetes.api.model.{Container, Pod}

private[spark] case class PodWithDetachedInitContainer(
    pod: Pod,
    initContainer: Container,
    mainContainer: Container)
No need to check if it's empty?