Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ashwin/patch spark3 jars #17

Open
wants to merge 5 commits into
base: affirm-3.1.2
Choose a base branch
from

Conversation

ash7bhide
Copy link

What changes were proposed in this pull request?

  • Propagate spark jars to executors (bug in current spark version for k8s)
  • Change all "pyspark" references to "pyspark3"
  • Fix k8s auth issue which resulted in 401 between driver and spark-submit client

Copy link

@metra metra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving with some comments. I don't have much context to review this.

// We need this in K8s cluster mode so that we can upload local deps
// via the k8s application, like in cluster mode driver
childClasspath ++= resolvedMavenCoordinates.split(",")
} else {
// In K8s client mode, when in the driver, add resolved jars early as we might need
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: unindent

for (jar <- resolvedMavenCoordinates.split(",")) {
addJarToClasspath(jar, loader)
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure why you added if (isKubernetesClusterModeDriver) to the else branch but I presume it's for a good reason.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's because when spark-submit is in the k8s driver - deployMode == client

val isKubernetesCluster = clusterManager == KUBERNETES && deployMode == CLUSTER
val isKubernetesClient = clusterManager == KUBERNETES && deployMode == CLIENT
 val isKubernetesClusterModeDriver = isKubernetesClient &&
      sparkConf.getBoolean("spark.kubernetes.submitInDriver", false)

@@ -16,4 +16,4 @@
# See the License for the specific language governing permissions and
# limitations under the License.

__version__ = "3.1.2+affirm4"
__version__ = "3.1.2+affirm8"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: what happened to 5 through 7?

<artifactId>jackson-dataformat-yaml</artifactId>
</exclusion>
</exclusions>
</dependency>
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this ported from the mainline?

return serviceHost != null && serviceHost.length > 0
}

def getHomeDir(): String = {
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where's this coming from?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what sets it, but it's always present on k8s containers. We have something similar in ml_pipelines

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant more are you writing this code from scratch or are you porting it from somewhere?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh it's coming from @daggertheog's diff for the spark 2 k8s auth issue

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants