idirze/spark-images

ci Release License Apache2

A collection of Apache Spark Docker images for the OKDP platform.

Currently, the images are built from the Apache Spark project's binary distribution. This requirement may evolve toward building them from source code.

The image relationships are described by the following diagram:

| Image | Description |
| --- | --- |
| JRE | The JRE LTS base image supported by Apache Spark, depending on the Spark version. This includes Java 11/17/21. Check the reference versions or the Apache Spark website for more information. |
| spark-base | The Apache Spark base image with the official Spark binaries (Scala/Java) and without OKDP extensions. |
| spark | The Apache Spark image with the official Spark binaries (Scala/Java) and OKDP extensions. |
| spark-py | The Apache Spark image with the official Spark binaries (Scala/Java), OKDP extensions and Python support. |
| spark-r | The Apache Spark image with the official Spark binaries (Scala/Java), OKDP extensions and R support. |

Tagging

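The project builds the images with long-form tags. Each tag encodes a combination of mutually compatible component versions.

There are multiple tag levels. Which one to use depends on the stability and reproducibility you need: a tag without a build date or release version is re-pointed when the image is rebuilt, while a tag with both a build date and a release version identifies one specific build.

The images are pushed to the OKDP quay.io repository with the following tags: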

| Images | Tags |
| --- | --- |
| spark-base, spark | `spark-<SPARK_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>` |
| | `spark-<SPARK_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<BUILD_DATE>` |
| | `spark-<SPARK_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<RELEASE_VERSION>` |
| | `spark-<SPARK_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<BUILD_DATE>-<RELEASE_VERSION>` |
| spark-py | `spark-<SPARK_VERSION>-python-<PYTHON_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>` |
| | `spark-<SPARK_VERSION>-python-<PYTHON_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<BUILD_DATE>` |
| | `spark-<SPARK_VERSION>-python-<PYTHON_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<RELEASE_VERSION>` |
| | `spark-<SPARK_VERSION>-python-<PYTHON_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<BUILD_DATE>-<RELEASE_VERSION>` |
| spark-r | `spark-<SPARK_VERSION>-r-<R_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>` |
| | `spark-<SPARK_VERSION>-r-<R_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<BUILD_DATE>` |
| | `spark-<SPARK_VERSION>-r-<R_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<RELEASE_VERSION>` |
| | `spark-<SPARK_VERSION>-r-<R_VERSION>-scala-<SCALA_VERSION>-java-<JAVA_VERSION>-<BUILD_DATE>-<RELEASE_VERSION>` |

Note

  1. <RELEASE_VERSION> corresponds to the GitHub release version or git tag without the leading v, e.g. 1.0.0.

  2. <BUILD_DATE> corresponds to the image build date in YYYY-MM-DD format. The latest release tag is rebuilt every week.
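The four tag levels can be illustrated with a short Python sketch. It is not part of this project: the function name `build_tags` and its parameters are hypothetical, and it only assembles strings according to the formats listed above.

```python
from typing import List, Optional


def build_tags(spark: str, scala: str, java: str,
               build_date: str, release: str,
               python: Optional[str] = None,
               r: Optional[str] = None) -> List[str]:
    """Return the four tag levels, from least to most specific.

    Hypothetical helper for illustration only; the real tags are
    produced by the project's own build pipeline.
    """
    if python and r:
        raise ValueError("a tag carries either a Python or an R version, not both")

    parts = [f"spark-{spark}"]
    if python:
        parts.append(f"python-{python}")   # spark-py images
    if r:
        parts.append(f"r-{r}")             # spark-r images
    parts += [f"scala-{scala}", f"java-{java}"]
    base = "-".join(parts)

    # <RELEASE_VERSION> is the git tag without its leading "v".
    if release.startswith("v"):
        release = release[1:]

    return [
        base,
        f"{base}-{build_date}",
        f"{base}-{release}",
        f"{base}-{build_date}-{release}",
    ]


if __name__ == "__main__":
    for tag in build_tags("3.3.4", "2.12", "17", "2024-04-04", "v1.0.0",
                          python="3.10"):
        print(tag)
```

The first tag in the list is the least specific and the last one pins a single build.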

An example of a spark-py image with a long-form tag that includes the compatible Spark/Java/Scala/Python versions, a build date and a release version:

quay.io/okdp/spark-py:spark-3.3.4-python-3.10-scala-2.12-java-17-2024-04-04-1.0.0
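Going the other way, a long-form tag can be split back into its components. The following is a minimal sketch, assuming the tag formats listed in the table above and purely numeric version strings; `TAG_PATTERN` and `parse_tag` are hypothetical names, not part of this project.

```python
import re
from typing import Dict, Optional

# Assumed grammar, derived from the documented tag formats:
# python/r, build date and release version are all optional.
TAG_PATTERN = re.compile(
    r"^spark-(?P<spark>\d+(?:\.\d+)*)"
    r"(?:-python-(?P<python>\d+(?:\.\d+)*))?"
    r"(?:-r-(?P<r>\d+(?:\.\d+)*))?"
    r"-scala-(?P<scala>\d+(?:\.\d+)*)"
    r"-java-(?P<java>\d+)"
    r"(?:-(?P<build_date>\d{4}-\d{2}-\d{2}))?"
    r"(?:-(?P<release>\d+\.\d+\.\d+))?$"
)


def parse_tag(image_ref: str) -> Dict[str, Optional[str]]:
    """Split an image reference or bare tag into its version components.

    Components that are absent from the tag are returned as None.
    """
    # Keep only the part after the last ":" (the whole string if there is none).
    tag = image_ref.rpartition(":")[2]
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"not a recognised long-form tag: {tag!r}")
    return match.groupdict()


if __name__ == "__main__":
    ref = ("quay.io/okdp/spark-py:"
           "spark-3.3.4-python-3.10-scala-2.12-java-17-2024-04-04-1.0.0")
    for name, value in parse_tag(ref).items():
        print(f"{name}: {value}")
```

A parser like this could be used to check that a deployment pins a build date and release version rather than a floating tag.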

Alternatives
