Fixup: whitespaces
AnandInguva committed Feb 24, 2022
1 parent 072ead7 commit f504f4f
Showing 2 changed files with 7 additions and 10 deletions.
@@ -192,22 +192,21 @@ Beam offers a way to take a Beam container image and customize it. But if you ha
ENTRYPOINT ["/opt/apache/beam/boot"]
```
> **NOTE**: This example assumes necessary dependencies (in this case, Python 3.7 and pip) have been installed on the existing base image. Installing the Apache Beam SDK into the image will ensure that the image has the necessary SDK dependencies and reduce the worker startup time.
> The version specified in the `RUN` instruction must match the version used to launch the pipeline. <br>
> **Users need to make sure that whatever base image they use has the same Python/Java interpreter version that they use to run the pipeline**.
>**NOTE**: This example assumes necessary dependencies (in this case, Python 3.7 and pip) have been installed on the existing base image. Installing the Apache Beam SDK into the image will ensure that the image has the necessary SDK dependencies and reduce the worker startup time.
>The version specified in the `RUN` instruction must match the version used to launch the pipeline.<br>
>**Users need to make sure that whatever base image they use has the same Python/Java interpreter version that they use to run the pipeline**.

2. [Build](https://docs.docker.com/engine/reference/commandline/build/) and [push](https://docs.docker.com/engine/reference/commandline/push/) the image using Docker.

```
export BASE_IMAGE="apache/beam_python3.7_sdk:2.25.0"
export IMAGE_NAME="myremoterepo/mybeamsdk"
export TAG="latest"
# Optional - pull the base image into your local Docker daemon to ensure
# you have the most up-to-date version of the base image locally.
docker pull "${BASE_IMAGE}"
docker build -f Dockerfile -t "${IMAGE_NAME}:${TAG}" .
```
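
The note in step 1 asks that the base image carry the same Python interpreter version as the environment that launches the pipeline. A minimal sketch of that check follows. It assumes the image follows the `apache/beam_python<major>.<minor>_sdk:<sdk version>` naming seen in `BASE_IMAGE` above; the helper names `parse_beam_image` and `matches_local_interpreter` are hypothetical and not part of Beam.

```python
import re
import sys


def parse_beam_image(image):
    """Return ((major, minor), sdk_version) parsed from a Beam Python image name.

    Assumes the naming pattern 'apache/beam_python3.7_sdk:2.25.0'.
    """
    match = re.search(r"beam_python(\d+)\.(\d+)_sdk:(\S+)$", image)
    if match is None:
        raise ValueError(f"unrecognized Beam image name: {image!r}")
    major, minor, sdk_version = match.groups()
    return (int(major), int(minor)), sdk_version


def matches_local_interpreter(image, version_info=sys.version_info):
    """True when the image's Python major.minor equals the given interpreter's."""
    py_version, _ = parse_beam_image(image)
    return tuple(version_info[:2]) == py_version
```

Calling `matches_local_interpreter("apache/beam_python3.7_sdk:2.25.0")` before launching compares the image name against the running interpreter. A tag such as `latest` parses, but it does not reveal the SDK version, so pinning an explicit version tag makes the check more useful.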

@@ -45,7 +45,7 @@ If your pipeline uses public packages from the [Python Package Index](https://py
The runner will use the `requirements.txt` file to install your additional dependencies onto the remote workers.

**Important:** Remote workers will install all packages listed in the `requirements.txt` file. Because of this, it's very important that you delete non-PyPI packages from the `requirements.txt` file, as stated in step 2. If you don't remove non-PyPI packages, the remote workers will fail when attempting to install packages from sources that are unknown to them.
> **NOTE**: An alternative to `pip check` is to use a library like [pip-tools](https://github.com/jazzband/pip-tools) to compile the `requirements.txt` with all the dependencies required for the pipeline.
> **NOTE**: An alternative to `pip check` is to use a library like [pip-tools](https://github.com/jazzband/pip-tools) to compile the `requirements.txt` with all the dependencies required for the pipeline.
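
The warning above about non-PyPI entries in `requirements.txt` can be partly automated. The sketch below is a heuristic, not an exhaustive parser: it flags editable installs, local paths, and URL or VCS references, and it assumes everything else resolves from PyPI. The function names `is_non_pypi` and `split_requirements` are hypothetical.

```python
def is_non_pypi(line):
    """Heuristically detect a requirement that does not come from PyPI."""
    req = line.strip()
    if not req or req.startswith("#"):
        return False  # blank lines and comments are not requirements
    return (
        req.startswith(("-e ", "--editable", ".", "/", "~"))  # editable or local path
        or "://" in req  # plain URL or VCS URL such as git+https://
        or " @ " in req  # direct reference of the form "name @ url"
    )


def split_requirements(lines):
    """Split requirement lines into (pypi, non_pypi), dropping blanks and comments."""
    pypi, non_pypi = [], []
    for line in lines:
        req = line.strip()
        if not req or req.startswith("#"):
            continue
        (non_pypi if is_non_pypi(req) else pypi).append(req)
    return pypi, non_pypi
```

Running `split_requirements(open("requirements.txt"))` before submitting a job lists the entries that should be reviewed by hand. Note that option lines containing a URL, such as a custom index, are also flagged under this heuristic.
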
## Custom Containers {#custom-containers}

You can pass a [container](https://hub.docker.com/search?q=apache%2Fbeam&type=image) image with all the dependencies that are needed for the pipeline instead of `requirements.txt`. [Follow the instructions on how to run pipeline with Custom Container images](https://beam.apache.org/documentation/runtime/environments/#running-pipelines).
@@ -81,7 +81,6 @@ If your pipeline uses packages that are not available publicly (e.g. packages th

See the [sdist documentation](https://docs.python.org/2/distutils/sourcedist.html) for more details on this command.

## Multiple File Dependencies

Often, your pipeline code spans multiple files. To run your project remotely, you must group these files as a Python package and specify the package when you run your pipeline. When the remote workers start, they will install your package. To group your files as a Python package and make it available remotely, perform the following steps:
@@ -153,5 +152,4 @@ To use pre-building the dependencies from `requirements.txt` on the container im
--docker_registry_push_url <IMAGE_URL>
> To use Docker, the `--prebuild_sdk_container_base_image` should be compatible with Apache Beam Runner. Please follow the [instructions](https://beam.apache.org/documentation/runtime/environments/#building-and-pushing-custom-containers) on how to build a base container image compatible with Apache Beam.
**NOTE**: For now, this feature is available only for the `Dataflow`.
**NOTE**: For now, this feature is available only for the `Dataflow`.
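
As a rough illustration of how the pre-building flags fit together, the sketch below assembles a pipeline argument list in plain Python. The flags `--docker_registry_push_url` and `--prebuild_sdk_container_base_image` come from the text above; `--prebuild_sdk_container_engine`, `--requirements_file`, and `--runner` are included on the assumption that they match the Beam version in use. The helper `prebuild_args` and all example values are hypothetical.

```python
def prebuild_args(requirements_file, push_url, base_image=None, engine="local_docker"):
    """Build command-line arguments for pre-building dependencies into the SDK container."""
    args = [
        "--runner=DataflowRunner",  # the feature is described as Dataflow-only
        f"--requirements_file={requirements_file}",
        f"--prebuild_sdk_container_engine={engine}",
        f"--docker_registry_push_url={push_url}",
    ]
    if base_image:
        # Optional: a custom base image that is compatible with the Beam runner.
        args.append(f"--prebuild_sdk_container_base_image={base_image}")
    return args
```

The resulting list can be passed to a pipeline's option parser together with the usual project and region settings, which are left out here.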
