This is the code that builds an opinionated base image for installing applications that run in the AWS Kinesis Client Library for Python.
This product and its documentation are intended for users who have a basic working understanding of Docker, Kinesis, and Python packaging. However, if you are confused, please feel free to create an issue in this project describing your confusion, and I'll attempt to clarify.
Docker images are often abused by being built more from scratch than necessary, because base images are a little too primitive to deliver working software on their own. The official Docker images contain either a completely finished service, such as a database, or a build environment, such as python or, even more generically, ubuntu. As a result, we all waste a lot of time and energy running package managers and copying files from a GitHub repo into Docker images to build up the same base images over and over, when all we really want is to install a Python package into its intended runtime environment.
This product aims to save you time by providing a reliable, standard interface for installing a Python application into a pre-existing KCL for Python environment.
Follow along with this example Dockerfile to see the intended use of this image as a base image:
```Dockerfile
FROM kojiromike/kclpy
# Skip this COPY if your package is on pypi.
COPY my-package.whl .
# Install the copied wheel (or, for a pypi package, install it by name).
RUN pip install ./my-package.whl
# Provide a properties file per the KCL for Python docs.
COPY *.properties .
```
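The KCL reads its configuration from that properties file. As a rough sketch, the key names follow the sample.properties that ships with amazon_kclpy; all values below are illustrative:

```properties
# Illustrative values only; adapt them to your application.
executableName = my_app.py
streamName = demo
applicationName = MyKclApp
AWSCredentialsProvider = DefaultAWSCredentialsProviderChain
processingLanguage = python/3.6
initialPositionInStream = TRIM_HORIZON
regionName = us-east-1
```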
Then build the image. If all goes well, you'll have an image that runs your Python app in the KCL.
The image is built automatically on Docker Hub, but if you want to build it locally, you can run

```sh
make
```

which will build and tag the image, or you can run

```sh
docker build [args] context
```

if you want more control over docker build arguments.
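For example, to produce the tag that docker-compose.yml expects (see below), you might run something like:

```sh
# Build from the repo root and tag as the compose file expects.
docker build -t kojiromike/kclpy .
```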
We can use the sample application that awslabs provides in the AWS Kinesis Client Library for Python repository to test that this image meets its goal, and to provide a reference implementation of sorts. We use localstack to avoid connecting to real AWS endpoints.
Start localstack, then get its port numbers for various services:
```sh
docker-compose up -d
# `docker-compose port` prints HOST:PORT; keep only the port.
kinesis_endpoint=$(docker-compose port kinesis 443)
kinesis_endpoint="https://localhost:${kinesis_endpoint#*:}"
aws --region=us-east-1 --no-verify-ssl --endpoint-url="$kinesis_endpoint" \
    kinesis create-stream --stream-name demo --shard-count 1
```
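With the stream in place, you can feed it test data; for instance (the payload and partition key are arbitrary, and depending on your AWS CLI version you may need to base64-encode `--data`):

```sh
aws --region=us-east-1 --no-verify-ssl --endpoint-url="$kinesis_endpoint" \
    kinesis put-record --stream-name demo --partition-key demo --data "hello"
```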
If you build the image locally, please keep in mind that docker-compose.yml is configured to expect the image to be tagged `kojiromike/kclpy`.
Implement the RecordProcessor interface. Beware that there are two versions (v1 and v2 in amazon_kclpy), which have somewhat different interfaces. You can code to whichever version suits you.
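For orientation, a minimal v2-style processor looks something like the sketch below, modeled on the sample app in the amazon_kclpy repository; the record handling and checkpoint cadence are illustrative:

```python
from amazon_kclpy import kcl
from amazon_kclpy.v2 import processor


class RecordProcessor(processor.RecordProcessorBase):
    """Minimal v2 record processor, after amazon_kclpy's sample app."""

    def initialize(self, initialize_input):
        # Called once per shard before any records are delivered.
        pass

    def process_records(self, process_records_input):
        for record in process_records_input.records:
            print(record.binary_data)
        # Checkpoint so the KCL knows these records were handled.
        process_records_input.checkpointer.checkpoint()

    def shutdown(self, shutdown_input):
        # Checkpoint on a clean shard end, but not if we lost the lease.
        if shutdown_input.reason == 'TERMINATE':
            shutdown_input.checkpointer.checkpoint()


if __name__ == '__main__':
    kcl.KCLProcess(RecordProcessor()).run()
```

Point `executableName` in your properties file at whichever script holds this entry point.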
This image is designed for you to easily install a Python 3 package from a compatible repository. If your package is open source and available on pypi, then you can just

```sh
pip install your-package
```

Pip will run in user mode, so you don't need root privileges. The package will be installed in `/home/user/.local`, but you don't need to do anything special: the environment is configured for user-mode Python.
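If your package lives on a private index instead, pip's usual options still apply; for example (the index URL is hypothetical):

```sh
pip install --index-url https://pypi.example.com/simple/ your-package
```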
It is not possible to override AWS API endpoints in the Python library right now. We make the following compromises to point clients at local endpoints for demonstration and local development purposes (a docker-compose sketch follows the list):
- Use docker links (a deprecated docker feature) to make the real AWS hostnames resolve to the fake local endpoints.
- Disable SSL cert checking so that the fake endpoints' self-signed certificates are accepted.
- Run multiple localstack containers, each with internal port 443 exposed, so that the fake local endpoints are all on port 443, rather than running all AWS services in a single container and letting docker randomize ports.
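As a sketch only (it assumes an older localstack that honors the `SERVICES=name:port` and `USE_SSL` settings; the service name and alias here are illustrative), the relevant parts of such a compose file might look like:

```yaml
version: "2"
services:
  kinesis:
    image: localstack/localstack
    environment:
      SERVICES: "kinesis:443"  # one service per container, on internal port 443
      USE_SSL: "1"             # serve HTTPS with a self-signed certificate
    ports:
      - "443"                  # publish container port 443 on a random host port
  kclpy:
    image: kojiromike/kclpy
    links:
      # deprecated docker links alias the fake endpoint to the real AWS hostname
      - "kinesis:kinesis.us-east-1.amazonaws.com"
```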