Trow User Guide

More information is available in the README and Installation instructions.

Persisting Data/Images

If your cluster does not support Persistent Volumes, or you would like to use a different driver (e.g. cephfs), you will need to assign a volume manually. This should be straightforward, but is cluster-specific. Make sure that the volume is writeable by the Trow user (user id 333333 by default). Normally this is taken care of by the fsGroup setting in the securityContext part of the deployment, but this may not work for certain types of volume, e.g. hostPath. In these cases you may need to perform an explicit chown or chmod using the UID of the Trow user.

Backing up the Trow registry can be done by copying the data directory (/data by default).

Proxying other registries (and MutatingWebhook)

Trow can be configured as a proxy cache for other registries by passing the argument --proxy-registry-config-file on start-up. Any repository under f/{alias}/ is then automatically pulled from the registry configured for that alias. For example, if we start Trow with:

# proxy.yaml
registries:
  - alias: docker
    host: registry-1.docker.io
  - alias: my-custom-registry
    host: my_custom_registry.example.com
    username: toto
    password: pass1234

$ trow --proxy-registry-config-file ./proxy.yaml
Starting Trow 0.6.0 on 0.0.0.0:8000
Hostname of this registry (for the MutatingWebhook): "0.0.0.0:8000"
Image validation webhook not configured
Proxy registries configured:
  - docker: registry-1.docker.io
  - my-custom-registry: my_custom_registry.example.com

And then make the following request to the empty registry:

$ docker pull localhost:8443/f/docker/nginx:latest
latest: Pulling from f/docker/nginx
bb79b6b2107f: Already exists
5a9f1c0027a7: Pull complete
b5c20b2b484f: Pull complete
166a2418f7e8: Pull complete
1966ea362d23: Pull complete
Digest: sha256:34f3f875e745861ff8a37552ed7eb4b673544d2c56c7cc58f9a9bec5b4b3530e
Status: Downloaded newer image for localhost:8443/f/docker/nginx:latest
localhost:8443/f/docker/nginx:latest

Trow will keep a cached copy and check for new versions on each pull. The check is done via a HEAD request, which does not count towards the Docker Hub rate limits. If the image cannot be pulled from the upstream registry, the cached version is returned when one is available. This makes the proxy a useful way to mitigate availability issues with upstream registries.
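As a rough illustration of the f/{alias}/ layout described above, the following Python sketch builds the reference you would pull through Trow for a given alias. The helper name proxied_reference and the PROXY_ALIASES table are hypothetical and only mirror the example proxy.yaml; this is not Trow's own code, and it does not attempt to reproduce what the MutatingWebhook does.

```python
# Illustrative helper, not part of Trow: builds an image reference that pulls
# through a Trow proxy, following the f/{alias}/ layout described in this guide.

# Alias -> upstream host, mirroring the example proxy.yaml (assumed values).
PROXY_ALIASES = {
    "docker": "registry-1.docker.io",
    "my-custom-registry": "my_custom_registry.example.com",
}


def proxied_reference(trow_host: str, alias: str, image: str) -> str:
    """Return the reference that pulls `image` from `alias` via Trow."""
    if alias not in PROXY_ALIASES:
        raise KeyError(f"no proxy registry configured for alias {alias!r}")
    # Strip a leading slash so the result never contains "//".
    return f"{trow_host}/f/{alias}/{image.lstrip('/')}"


print(proxied_reference("localhost:8443", "docker", "nginx:latest"))
# prints: localhost:8443/f/docker/nginx:latest
```

The printed reference is the same one used in the docker pull example above.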

The helm chart contains a MutatingWebhookConfiguration that will automatically rewrite pod specs to pull through Trow.

Validating Webhook

Configuration

The validating webhook can be configured using the --image-validation-config-file argument like so:

# validation.yaml
default: Deny
allow:
  - my-trow-domain.trow.io/
  - k8s.gcr.io/
deny:
  - my-trow-domain.trow.io/my-secret-image

$ ./trow --image-validation-config-file ./validation.yaml
Starting Trow 0.6.0 on 0.0.0.0:8000
Hostname of this registry (for the MutatingWebhook): "0.0.0.0"
Image validation webhook configured:
  Default action: Deny
  Allowed prefixes: ["my-trow-domain.trow.io/", "k8s.gcr.io/"]
  Denied prefixes: ["my-trow-domain.trow.io/my-secret-image"]
Proxy registries not configured
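To make the interaction between the allow list, the deny list and the default easier to reason about, here is a small Python sketch of prefix-based validation. It assumes that the longest matching prefix wins and that a tie is denied; that reading is suggested by the example config (a deny entry nested under an allowed domain) but is not confirmed by this guide, so treat it as a model rather than Trow's actual logic. ValidationPolicy and decide are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class ValidationPolicy:
    """In-memory stand-in for validation.yaml (illustrative only)."""
    default: str = "Deny"
    allow: list = field(default_factory=list)
    deny: list = field(default_factory=list)


def longest_match(image: str, prefixes: list) -> int:
    """Length of the longest prefix that `image` starts with, or -1 if none."""
    return max((len(p) for p in prefixes if image.startswith(p)), default=-1)


def decide(policy: ValidationPolicy, image: str) -> str:
    """Return "Allow" or "Deny" for an image reference."""
    allowed = longest_match(image, policy.allow)
    denied = longest_match(image, policy.deny)
    if allowed == -1 and denied == -1:
        # Neither explicitly allowed nor denied: fall back to the default.
        return policy.default
    # Assumption: the more specific (longer) prefix wins; ties are denied.
    return "Allow" if allowed > denied else "Deny"


policy = ValidationPolicy(
    default="Deny",
    allow=["my-trow-domain.trow.io/", "k8s.gcr.io/"],
    deny=["my-trow-domain.trow.io/my-secret-image"],
)
```

Under these assumptions, decide(policy, "my-trow-domain.trow.io/app:1") yields "Allow", decide(policy, "my-trow-domain.trow.io/my-secret-image:1") yields "Deny", and an image from an unlisted registry such as my_registry.io/nginx falls through to the default.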

Troubleshooting

If a deployment isn't starting, check the events for its replica set, for example:

kubectl describe rs my-app-844d6db962

If there is a failed create message, the image may have been rejected by Trow's validation. If the message reads like:

Error creating: admission webhook "validator.trow.io" denied the request: my_registry.io/nginx: Image is neither explicitly allowed nor denied (using default behavior)

That means:

  1. The validation webhook is active
  2. my_registry.io/ has not been added to the allow list
  3. The default behavior is configured to "Deny"

Otherwise, if the error reads like:

Error creating: Internal error occurred: failed calling admission webhook "validator.trow.io": Post https://trow.kube-public.svc:443/validate-image?timeout=30s: no endpoints available for service "trow"

Trow probably isn't running and the webhook is configured to Fail on error. You will need to disable the admission webhook (or, with the helm chart, set onWebhookFailure: Ignore) and restart Trow.

Listing Repositories and Tags

Trow implements the OCI Distribution Specification, which includes API methods for listing repositories and tags. Unfortunately, the Docker CLI doesn't support these endpoints, so we need to use a third-party tool. It is possible to use curl, but this gets complicated when dealing with password-protected registries, so we recommend the docker-ls tool.

Using docker-ls is fairly straightforward. For example, to list all repositories in a registry:

docker-ls repositories -u myuser -p mypass -r https://registry.trow.io
requesting list . done
repositories:
- alpine
- one/two
- user1/web
- user2/web

To list all tags for a repository:

docker-ls tags user1/web -u myuser -p mypass -r https://registry.trow.io
requesting list . done
repository: user1/web
tags:
- default
- test

If you want to play with the underlying APIs, the URL for listing repositories is /v2/_catalog and the tags for any given repository can be listed with /v2/<repository_name>/tags/list.
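For experimenting with these endpoints without docker-ls, the sketch below uses only the Python standard library. It assumes a registry that accepts unauthenticated requests and that the responses follow the usual shapes, {"repositories": [...]} for the catalog and {"name": ..., "tags": [...]} for tags; all function names are hypothetical helpers, and registry.trow.io is just the example host from above.

```python
import json
import urllib.request


def catalog_url(registry: str) -> str:
    """URL of the repository listing endpoint."""
    return registry.rstrip("/") + "/v2/_catalog"


def tags_url(registry: str, repository: str) -> str:
    """URL of the tag listing endpoint for one repository."""
    return f"{registry.rstrip('/')}/v2/{repository}/tags/list"


def parse_catalog(body: str) -> list:
    """Extract repository names from a catalog response body."""
    return json.loads(body).get("repositories", [])


def parse_tags(body: str) -> list:
    """Extract tags from a tags/list response body (a null list becomes [])."""
    return json.loads(body).get("tags") or []


def fetch(url: str) -> str:
    """GET a URL and return the body as text (no authentication handled)."""
    with urllib.request.urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8")


# Usage against a reachable, unauthenticated registry (not run here):
#   repos = parse_catalog(fetch(catalog_url("https://registry.trow.io")))
#   tags = parse_tags(fetch(tags_url("https://registry.trow.io", "user1/web")))
```

Keeping URL construction and parsing separate from the network call makes it easy to swap in an authenticated client later.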

The catalog endpoint is still a matter of debate within the OCI and may be replaced in future versions. Because of historical differences and ambiguities in the specification, do not expect different registries to implement this endpoint compatibly.

Multiplatform Builds

Trow has builds for amd64, armv7 and arm64. Images with a release version but no explicit platform, e.g. trow:0.3 or trow:0.3.2, should be multiplatform images that automatically pull the correct variant for the current platform. Images tagged latest or default are currently amd64 only. Images are expected to be published to both GHCR and Docker Hub.

If there's another build you would like to see, please get in contact.

Troubleshooting

Where are the logs?

The first place to look for debugging information is the output of the kubectl describe command. It's worth looking at the output for the deployment, replicaset and pod. Assuming Trow is running in the "trow" namespace:

$ kubectl describe deploy -n trow trow-deploy
$ kubectl describe replicaset -n trow trow-deploy
$ kubectl describe pod -n trow trow-deploy

In particular, look for problems pulling images or with containers crashing.

For the actual application logs try:

$ kubectl logs -n trow trow-deploy-596bf849c8-m7b7l

The ID at the end of your pod name will be different, but you should be able to use autocomplete to get the correct name (hit the tab key after typing "trow-deploy").

If there are no logs or you get output like:

Error from server (BadRequest): container "trow-pod" in pod "trow-deploy-6f6f8fbc6d-rndtd" is waiting to start: PodInitializing

Look at the logs for the init container:

$ kubectl logs -n trow trow-deploy-596bf849c8-m7b7l -c trow-init

I can't push images into Trow

If you can connect to Trow successfully but uploads then fail with manifest invalid or Internal Server Error, Trow may be having trouble saving to the filesystem. First check the logs (see "Where are the logs?" above). If they point to a filesystem problem, check that there is free space on the volume and that the Trow user has the correct privileges to write to it. In particular, verify that the settings for the volume match the UID of the Trow user (333333 by default):

# ...
    spec:
      containers:
      - name: trow
      # ...
      securityContext:
        runAsUser: 333333
        runAsGroup: 333333
        fsGroup: 333333

My pod can't pull images from Trow

If you get the error:

Error creating: Internal error occurred: failed calling admission webhook "validator.trow.io": Post https://trow.kube-public.svc:443/validate-image?timeout=30s: no endpoints available for service "trow"

Trow probably isn't running and the webhook is configured to Fail on error. You will need to disable the admission webhook (or, with the helm chart, set onWebhookFailure: Ignore) and restart Trow.

Permission Denied Errors in Logs

If you get errors such as { code: 13, kind: PermissionDenied, message: "Permission denied" }, it is possible that Trow can't write to the data directory. Please verify that the data volume is accessible and writeable by the Trow user. If not, please use chown or chmod to give the Trow user access. As the Trow user only exists in the container, you will likely need to use its equivalent UID, e.g. chown 333333 /data.
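Before reaching for chown, it can help to confirm what the ownership and mode bits on the data directory actually permit. The Python sketch below applies the standard POSIX owner/group/other check for a given UID and GID. It is a simplified model: it ignores supplementary groups, ACLs, root privileges and read-only mounts, and is_writable_by is a hypothetical helper, not a Trow tool.

```python
import os
import stat

TROW_UID = 333333  # default Trow user id mentioned in this guide


def is_writable_by(path: str, uid: int, gid: int) -> bool:
    """Check the write bit that applies to (uid, gid) on `path`.

    Follows the POSIX class order: owner first, then group, then other.
    """
    info = os.stat(path)
    mode = info.st_mode
    if info.st_uid == uid:
        return bool(mode & stat.S_IWUSR)
    if info.st_gid == gid:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# Usage on the node or in a debug container where the volume is mounted,
# assuming the default /data directory and a GID equal to the UID:
#   is_writable_by("/data", TROW_UID, TROW_UID)
```

If this returns False for the Trow UID, a chown or chmod on the volume, as described above, is the likely fix.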

Errors When Pushing or Pulling Large Images

If you get errors when dealing with large images, but not with smaller images, you may need to configure your ingress to explicitly allow large transfers. For example, if you are using the NGINX ingress, add the following annotation to the Ingress resource (a value of "0" disables the request body size limit):

nginx.ingress.kubernetes.io/proxy-body-size: "0"