Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run agent + chromium as non-root user #965

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

Conversation

The-9880
Copy link
Contributor

@The-9880 The-9880 commented Oct 29, 2024

Closes #847

This might be a little naive, please keep me honest. cc @mem @nadiamoe

Looking for feedback on the following

  • Confirm: version this as a breaking change?
  • Can we make this more restrictive?
  • Q about k8s securityContext at the bottom of this description.


Summary:

  • Sets up a non-root user for both the release and with-browser build targets of the Dockerfile.
  • Updates the scratch/tarball-based images to do the same.
  • Tested every protocol check config I could attempt (IPv4, IPv6, DNS, traceroute, etc.)
  • Tested scripted and browser checks, seem to work.
  • Tested on Docker directly as well as K8s.

K8s securityContext

Here's the toy YAML that I used to run this in my local cluster:

apiVersion: v1
kind: Pod
metadata:
  name: agent
  labels:
    app.kubernetes.io/name: agent
spec:
  securityContext:
  containers:
    - name: agent-tip
      image: only-agent
      imagePullPolicy: Never
      ports:
        - containerPort: 4050
          name: http-metric
      args:
        - --api-server-address=$(API_SERVER)
        - --api-token=$(API_TOKEN)
        - --verbose=true
      env:
        - name: API_TOKEN
          value: "<API_TOKEN>"
        - name: API_SERVER
          value: "synthetic-monitoring-grpc-us-east-0.grafana.net:443"
      securityContext:
        runAsUser: 12345
        runAsGroup: 12345
        runAsNonRoot: true
        # allowPrivilegeEscalation: false
        capabilities:
          drop: ["all"]
          add: ["NET_RAW"]

The issue with this securityContext:

securityContext:
  runAsUser: 12345
  runAsGroup: 12345
  runAsNonRoot: true
  # allowPrivilegeEscalation: false
  capabilities:
    drop: ["all"]
    add: ["NET_RAW"]

Is that if I uncomment allowPrivilegeEscalation: false then I run into the following error:

listen ip4:icmp 0.0.0.0: socket: operation not permitted

Which, I guess, is from the default behaviour of listening on localhost:4050 for the /metrics endpoint?

@The-9880 The-9880 force-pushed the docker-non-root branch 2 times, most recently from d4f4cf2 to c92087a Compare January 30, 2025 23:50
@The-9880 The-9880 marked this pull request as ready for review January 31, 2025 00:21
@The-9880 The-9880 requested a review from a team as a code owner January 31, 2025 00:21
Copy link
Member

@nadiamoe nadiamoe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for taking a stab at this! Left a small comment about multi-layering.

Dockerfile Outdated
ADD --chmod=0555 https://github.com/grafana/xk6-sm/releases/download/v0.0.3-pre/sm-k6-${TARGETOS}-${TARGETARCH} /usr/local/bin/sm-k6
COPY dist/${HOST_DIST}/synthetic-monitoring-agent /usr/local/bin/synthetic-monitoring-agent
COPY scripts/pre-stop.sh /usr/local/lib/synthetic-monitoring-agent/pre-stop.sh
RUN apk --no-cache add libcap
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be extracted to a separate stage, so we don't end up with libcap on the final image (or have the need to apk del it).
See as an example this: https://github.com/grafana/sm-k6-runner/blob/6f1c47c5a1760aa4dd7e5ad301297b1f8e2ae65b/Dockerfile#L24
Caps stick with the file so it can be COPYed just fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, thanks for the suggestion!

@nadiamoe
Copy link
Member

nadiamoe commented Feb 1, 2025

Confirm: version this as a breaking change?

As usual, it depends on what we consider part of our API. Let me answer with a question: If you try to run the new image, both in Docker and K8s, with the securityContext we used before in the examples, does it work? If it does, I'd say feature. If it doesn't, I'd say breaking.

Can we make this more restrictive?

I'm not aware of any other way, but I may be wrong here :)

Is that if I uncomment allowPrivilegeEscalation: false then I run into the following error:

Yes, that is "expected". allowPrivilegeEscalation: false actually negates file-based capabilities, so it cancels out the setcap command: https://github.com/grafana/sm-k6-runner/blob/6f1c47c5a1760aa4dd7e5ad301297b1f8e2ae65b/README.md?plain=1#L113

@The-9880 The-9880 requested a review from nadiamoe February 3, 2025 01:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Run agent and chromium as non-root
2 participants