Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow agent container RunAsUser/Group to be specified via annotations #60

Merged

Conversation

joemiller
Copy link
Contributor

@joemiller joemiller commented Jan 25, 2020

This PR adds two new annotations:

  • vault.hashicorp.com/agent-run-as-user
  • vault.hashicorp.com/agent-run-as-group

Which allows overriding the default userid (100) and groupid (1000) that the Vault Agent init and sidecar containers will run as.

This complements the vault.hashicorp.com/agent-inject-command- annotation in #57

An example use case for these PRs is described in #56

TL;DR -- Given an example such as running an nginx container with dynamic TLS certs retrieved from Vault, it is necessary to send a SIGHUP to nginx to load the new certs. This requires specifying a command to run after new certs are rendered (#57) and that command will need to be run from the Vault Agent sidecar as the same userID that nginx is running in the main/app container (addressed in this PR).

Copy link
Contributor

@jasonodonnell jasonodonnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great @joemiller ! I think this would also make a good candidate for environment variables so an operator could globally override uid/gid. Thoughts?

@joemiller
Copy link
Contributor Author

Yeah, makes sense. Will do

@joemiller
Copy link
Contributor Author

resolved merge conflicts.
still TODO: env vars

@joemiller
Copy link
Contributor Author

joemiller commented Feb 20, 2020

@jasonodonnell I'm looking at the env var code in subcommand/injector/flags.go and I see that most (all?) env vars are also configurable via -flags as well as env vars. Just to confirm, we would want to implement both here too?

@jasonodonnell
Copy link
Contributor

jasonodonnell commented Feb 20, 2020

@joemiller Yes! It's a duplication of functionality but flags existed before envs, but envs are a bit easier to configure in Kube vs running the binary.

@joemiller
Copy link
Contributor Author

@jasonodonnell I'm following the pattern used to propagate AGENT_INJECT_VAULT_IMAGE env var thru the codebase and I think I'll need to increase the arity on the agent.Init() func to include two new vars - user and group:

https://github.com/hashicorp/vault-k8s/blob/master/agent-inject/agent/annotations.go#L137

Are you ok if I proceed this way? Or refactor? perhaps modifying agent.Init() to take a struct of default annotations/options instead?

@jasonodonnell
Copy link
Contributor

@joemiller Yeah I'd love that actually, it's becoming quite big as is.

@joemiller
Copy link
Contributor Author

joemiller commented Feb 26, 2020

@jasonodonnell Updated to allow setting user/group at a global level via env vars: AGENT_INJECT_RUN_AS_{USER,GROUP}.

I refactored the code from the original commit quite a bit for this. I chose to represent the user/group as strings for the purpose of -flags and env vars, ultimately converting to int64's just before usage in the Agent. This simplified a lot of the testing and parsing and for allowing a zero value of "0" that would result in using the DefaultUser/Group instead of accidentally running as root.

Also modified the agent.Init() func to take an AgentConfig{} struct per discussion above in order to avoid increasing the arity of the func. Open to suggestions on a better name for this struct.

It looks like I have quite a few files to resolve conflicts before it is mergeable but the code itself should be ready for review now.

@joemiller
Copy link
Contributor Author

Ok, I think this is ready now. There were a significant amount of conflicts to resolve from the merge of the revokeOnShutdown work that touched all of the same parts of code as this PR (lesson learned: be fast ;) )

Copy link
Contributor

@jasonodonnell jasonodonnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joemiller This looks great! I'm currently traveling but will return in a few days to functionally test and do a final review. Thanks for the refactor!

@jasonodonnell
Copy link
Contributor

Hi @joemiller, just took a look at this. The code looks OK but I'm having issues with functional testing.

First, when specifying a non-vault user, the following appears when trying to sink the token:

Error creating file sink: error during write check: error opening temp file in dir /home/vault for writing: open /home/vault/.vault-token.tmp.b405fff6: permission denied

Next, although I set a custom group, only runAsUser appears (runAsGroup has no value).

Here's a complicated example I am using to test this:

spec:
  template:
    metadata:
      annotations:
        # AGENT INJECTOR SETTINGS
        vault.hashicorp.com/agent-run-as-user: "2000"
        vault.hashicorp.com/agent-run-as-group: "100"
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/secret-volume-path: "/etc/test"
        vault.hashicorp.com/agent-inject: "true"
        vault.hashicorp.com/agent-inject-status: "update"
        # DATABASE SECRET
        vault.hashicorp.com/secret-volume-path-db-creds: "/vault/secretfoo"
        vault.hashicorp.com/agent-inject-secret-db-creds: "database/creds/db-app"
        vault.hashicorp.com/agent-inject-template-db-creds: |
          {{- with secret "database/creds/db-app" -}}
          postgres://{{ .Data.username }}:{{ .Data.password }}@postgres.postgres.svc:5432/wizard?sslmode=disable
          {{- end }}
        # TLS SERVER CERTIFICATE
        vault.hashicorp.com/agent-inject-secret-server.cert: "pki/issue/hashicorp-com"
        vault.hashicorp.com/agent-inject-template-server.cert: |
          {{- with secret "pki/issue/hashicorp-com" "common_name=www.hashicorp.com" -}}
          {{ .Data.certificate }}
          {{- end }}
        # TLS SERVER KEY
        vault.hashicorp.com/agent-inject-secret-server.key: "pki/issue/hashicorp-com"
        vault.hashicorp.com/agent-inject-template-server.key: |
          {{- with secret "pki/issue/hashicorp-com" "common_name=www.hashicorp.com" -}}
          {{ .Data.private_key }}
          {{- end }}
        # TLS CA CERTIFICATE
        vault.hashicorp.com/agent-inject-secret-ca.cert: "pki/issue/hashicorp-com"
        vault.hashicorp.com/agent-inject-template-ca.cert: |
          {{- with secret "pki/issue/hashicorp-com" "common_name=www.hashicorp.com" -}}
          {{ .Data.issuing_ca }}
          {{- end }}
        # VAULT SETTINGS
        vault.hashicorp.com/role: "db-app"
        vault.hashicorp.com/tls-secret: "tls-test-client"
        vault.hashicorp.com/client-cert: "/vault/tls/client.crt"
        vault.hashicorp.com/client-key: "/vault/tls/client.key"
        vault.hashicorp.com/ca-cert: "/vault/tls/ca.crt"
        vault.hashicorp.com/agent-inject-secret-token: "auth/token/lookup-self"
        vault.hashicorp.com/agent-inject-template-token: |
          {{- with secret "auth/token/lookup-self" -}}{{.Data.id}}{{- end }}

@joemiller
Copy link
Contributor Author

@jasonodonnell I'll look at this today, as soon as I can.

@joemiller
Copy link
Contributor Author

joemiller commented Mar 2, 2020

@jasonodonnell Ok, I can re-create the issue of being unable to write the token when the user is neither '0' nor the default vault user in the vault docker image.

I'm going to look at seeing if it is possible to set the ownership on the /home/vault mount to the same user as is specified by run-as-{user,group}

I am also seeing the 'group' GID is not being set in the container:

# vault-injector (webhook) deploy:
         - name: AGENT_INJECT_RUN_AS_USER
           value: "0"
         - name: AGENT_INJECT_RUN_AS_GROUP
           value: "1000"
# logs from vault-agent pod:
2020/03/02 23:05:49.751264 [INFO] (child) spawning: /bin/sh -c pkill -HUP nginx || true; whoami; id
root
uid=0(root) gid=0(root) groups=0(root),1(bin),2(daemon),3(sys),4(adm),6(disk),10(wheel),11(floppy),20(dialout),26(tape),27(video)

I would expect gid=1000 in this case. Investigating.

@joemiller
Copy link
Contributor Author

RE: group not being set. I am wondering if this is a behavior in k8s. I'm using the example pod from: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
which specifies:

  securityContext:
    runAsUser: 1000
    runAsGroup: 3000
    fsGroup: 2000

And the observed outcome is different than what is illustrated in the docs:

$ ku exec -it security-context-demo sh                             
/ $ id
uid=1000 gid=0(root) groups=2000

@joemiller
Copy link
Contributor Author

Ah, I think I see -- runAsGroup is feature-gated in 1.13, full available in 1.14. I'm testing against a 1.13 (GKE) cluster and don't have a 1.14 cluster readily available for testing. @jasonodonnell are you testing against 1.13 too?

rancher/rancher#19901 (comment)

@joemiller
Copy link
Contributor Author

And I think we can fix the permissions issue by setting fsGroup to the value of runAsGroup, however fsGroup is only available in the PodSecurityContext{} and not the container SecurityContext{} that we're currently setting so this will require a bit more code to modify the pod's securitycontext

@tvoran tvoran added enhancement New feature or request injector Area: mutating webhook service labels Mar 3, 2020
@joemiller
Copy link
Contributor Author

joemiller commented Mar 3, 2020

Ok, another update.

Perhaps due to my misunderstanding, but now I'm starting to grok things a little more.

So, both the init container and the sidecar container store the token in /home/vault which is not shared between the pods. Since this dir is not a volume mount, the pod fsGroup setting doesn't help here.

I'm wondering if we should do the following:

  • Only set runAsUser/runAsGroup on the sidecar container and never on the init container.
  • chown /home/vault as part of the sidecar container's args

I think this simplifies things and will make it work how we would expect. And since the init container is guaranteed to run before any regular containers it doesn't make sense to support vault.hashicorp.com/agent-inject-command- on the initContainer's config anyway since there is nothing to send a signal to. The case where this might not be true is if you override the image used by the initContainer to include some tools/scripts that you actually want to be executed by the initContainer.

EDIT If the user would like to set an fsGroup, for example to ensure their main container and the sidecar container are using the same gid on any mounted volumes including the /vault/secrets volume, then they could set that in their pod Spec, and this could be documented. I initially thought about setting fsGroup from the injector but since this is a pod-level setting that affects all containers and all volumes it seems best to let the user manage this.

EDIT 2 I realized we'll have the same problem writing to /home/vault/.vault-token in the sidecar container when the user is neither vault(1000) (default) or root(0). Need to think about this a bit more. We could inject a command to chown /home/vault in the case of runAsUser/Group being overriden. Since we're already setting the container args to:

"echo ${VAULT_CONFIG?} | base64 -d > /tmp/config.json && vault agent -config=/tmp/config.json"

that could be extended to do the chown /home/vault;

chown uid:gid /home/vault ; "echo ${VAULT_CONFIG?} | base64 -d > /tmp/config.json && vault agent -config=/tmp/config.json"

thoughts?

@jasonodonnell
Copy link
Contributor

@joemiller

Ah, I think I see -- runAsGroup is feature-gated in 1.13, full available in 1.14. I'm testing against a 1.13 (GKE) cluster and don't have a 1.14 cluster readily available for testing. @jasonodonnell are you testing against 1.13 too?

Indeed I was, this makes sense!

And I think we can fix the permissions issue by setting fsGroup to the value of runAsGroup, however fsGroup is only available in the PodSecurityContext{} and not the container SecurityContext{} that we're currently setting so this will require a bit more code to modify the pod's securitycontext

We likely shouldn't go mucking with pod fsGroup as it could cause issues with other containers in the pod.

Here's what I think should work: we can make /home/vault an emptyDir volume. This should get around issues with permissions. Thoughts?

@joemiller
Copy link
Contributor Author

I’ll try that

@joemiller
Copy link
Contributor Author

joemiller commented Mar 4, 2020

@jasonodonnell After a quick local POC I think the emptyDir volume for /home/vault will work. I need to polish up the code and update tests. I'll work on that today, though I have a busy day so might be a little delayed.

In summary, it should look like:

  • In both the init container and sidecar container:
  • set the container's securityContext user/group from (in order) pod annotations / ENV vars on the vault-injector / defaults
  • Both init and sidecar will share an emptyDir /home/vault where they will write their token (/home/vault/.vault-token).

@joemiller
Copy link
Contributor Author

joemiller commented Mar 5, 2020

@jasonodonnell Pushed a new (separate) commit implementing the emptyDir for /home/vault - 6358247

Made it a separate commit for review purposes. It could be squashed.

I chose 'token volume' as the name for variables and func names. Could change it to 'home volume' or something else if desired.

@joemiller joemiller force-pushed the user-group-runas-annotations branch from 4335a23 to b666a9b Compare March 5, 2020 04:31
@joemiller joemiller force-pushed the user-group-runas-annotations branch from b666a9b to a40d0df Compare March 5, 2020 14:25
@jasonodonnell
Copy link
Contributor

Sorry we couldn't get this into v0.3.0, but this will definitely be in the next release!

@joemiller joemiller force-pushed the user-group-runas-annotations branch from ed36e4d to 6358247 Compare March 5, 2020 20:42
@joemiller
Copy link
Contributor Author

@jasonodonnell Awesome. I rebased to bring in the latest master (agent.go, vaultImage tag update)

@joemiller
Copy link
Contributor Author

@jasonodonnell I looked around to see if there were any docs for the injector that I should try to update to include these annotations but I couldn't find anything. I'm happy to send PRs to document this and #57 if that is available.

@joemiller joemiller force-pushed the user-group-runas-annotations branch from 6358247 to de7ccae Compare March 6, 2020 16:42
@jasonodonnell
Copy link
Contributor

@joemiller Thanks Joe, I can update them once this is merged. I plan to review this tomorrow!

@sbeaulie
Copy link
Contributor

What's the update on this? I came accross the need for it for secrets populated in a .ssh directory since SSH has strict requirements for the file ownership and permissions. The files need to be user owned, and the directory to be 700 perms and files 600. The perms are supported via annotations, but not the owner uid/gid yet.

Copy link
Contributor

@jasonodonnell jasonodonnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for the contribution!

@jasonodonnell jasonodonnell merged commit 125347d into hashicorp:master Mar 17, 2020
RemcoBuddelmeijer pushed a commit to RemcoBuddelmeijer/vault-k8s that referenced this pull request Feb 22, 2022
…hashicorp#60)

* allow agent container RunAsUser/Group to be specified via annotations

* set run-as-{user,group} globally in the injector via environment variables

* emptyDir volume for token sink (/home/vault) in init and sidecar containers
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request injector Area: mutating webhook service
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants