
Vault agent can't authenticate using k8s 1.21 #562

Closed
Marc-Pons opened this issue Jul 1, 2021 · 15 comments
Labels
bug (Something isn't working), vault-server (Area: operation and usage of vault server in k8s)

Comments

@Marc-Pons

Describe the bug
Since I updated kind to 0.11, which uses k8s 1.21 by default, my sidecar vault-agents can't authenticate with Vault, showing the following error:
[ERROR] auth.handler: error authenticating: error="context deadline exceeded" backoff=1s

Using kind 0.11 and k8s 1.20, vault-agent authenticates correctly. The only fix I could find for k8s 1.21 is configuring the auth method with the issuer explicitly specified, as follows:

vault write auth/kubernetes/config \
    token_reviewer_jwt="$SA_JWT_TOKEN" \
    kubernetes_host="$K8S_HOST" \
    kubernetes_ca_cert="$SA_CA_CRT" \
    issuer="https://kubernetes.default.svc.cluster.local"

Disabling issuer validation with disable_iss_validation=true also works. I think the issue may be related to the following change introduced in k8s 1.21:
The ServiceAccountIssuerDiscovery feature has graduated to GA, and is unconditionally enabled.
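
For reference, one quick way to confirm which issuer the cluster actually advertises is to query the discovery endpoint that this feature exposes (a sketch, assuming kubectl is pointed at the cluster with sufficient RBAC):

# Ask the API server's OIDC discovery document (GA in k8s 1.21) which
# issuer it stamps into service account token "iss" claims.
kubectl get --raw /.well-known/openid-configuration | jq -r .issuer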

To Reproduce
Steps to reproduce the behavior:

  1. Install vault helm chart 0.13 in a kind 0.11 cluster using k8s 1.21
  2. Configure an auth method as documented here: https://www.vaultproject.io/docs/auth/kubernetes.
    2.1 Enable auth method: vault auth enable kubernetes
    2.2 Configure the auth endpoint:
vault write auth/kubernetes/config \
    token_reviewer_jwt="<your reviewer service account JWT>" \
    kubernetes_host=https://192.168.99.100:<your TCP port or blank for 443> \
    kubernetes_ca_cert=@ca.crt

2.3 Create a named role:

vault write auth/kubernetes/role/demo \
    bound_service_account_names=vault-auth \
    bound_service_account_namespaces=default \
    policies=default \
    ttl=1h

2.4 Configure Kubernetes by creating a ClusterRoleBinding:

apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: role-tokenreview-binding
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:auth-delegator
subjects:
  - kind: ServiceAccount
    name: vault-auth
    namespace: default
  3. Deploy a pod with vault-agent as a sidecar using the configured auth method.

  4. Check the logs of vault-agent. You should see a periodic error like the following (see the manual login sketch after these steps for a way to isolate it):
    [ERROR] auth.handler: error authenticating: error="context deadline exceeded" backoff=1s
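
To isolate whether the failure is in vault-agent or in the auth method itself, a manual login can mimic what the agent does (a sketch; VAULT_ADDR and the default in-pod token path are assumptions, and the role name demo comes from step 2.3):

# Run from inside the application pod: POST the pod's service account
# JWT to Vault's Kubernetes auth login endpoint.
curl --request POST \
  --data '{"role": "demo", "jwt": "'"$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)"'"}' \
  "$VAULT_ADDR/v1/auth/kubernetes/login"

A 403 here, with an issuer-related message in the server logs, would point at the iss validation affected by the 1.21 token changes.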

Expected behavior
The documented steps for configuring the auth method (https://www.vaultproject.io/docs/auth/kubernetes) should work even though I am using k8s 1.21.

Environment

  • Kubernetes version:
    • Kind 0.11 with k8s 1.21
  • vault-helm version:
    • Vault Helm Chart 0.13

Chart values:

USER-SUPPLIED VALUES:
server:
  extraLabels:
    vault-server: "true"
  ha:
    config: |
      ui = true

      listener "tcp" {
        tls_disable = 1
        address = "[::]:8200"
        cluster_address = "[::]:8201"
      }

      storage "consul" {
        path = "vault/"
        address = "consul-consul-server:8500"
      }

      service_registration "kubernetes" {}
    enabled: true
    replicas: 2
Marc-Pons added the bug label Jul 1, 2021
@jasonodonnell
Contributor

Hi @Marc-Pons, this is related to hashicorp/vault#11953, which may help fix your authentication problems. Hope that helps!

@HakimG

HakimG commented Jul 21, 2021

Hi, we encountered the same error after upgrading AWS EKS to 1.21.

We solved the error by setting the issuer for the Kubernetes authentication method:

kubectl proxy &

# Request a short-lived token via the TokenRequest API, decode the JWT
# payload (the second dot-separated segment), and read its "iss" claim.
# Note: base64 -D is the macOS flag; on Linux use base64 -d.
issuer="$(curl --silent http://127.0.0.1:8001/api/v1/namespaces/default/serviceaccounts/default/token \
  -H "Content-Type: application/json" \
  -X POST \
  -d '{"apiVersion": "authentication.k8s.io/v1", "kind": "TokenRequest"}' \
  | jq -r '.status.token' \
  | cut -d. -f2 \
  | base64 -D | jq -r .iss)"

# $tr_account_token, $k8s_host and $k8s_cacert hold the reviewer JWT, API
# server address, and CA cert from the usual auth method setup.
vault write auth/kubernetes/config token_reviewer_jwt=$tr_account_token kubernetes_host=$k8s_host kubernetes_ca_cert=$k8s_cacert issuer=$issuer

@ArchiFleKs

ArchiFleKs commented Aug 4, 2021

I've tried with the following on EKS 1.21 without success:

vault write auth/kubernetes/config \
  token_reviewer_jwt="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
  kubernetes_host="https://$KUBERNETES_PORT_443_TCP_ADDR:443" \
  kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
  issuer="https://kubernetes.default.svc.cluster.local"

The only way I got it working is by adding disable_iss_validation=true. I'm not sure what I need to put as the issuer. Should https:// be included? And should I use the public AWS API endpoint, like REDACTED.gr3.eu-west-1.eks.amazonaws.com?

Edit: found it thanks to the previous post; the issuer is actually the OIDC provider of the EKS cluster: https://oidc.eks.eu-west-1.amazonaws.com/id/REDACTED
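
For EKS specifically, the issuer can also be read without decoding any tokens (a sketch; the cluster name and region are placeholders):

# Prints https://oidc.eks.<region>.amazonaws.com/id/<id>
aws eks describe-cluster --name <cluster-name> --region eu-west-1 \
  --query 'cluster.identity.oidc.issuer' --output text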

mkysel pushed a commit to nuodb/nuodb-helm-charts that referenced this issue Aug 9, 2021
…on of disable_iss_validation=true (#237)

- move HC tests into their own larger class so that we can test the engines too
- workaround HC bug
- add doc note

- root cause hashicorp/vault-helm#562 and hashicorp/vault#11953
@Itiho

Itiho commented Sep 2, 2021

disable_iss_validation=true and issuer=$ISSUER don't work for me in my lab with kind 1.21

@ay0o

ay0o commented Sep 27, 2021

It's not working on Docker Desktop, which at the time of this writing uses Kubernetes v1.21.4.

Whether I set disable_iss_validation=true or issuer=kubernetes/serviceaccount, the injector can't connect, failing with authentication issues and context deadline exceeded. Basically, what this person describes in this question: https://discuss.hashicorp.com/t/external-vault-init-container-stuck-at-init-0-1-with-context-deadline-exceeded-error/27256.

The value of iss:

/ $ cat /var/run/secrets/kubernetes.io/serviceaccount/token | cut -d. -f2 | base64 -d
{"iss":"kubernetes/serviceaccount","kubernetes.io/serviceaccount/namespace":"default","kubernetes.io/serviceaccount/secret.name":"vault-token-d5q92","kubernetes.io/serviceaccount/service-account.name":"vault","kubernetes.io/serviceaccount/service-account.uid":"9331a479-c445-4c60-922c-c92b69dbdf1a","sub":"system:serviceaccount:default:vault"base64: truncated base64 input

UPDATE:

Just tested on EKS, and it's the same. This time the iss corresponds to the OIDC provider as mentioned earlier, but the error is still the same: can't authenticate.

@tvoran
Member

tvoran commented Sep 28, 2021

Hi folks, we've collected a couple different ways to find the issuer on Kubernetes 1.21+ clusters here: https://www.vaultproject.io/docs/auth/kubernetes#discovering-the-service-account-issuer

Using those instructions the issuer comes back on kind (1.21.1) and docker-desktop (1.21.3) as https://kubernetes.default.svc.cluster.local for me.

However, if setting disable_iss_validation=true doesn't help, that suggests something else is misconfigured, so the next place to look is the Vault server logs. Note that a projected service account token only lasts as long as the Pod it was projected into, which can be a problem if a projected token was used for the token_reviewer_jwt parameter. See this issue for an example: hashicorp/vault-csi-provider#112 (comment)
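
For a token_reviewer_jwt that outlives any single Pod, one option is a dedicated service-account-token Secret, which Kubernetes populates with a non-expiring token (a sketch; the Secret name is illustrative and vault-auth is the service account from the repro steps):

# Create a long-lived token bound to the vault-auth service account.
kubectl apply -f - <<EOF
apiVersion: v1
kind: Secret
metadata:
  name: vault-auth-token
  namespace: default
  annotations:
    kubernetes.io/service-account.name: vault-auth
type: kubernetes.io/service-account-token
EOF

# Read the token back for use as token_reviewer_jwt.
kubectl get secret vault-auth-token -n default \
  -o go-template='{{ .data.token | base64decode }}'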

@ay0o

ay0o commented Sep 28, 2021

Oh well, in my case the issue was indeed different.

It all started with the 1.21 changes, but it turns out that with further testing I had started writing policy instead of policies in the role definition, resulting in a 403 again, but from a different cause. Funnily enough, Vault didn't complain at all that what I was writing was wrong.

tvoran added the vault-server label Jan 7, 2022
@tvoran
Member

tvoran commented Jan 7, 2022

As of Vault 1.9, the issuer and disable_iss_validation parameters are considered deprecated, and the disable_iss_validation default is now true for new configs: https://www.vaultproject.io/api-docs/auth/kubernetes#disable_iss_validation. (More details around that decision can be found here.)

We've also detailed the options for configuring K8s auth with respect to the changes in K8s 1.21 here: https://www.vaultproject.io/docs/auth/kubernetes#kubernetes-1-21

And we also have some changes to K8s auth coming soon to make it easier to run Vault in K8s with short-lived tokens: hashicorp/vault-plugin-auth-kubernetes#122

Closing this for now, thanks for all your input!

tvoran closed this as completed Jan 7, 2022
@Marc-Pons
Author

Marc-Pons commented Jan 7, 2022

Thanks for the deprecation details @tvoran. Now my vault-agent can authenticate correctly without having to specify the issuer when configuring the auth method if I use K8S 1.21 on Kind.

However, it is not working for me on AWS with K8S 1.21 if I don't specify the issuer. I am not sure why it is failing if the disable_iss_validation default value is true. For clarity, I am using Vault 1.9.0.

My vault-agent is receiving a 403 when trying to authenticate against the vault server.

On the vault server I observe the following error:

[ERROR] auth.kubernetes.auth_kubernetes_547fe762: login unauthorized due to: Post "https://5A9FC76BE5CCAEC54751BD9D2F581069.gr7.eu-west-1.eks.amazonaws.com/apis/authentication.k8s.io/v1/tokenreviews": x509: certificate signed by unknown authority

Do you have any clue about what could be going on? Thanks in advance!

UPDATE: Just tried again specifying disable_iss_validation=true (which should be the default value according to the Vault docs) and it just worked. Seems like it's not respecting the default value of disable_iss_validation.

@tvoran
Member

tvoran commented Jan 11, 2022

@Marc-Pons Glad you got it working on AWS! And yes, disable_iss_validation=true is the default, but only for new k8s auth configs. So if a k8s auth config was created on Vault 1.8, and then Vault is upgraded to Vault 1.9, the old default of disable_iss_validation=false could still be set.
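
For a config carried over from before 1.9, the flag can be flipped explicitly; a sketch reusing the variables from the original report (note that rewriting the config generally means re-supplying the connection parameters):

vault read auth/kubernetes/config   # check whether disable_iss_validation is still false
vault write auth/kubernetes/config \
    kubernetes_host="$K8S_HOST" \
    kubernetes_ca_cert="$SA_CA_CRT" \
    token_reviewer_jwt="$SA_JWT_TOKEN" \
    disable_iss_validation=true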

@prasanjitshome

prasanjitshome commented Mar 8, 2022

@tvoran we are currently using a very old version of Vault (helm 0.6.0) and have recently upgraded EKS to 1.21.5. We also run on spot instances, so vault pods often go down, but we have 3 replicas, so up to EKS 1.20 we were fine.

After the upgrade I can see we have to keep updating auth/kubernetes/config as mentioned here whenever a pod goes down.

I understand there is a Vault fix already present, but is there a fix yet in vault-helm which we can upgrade to?

Also, I can see disable_iss_validation="true" is not getting picked up, even though the write reports success.

/ $ vault write auth/kubernetes/config \
    issuer="https://oidc.eks.us-west-2.amazonaws.com/id/XXXXXXXXXXX" \
    token_reviewer_jwt="$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)" \
    kubernetes_host="https://172.20.0.1:443" \
    kubernetes_ca_cert=@/var/run/secrets/kubernetes.io/serviceaccount/ca.crt \
    disable_iss_validation="true"
Success! Data written to: auth/kubernetes/config
/ $

Also, I don't see disable_iss_validation when I read it back.

/ $ vault read auth/kubernetes/config
Key                   Value
---                   -----
issuer                XXXXXXX
kubernetes_ca_cert    XXXXXX
kubernetes_host       https://172.20.0.1:443
pem_keys              []
/ $

API call

/ # curl -s -XGET --header "X-Vault-Token: XXXXXX"  vault.cloudops.svc.cluster.local:8200/v1/auth/kubernetes/config | jq . -C
{
  "request_id": "avckldls",
  "lease_id": "",
  "renewable": false,
  "lease_duration": 0,
  "data": {
    "issuer": "https://oidc.eks.XXXX.amazonaws.com/id/XXXXXX",
    "kubernetes_ca_cert": "XXXX",
    "kubernetes_host": "https://172.20.0.1:443",
    "pem_keys": []
  },
  "wrap_info": null,
  "warnings": null,
  "auth": null
}
/ #

I see it's been fixed in PR #695, 7 days ago.

Has a new release come out yet? I can see 0.19.0 was released on Jan 20th, 2022; nothing after that.

@evanphx, @tvoran: can I go ahead and use main, or should I wait for a new release from your side?

@evanphx

evanphx commented Mar 9, 2022

@prasanjitshome Please don't at-mention random folks.

@tvoran
Member

tvoran commented Mar 10, 2022

Hi @prasanjitshome, these are probably good questions for our discuss forums: https://discuss.hashicorp.com/c/vault

As for the question about a new vault-helm release: we don't typically release a new version of vault-helm for every Vault release, since they aren't that tightly coupled version-wise. We recommend setting a specific Vault version in the user-specified overrides in the chart values when running in production: https://www.vaultproject.io/docs/platform/k8s/helm/configuration#image-1

So, for example, you can use vault-helm v0.19.0 and set server.image.tag=1.9.4 to use the latest version of Vault, as sketched below.
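
A minimal sketch (the release name vault and the hashicorp repo alias are assumptions):

# Install chart version 0.19.0 but run Vault server 1.9.4.
helm upgrade --install vault hashicorp/vault \
  --version 0.19.0 \
  --set server.image.tag=1.9.4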

My guess about not seeing disable_iss_validation is that you might be on an older version of Vault where that option doesn't exist. The vault write command just ignores options it doesn't recognize. I think it was added in 1.5.0: https://github.com/hashicorp/vault/blob/main/CHANGELOG.md#150

auth/kubernetes: Allow disabling iss validation [GH-91]

@rusanki

rusanki commented Jul 16, 2022

This is failing for me in a new installation.
Kubernetes server: EKS 1.22
Vault version: 1.10.3

Getting the following error:
[ERROR] auth.handler: error authenticating: error="context deadline exceeded"

I have created a topic here

Any clue on this?

@tvoran

@rusanki

rusanki commented Jul 18, 2022

This is sorted; it was due to the Istio sidecar blocking the connections for auth.
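
For anyone else hitting this with Istio in the mix, one common workaround (a sketch, not necessarily what was done here; my-app is a placeholder and 8200 assumes Vault's default port) is to exclude Vault's port from the sidecar's outbound interception:

# Annotate the workload so Istio doesn't intercept traffic to Vault.
kubectl patch deployment my-app --type merge -p \
  '{"spec":{"template":{"metadata":{"annotations":{"traffic.sidecar.istio.io/excludeOutboundPorts":"8200"}}}}}'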
