Quickstart: transport: authentication handshake failed #7632

Closed
Tracked by #7589
jmclean-starburst opened this issue Jan 25, 2022 · 18 comments · Fixed by #7651

@jmclean-starburst

jmclean-starburst commented Jan 25, 2022

Summary

What happened/what you expected to happen?
When connecting to the UI of Argo Workflows (after port forwarding), I receive this message and virtually no components load:

Service Unavailable: connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"

What version of Argo Workflows are you running?
latest (ie 3.2.6)

Diagnostics

Quickstart Guide - follow steps in guide to reproduce

What Kubernetes provider are you using?
EKS/Minikube

Configuration:

    spec:
      containers:
      - args:
        - server
        - --namespaced
        - --auth-mode
        - server
        - --auth-mode
        - client
        - --insecure-skip-verify
        - "true"

kustomization.yaml

resources:
  - namespace.yml
  - https://raw.githubusercontent.com/argoproj/argo-workflows/v3.2.6/manifests/quick-start-minimal.yaml

namespace: argo-workflows

Note: This is reproducible w/ the minimal quickstart via minikube as well


Message from the maintainers:

Impacted by this bug? Give it a 👍. We prioritise the issues with the most 👍.

@materkey

I also have this issue

@materkey

argo-server seems to respond with 503, but I can't see it in the argo-server pod logs, even with the -v flag.

@materkey

materkey commented Jan 25, 2022

UPD:
After @jmclean-starburst comment #7632 (comment)
I'm using v3.2.6 for quay.io/argoproj/argocli container image and for my kustomization.yaml to solve this issue

apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- github.com/argoproj/argo-workflows/manifests/base?ref=v3.2.6
- github.com/argoproj/argo-workflows/manifests/cluster-install/workflow-controller-rbac?ref=v3.2.6
- github.com/argoproj/argo-workflows/manifests/cluster-install/argo-server-rbac?ref=v3.2.6
...
Old comment text

I ended up using --secure=false and scheme: HTTP for the argo-server probe as a workaround. I also deleted nginx.ingress.kubernetes.io/backend-protocol: https from the ingress.
HTTPS for Argo Workflows worked for months, and now it is broken for an unknown reason.
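
The plain-HTTP workaround described above could be applied to an existing install with a JSON patch along these lines. This is only a sketch and was not posted in the thread: the default namespace `argo`, the container index `0`, and the `httpGet` readiness probe are assumptions about the installed manifest, and `disable_server_tls` is a hypothetical helper name. It turns TLS off on argo-server, and the ingress annotation still has to be removed separately.

```shell
# JSON patch (RFC 6902): append --secure=false to the server args and
# point the readiness probe at plain HTTP.
# Assumption: container 0 is the argo-server container and it has an httpGet probe.
PATCH='[
  {"op": "add", "path": "/spec/template/spec/containers/0/args/-", "value": "--secure=false"},
  {"op": "add", "path": "/spec/template/spec/containers/0/readinessProbe/httpGet/scheme", "value": "HTTP"}
]'

# Hypothetical helper: apply the patch to the argo-server Deployment.
# $1 is the namespace (assumed default: argo).
disable_server_tls() {
  kubectl -n "${1:-argo}" patch deployment argo-server --type=json -p "$PATCH"
}
```

Calling `disable_server_tls argo-workflows` would patch the Deployment in the namespace used by the kustomization earlier in this thread.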

@jmclean-starburst
Author

jmclean-starburst commented Jan 25, 2022

@materkey I have done this in the past... it's just kind of annoying to update the readiness probe, add a new flag to the server start, and update the ingress. This definitely needs to be resolved by the Argo team, because any new user of Workflows is going to be put off by these errors. I have used the quickstart in the past without any issues, so this seems a bit wonky.

@jmclean-starburst
Author

I believe I found the issue: the argo CLI image is using latest for the image tag instead of specifying the version tag (i.e. v3.2.6). I have tested this with v3.1.15, and after changing the image tag, it works. I will now validate the same for version 3.2.6.

@jmclean-starburst
Author

I was able to fix this by patching the image off of latest

resources:
  - namespace.yml
  - https://raw.githubusercontent.com/argoproj/argo-workflows/v3.2.6/manifests/quick-start-minimal.yaml

images:
- name: quay.io/argoproj/argocli
  newTag: v3.2.6

namespace: argo-workflows
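
To confirm that an override like this actually removed every floating tag, the rendered manifest can be scanned before it is applied. The following is a rough sketch, not part of the original fix: `find_floating_images` is a hypothetical helper that reads rendered YAML on stdin and prints any `image:` reference that has no tag or is tagged `latest`.

```shell
# Print every "image:" reference read from stdin that has no tag
# or is tagged ":latest". Digest-pinned references are skipped.
find_floating_images() {
  # Keep only the value after "image:", then strip quotes.
  sed -n 's/^[[:space:]-]*image:[[:space:]]*//p' \
    | tr -d "\"'" \
    | awk '{
        ref = $1
        if (ref == "") next
        # A digest is already immutable, so it is not floating.
        if (ref ~ /@sha256:/) next
        # The tag, if any, follows a ":" in the last path segment.
        n = split(ref, parts, "/")
        last = parts[n]
        if (last !~ /:/ || last ~ /:latest$/) print ref
      }'
}
```

For example, `kubectl kustomize . | find_floating_images` should print nothing once every image is pinned to a version tag.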

@alexec
Contributor

alexec commented Jan 25, 2022

Please, you must use the manifests attached to the release:

kubectl create namespace argo
kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/download/v3.2.6/install.yaml

@alexec alexec closed this as completed Jan 25, 2022
@jmclean-starburst
Author

@alexec I believe the quickstart guide is broken then, as the latest tag on argocli will spawn the issue mentioned by this bug

@alexec alexec reopened this Jan 25, 2022
@alexec
Contributor

alexec commented Jan 25, 2022

I now think this might be a bug.

@alexec alexec mentioned this issue Jan 25, 2022
28 tasks
@alexec
Contributor

alexec commented Jan 25, 2022

Maybe

3614db6 feat: adding support for getting tls certificates from kubernetes secret (e.g. (#7621)

@alexec
Contributor

alexec commented Jan 25, 2022

@ChaosInTheCRD do you think your commit could have introduced this bug please?

@tachyus-ryan

tachyus-ryan commented Jan 26, 2022

I ran into this error after updating one of my clusters from GKE 1.19 to 1.20. I cannot seem to connect to argo-server since restarting deployments, no matter what I try. I was on v3.2.4 and tried updating to v3.2.6 to no avail.

@tachyus-ryan

I was able to fix this by patching the image off of latest

This worked for me, too.

@plaicebo

Please, you must use the manifests attached to the release:

kubectl create namespace argo
kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/download/v3.2.6/install.yaml

This did it for me. I was using the quick start manifest instead of the release specific. That will have to be updated.

@andrein

andrein commented Jan 26, 2022

Please, you must use the manifests attached to the release:

kubectl create namespace argo
kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/download/v3.2.6/install.yaml

For kustomize users, it's much easier to reference the remote base from git. We tried to do this for the 3.2.6 tag and experienced the same issue.

We've been doing the same with argo-cd for a while without issues.

It looks like they're overriding the images with the correct version when the git tag is created:

I found #2715 that seems related, @alexec can you reopen it to continue the kustomize discussion there?

@ChaosInTheCRD
Contributor

@alexec it seems that yes, the problem has come from my PR, apologies for that. My suspicion is that it is related to the removal of this line, which I overlooked as I guess some process (I guess the UI) can't trust the self signed cert generated. I will investigate this further.

I guess though, the main thing to resolve right now is to ensure that this quickstart does not use the latest image. This way it can be ensured that prerelease bugs like this don't hinder users.

As for the error itself, can I get some clarity on what is resolving the error? Is it directly from the ui or is there something else involved (e.g. API).

@alexec
Contributor

alexec commented Jan 26, 2022

This issue is a blocker for v3.3, which we'd hoped to release this week. If you don't have time to look into this, I suggest we revert the commit.

@tachyus-ryan

As for the error itself, can I get some clarity on what is resolving the error? Is it directly from the ui or is there something else involved (e.g. API).

I found that I could not submit workflows to the API, so this does not appear to be a UI-only issue. Ensuring latest was not used was the means of resolving the error for me.
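
One way to check which image a running install is actually using is to read it back from the Deployment. This is a minimal sketch under assumptions: the Deployment is named `argo-server`, the default namespace is `argo`, and `server_images` is a hypothetical helper name.

```shell
# Print the image of every container in the argo-server Deployment,
# one per line. $1 is the namespace (assumed default: argo).
server_images() {
  kubectl -n "${1:-argo}" get deployment argo-server \
    -o jsonpath='{range .spec.template.spec.containers[*]}{.image}{"\n"}{end}'
}
```

If `server_images argo-workflows` prints an image ending in `:latest`, the install is affected by the problem discussed in this thread.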

alexec pushed a commit that referenced this issue Jan 26, 2022
Signed-off-by: Tom Meadows <thomas.meadows@jetstack.io>
yriveiro pushed a commit to yriveiro/argo-workflows that referenced this issue Jan 27, 2022
Signed-off-by: Tom Meadows <thomas.meadows@jetstack.io>
@alexec alexec mentioned this issue Jan 27, 2022
4 tasks