make local cluster tooling configurable #44


everettraven
Collaborator

Description
Update the Makefile so that the tooling used to create a local Kubernetes cluster is configurable. For now, only kind and k3d are supported. Update or add make targets as necessary to facilitate this.

Motivation
For the life of me I just cannot get kind to work for me with this project. I've started using k3d for local development, but it's a pain to not be able to use the simple Makefile targets for setting things up. This PR makes it so I (or anyone else) can switch between kind and k3d while still using the simple Makefile workflow (make cluster install).

Signed-off-by: Bryce Palmer <bpalmer@redhat.com>
@joelanford
Member

joelanford commented Apr 14, 2023

I sorta think it may be worth the time debugging your kind issue. I've encountered a bunch of maintenance issues when repos have multiple supported dev environments. It's very hard to keep parity and it adds to the Makefile/scripts complexity. We also have operator-controller and rukpak that use kind exclusively.

What's your environment? OS, docker or podman? What specific errors do you get with kind?

@everettraven
Collaborator Author

OS, docker or podman?

Fedora 36 + Docker

What specific errors do you get with kind?

kind starts up just fine and everything works until applying the sample CRD. The unpack pod fails with:

2023/04/17 13:57:54 render reference "quay.io/operatorhubio/catalog:latest": error resolving name for image ref quay.io/operatorhubio/catalog:latest: failed to do request: Head "https://quay.io/v2/operatorhubio/catalog/manifests/latest": x509: certificate is not valid for any names, but wanted to match quay.io

I have tried debugging this and looking for a solution but can't seem to find any. Some things I've tried:

  • Restarting Docker
  • Fresh Docker install
  • Fresh kind install
  • Updating Fedora
  • Adding Quay certs to the pod
  • Running opm with --skip-tls-verify and --use-http (or something like that - can't remember the exact flags at the moment)

Technically this seems like an opm issue, since the error is coming from the unpack pod, but it doesn't happen on my Mac and I haven't heard of it happening for anyone else, so I figured it was an environment issue.

I'm all ears for ideas on how to fix it but after fiddling with it for a few days I gave up and decided to try k3d instead of kind.

I'm also happy to close this PR if we don't want to support multiple dev environments - I can just keep manually using k3d if that is what I have to do. Just thought I would open this PR since I was fiddling with automating my k3d workflow a bit and these Makefile changes were the result :)

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 18, 2023
@openshift-merge-robot

PR needs rebase.


@joelanford
Member

This sounds like a Golang cert handling issue somewhere along the way. A few other things that could help:

  1. What does the resulting unpack pod json look like for you? Is it the same in kind and k3d?
  2. Have you tried running docker pull <catalogImage> against a quay catalog locally yourself? Any issues there?

It is totally bizarre that this works in k3d, but not kind. Any chance there's some sort of https proxy in the path of kind <-> internet? And I take it you've tried kind several times and over time and always get this same error? Is there any chance you can run a pod in that kind cluster that runs a simple Go program that saves the certificate presented by quay.io so that we can inspect it?

Also, the fact that skipping TLS verification does not seem to work around the problem is puzzling. Something definitely seems amiss here.

@everettraven
Collaborator Author

What does the resulting unpack pod json look like for you? Is it the same in kind and k3d?

Yep, it is the same

Have you tried docker pull a quay catalog locally yourself? Any issues there?

No issues. kind doesn't even have problems pulling an image from quay to run a pod; it only has trouble pulling from within the pod.

Any chance there's some sort of https proxy in the path of kind <-> internet?

Not that I'm aware of - not even sure how I would check that to be honest

And I take it you've tried kind several times and over time and always get this same error?

Yep - happens consistently every time. I've had this issue for ~2 months. The weird thing is that I don't have this issue when I run this on my Mac.

Is there any chance you can run a pod in that kind cluster that runs a simple Go program that saves the certificate presented by quay.io so that we can inspect it?

I can look into this.

@joelanford
Member

only has trouble pulling from within the pod

Another theory is that the pod does not have (or at least isn't using) a trust store that can verify quay.io. Can you try kubectl run with that same opm container image, get a shell, and try to curl or wget some other https servers?

@everettraven
Collaborator Author

Another theory is that the pod does not have (or at least isn't using) a trust store that can verify quay.io

This was my original thought but I couldn't seem to get a trust store properly configured. I'll do the kubectl run suggestion and update here

@everettraven
Collaborator Author

Can you try kubectl run with that same opm container image, get a shell, and try to curl or wget some other https servers?

For whatever reason I couldn't get a pod stood up using the same opm container image where I could get a shell and try to curl stuff (I hit a bunch of missing-command errors while trying to get the pod to stay alive). However, I was able to try hitting a couple of different registries using opm render and got the same results:

docker.io:

2023/04/21 15:09:13 render reference "docker.io/anik120/community-index-operators:v4.11": error resolving name for image ref docker.io/anik120/community-index-operators:v4.11: failed to do request: Head "https://registry-1.docker.io/v2/anik120/community-index-operators/manifests/v4.11": x509: certificate is not valid for any names, but wanted to match registry-1.docker.io

ghcr.io:

2023/04/21 15:11:00 render reference "ghcr.io/not/real:latest": error resolving name for image ref ghcr.io/not/real:latest: failed to do request: Head "https://ghcr.io/v2/not/real/manifests/latest": x509: certificate is not valid for any names, but wanted to match ghcr.io

@everettraven
Collaborator Author

Closing this since #54 ends up fixing the problem I was having with kind.
