0.14.1 causing crash loopback related to v1.Gateway #4366

Closed
PurseChicken opened this issue Apr 5, 2024 · 15 comments · Fixed by #4610
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@PurseChicken

What happened:

After updating to 0.14.1, the pod goes into a crash loop with the error:

failed to sync *v1.Gateway: context deadline exceeded

This does not happen with 0.14.0.

What you expected to happen:

For the pod to run without crashing.

How to reproduce it (as minimally and precisely as possible):

Updated to 0.14.1.

Anything else we need to know?:

I believe this is due to the use of a gateway-api source. Our source configuration looks at ingress and http-route resources.

  sources:
    - ingress
    - gateway-httproute

My assumption is that the gateway-api changes in 0.14.1 look for v1 resources / CRDs. That said, at least in GKE, gateway-api resources are still deployed using v1beta1, and the GKE documentation currently references v1beta1 CRDs as well.

I imagine that external-dns changes to gateway-api need to support both v1beta1 as well as v1.
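To make the "support both" idea concrete, here is a minimal Go sketch of a version fallback: prefer v1 when the cluster serves it, otherwise fall back to v1beta1. This is not how external-dns is implemented. The helper `pickGatewayVersion` and the `preferredVersions` list are hypothetical names, and in a real controller the list of served versions would presumably come from API discovery rather than a hard-coded slice.

```go
package main

import (
	"errors"
	"fmt"
)

// preferredVersions lists the Gateway API versions this sketch would accept,
// newest first. The order is an assumption made for illustration.
var preferredVersions = []string{"v1", "v1beta1"}

// pickGatewayVersion returns the first preferred version that appears in the
// list of versions the cluster serves for gateway.networking.k8s.io.
func pickGatewayVersion(served []string) (string, error) {
	available := make(map[string]bool, len(served))
	for _, v := range served {
		available[v] = true
	}
	for _, v := range preferredVersions {
		if available[v] {
			return v, nil
		}
	}
	return "", errors.New("no supported gateway.networking.k8s.io version is served")
}

func main() {
	// A cluster that does not serve v1 yet (hypothetical input).
	v, err := pickGatewayVersion([]string{"v1alpha2", "v1beta1"})
	fmt.Println(v, err)

	// A cluster where the v1 CRDs have been installed (hypothetical input).
	v, err = pickGatewayVersion([]string{"v1beta1", "v1"})
	fmt.Println(v, err)
}
```

With a fallback like this, a missing v1 CRD would select the older API or surface as an explicit error, rather than a sync that never completes.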

@PurseChicken PurseChicken added the kind/bug Categorizes issue or PR as related to a bug. label Apr 5, 2024
@PurseChicken
Author

I believe my assumption is true. I can see in "gateway-api: fix wildcard matching" #4124 v1beta1 was removed for v1.

@jnauska

jnauska commented Apr 5, 2024

Experiencing the same issues with GKE Gateway API

@Tarjei400

Tarjei400 commented Apr 6, 2024

Same issue here, I was going nuts :D For now I will downgrade the chart version.

@Tarjei400

Tarjei400 commented Apr 6, 2024

Just in case it helps: using the chart at version 6.38.0 and

apiVersion: gateway.networking.k8s.io/v1alpha2
kind: HTTPRoute

seems to do the trick temporarily on GKE.

@jnauska

jnauska commented Apr 22, 2024

After reviewing the changes in 0.14.1 more carefully, I think #4019 is the culprit.

Basically, it removes v1beta1 for Gateways and HTTPRoutes, as they have been upgraded to v1, and bumps dependencies to use v1.0.0 of gateway-api.

@abursavich
Contributor

I did the initial implementation for Gateway API in External DNS. I wasn't the one who migrated from v1beta1 to v1, but it seemed like a reasonable change to me... I'll take a look at what would be necessary to support both v1 and v1beta1.

@Tarjei400

@abursavich Looks like everyone has upgraded to v1. Does anyone know when Google plans to upgrade to v1? Maybe supporting both CRD versions won't be needed if it's planned for sometime soon.

@abursavich
Contributor

abursavich commented Apr 22, 2024

Technically (pedantically), the docs say:

As the Gateway API is still in an experimental phase, ExternalDNS makes no backwards compatibility guarantees regarding its support.

If you install a newer version of the CRDs then the resources will be auto-converted by the Kubernetes API server to the new versions, but I don't know if GKE will stomp the change.

There's a SIG-Network Gateway API meeting this afternoon, which I plan to join to get input on this. As an added benefit, there's usually someone from Google there who will probably care about the GKE implementation.

@abursavich
Contributor

abursavich commented Apr 22, 2024

The CRD Management guidelines seem to imply that if GKE rolled back the newer CRDs then it would be a bug in GKE:

Some implementations may also want to bundle CRDs to simplify installation. This is acceptable as long as they never:

  1. Overwrite Gateway API CRDs that have unrecognized or newer versions.
  2. Overwrite Gateway API CRDs that have a different release channel.
  3. Remove Gateway API CRDs.
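As an illustration of the first two rules above, here is a rough Go sketch of the check a bundling implementation might run before overwriting an already-installed CRD. It assumes the installed CRD records its bundle version and release channel (to my understanding, upstream Gateway API CRDs carry these in the `gateway.networking.k8s.io/bundle-version` and `gateway.networking.k8s.io/channel` annotations). The `crdInfo` type and the `mayOverwrite` and `parseVersion` helpers are hypothetical names, not part of any real library.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// crdInfo holds the two values this sketch needs from a CRD's metadata.
// The struct itself is hypothetical.
type crdInfo struct {
	BundleVersion string // e.g. "v1.0.0"
	Channel       string // e.g. "standard" or "experimental"
}

// parseVersion accepts only plain "vMAJOR.MINOR.PATCH" strings. Anything
// else (including pre-release suffixes) is reported as unrecognized.
func parseVersion(s string) ([3]int, bool) {
	var out [3]int
	parts := strings.Split(strings.TrimPrefix(s, "v"), ".")
	if len(parts) != 3 {
		return out, false
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil || n < 0 {
			return out, false
		}
		out[i] = n
	}
	return out, true
}

// mayOverwrite reports whether replacing the installed CRD with the bundled
// one would respect rules 1 and 2: same channel, and the installed version
// is recognized and not newer than the bundled one.
func mayOverwrite(installed, bundled crdInfo) bool {
	if installed.Channel != bundled.Channel {
		return false
	}
	iv, ok := parseVersion(installed.BundleVersion)
	if !ok {
		return false // unrecognized installed version: leave it alone
	}
	bv, ok := parseVersion(bundled.BundleVersion)
	if !ok {
		return false
	}
	for i := range iv {
		if iv[i] != bv[i] {
			return iv[i] < bv[i]
		}
	}
	return true // identical versions: overwriting changes nothing
}

func main() {
	// The scenario discussed in this thread, with made-up version numbers:
	// a user installed newer CRDs and the platform bundles older ones.
	installed := crdInfo{BundleVersion: "v1.0.0", Channel: "standard"}
	bundled := crdInfo{BundleVersion: "v0.8.0", Channel: "standard"}
	fmt.Println(mayOverwrite(installed, bundled))
}
```

Treating pre-release or otherwise unparseable versions as unrecognized is a deliberately conservative choice in this sketch; a real implementation would likely use a proper semantic-version library.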

@Raffo
Contributor

Raffo commented Apr 23, 2024

Thank you for your help @abursavich , much appreciated. Ping me if you need any type of support.

@PurseChicken
Author

Still an issue in 0.14.2

@jnauska

jnauska commented Jun 13, 2024

I tested that 0.14.1 and later work with a newer Gateway API version in GKE.

Maybe the External-DNS documentation should now be changed to reflect that Gateway API is no longer in an experimental phase since the 1.0.0 GA release, and subsequent breaking changes should be labeled as such.

@omriarieli

Same problem here using 0.14.2 on GKE.

@candita

candita commented Jun 28, 2024

@abursavich you clarified that this is an issue in GKE, correct? Can this issue be closed?

@jnauska

jnauska commented Jun 29, 2024

@candita This is not just a GKE issue; it is an issue for everyone not using the GA version of Gateway API. GKE bundles the Gateway API CRDs with its GKE versions, which is why most of the reported issues show up there. The change that caused this is comparable to removing v1beta1 support from Ingress in the middle of a transition period.

But as already said in this thread, the external-dns documentation states:

As the Gateway API is still in an experimental phase, ExternalDNS makes no backwards compatibility guarantees regarding its support.

This issue can be swept under the rug with this documentation comment, as users should be able to upgrade the Gateway API version by themselves. But maybe the documentation should now reflect that Gateway API is GA.
