Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changes to webhook.Validator pattern cause version update problems #2596

Closed
nathanperkins opened this issue Nov 27, 2023 · 22 comments · Fixed by #2639
Closed

Changes to webhook.Validator pattern cause version update problems #2596

nathanperkins opened this issue Nov 27, 2023 · 22 comments · Fixed by #2639
Labels
kind/support Categorizes issue or PR as a support question.

Comments

@nathanperkins
Copy link

nathanperkins commented Nov 27, 2023

The webhook.Validator interface is implemented on API types to provide easy webhook creation. However, tying this interface to API types means that API code is coupled with implementation code.

When the webhook.Validator interface is changed, modules cannot import each other across the breaking change boundary. It is difficult to achieve synchronous updates across a single organization maintaining many controllers, let alone the entire community.

For example, in 0.15, the webhookValidate interface was changed to return (admission.Warnings, error) instead of return error. We have updated our controllers to 0.15, but can't import metallb's APIs until they are also at 0.15, because their use of webhook.Validator is incompatible with our module's choice to use 0.15.

Breaking changes in other areas of the code are less problematic because it is common for repos to only import each other for APIs. If the code dependent on the breaking change is not actually used, it will not be included in compilation and it will compile fine, even though a complete compilation would fail.

We were not aware of the danger of this pattern, and only recently started moving away from it for our controllers to reduce the scope of this problem. But we cannot fix it in third party repos, without forking it and fixing it locally ourselves. We could do pull requests, but they will run into their own problems where updating their repo requires them to update all the downstream repos.

Is there a solution for controller-runtime users? If not, is there something that can be done in the project to reduce this issue?

Related: #2327

@nathanperkins
Copy link
Author

nathanperkins commented Nov 27, 2023

The issue with metallb is even worse than we thought because metallb imports open policy agent. So, it seems like we can't import metallb until metallb is at 0.15 and metallb can't update to 0.15 until open policy agent is at 0.15.

Edit: open policy agent has a tag which is on controller-runtime 0.15 so we can bump the version locally and resolve that issue.

But it does bring up the issue that if any of our indirect dependencies skipped 0.15, we would be forced to bump up to 0.16 too.

@troy0820
Copy link
Member

/kind support

@k8s-ci-robot k8s-ci-robot added the kind/support Categorizes issue or PR as a support question. label Nov 27, 2023
@alvaroaleman
Copy link
Member

API packages should not import anything other than other api packages to avoid precisely this problem. Controller-Runtime does not guarantee compatibility across versions.

@troy0820
Copy link
Member

troy0820 commented Jan 4, 2024

@nathanperkins with your resolution from the "edit" you had in your comment, does @alvaroaleman's explanation along with the fix you provided settle this issue?

@nathanperkins
Copy link
Author

nathanperkins commented Jan 4, 2024

@alvaroaleman

API packages should not import anything other than other api packages to avoid precisely this problem. Controller-Runtime does not guarantee compatibility across versions.

This is true but not possible with the webhook.Validator pattern because is implemented on the API type itself and the methods return admission types which are imported from controller-runtime.

https://pkg.go.dev/sigs.k8s.io/controller-runtime@v0.16.3/pkg/webhook/admission#Validator

If this is the suggestion, then the webhook.Validator pattern should be removed in favor of patterns which don't require imports from the API package or it needs to be adjusted so that it can be implemented without API packages having to import controller-runtime.

@troy0820

@nathanperkins with your resolution from the "edit" you had in your comment, does @alvaroaleman's explanation along with the fix you provided settle this issue?

It doesn't settle the issue. It's not possible to implement the webhook.Validator pattern without importing controller-runtime in the API package.

@sbueringer
Copy link
Member

Not sure if you're aware but you can implement CustomValidator & CustomDefaulter outside of the API package instead.

This doesn't help of course if a 3rd party dependency of yours still implements Validator & Defaulter instead. For this it would only help if we drop it in CR and the dependency upgrades to a new CR version without Validator & Defaulter.

@troy0820
Copy link
Member

troy0820 commented Jan 4, 2024

I know that cluster-api did a lot to decouple the api types from importing controller-runtime

This PR is what I'm talking about. kubernetes-sigs/cluster-api#9045

@alvaroaleman
Copy link
Member

It doesn't settle the issue. It's not possible to implement the webhook.Validator pattern without importing controller-runtime in the API package.

Fair, but at that point you should not be doing that and implement it outside of that.

I personally do think that the webhook.Validator pattern is a mistake because of this precise reason, api packages should not import anything other than other api packages.

@nathanperkins
Copy link
Author

nathanperkins commented Jan 4, 2024

@alvaroaleman

Fair, but at that point you should not be doing that and implement it outside of that.

Sure but this pattern encourages doing this and it is also described this way in the Kubebuilder book.

https://book.kubebuilder.io/cronjob-tutorial/webhook-implementation

We can only fix this in our projects but the presence of this pattern causes problems across the ecosystem which are out of our control.

I personally do think that the webhook.Validator pattern is a mistake because of this precise reason, api packages should not import anything other than other api packages.

It seems like the only solution for this well-known problem is to not use webhook.Validator. Given that it causes unavoidable cross project upgrade issues, can we remove it or replace it in the next version?

@sbueringer

Not sure if you're aware but you can implement CustomValidator & CustomDefaulter outside of the API package instead.

Now that we are aware of this issue, we use the other patterns in our projects which decouple controller-runtime from our API types. But any project which uses this pattern will cause upgrade issues in the K8s ecosystem.

@nathanperkins
Copy link
Author

nathanperkins commented Jan 4, 2024

We should check whether you can implement this pattern outside your API package by embedding the API type, like:

type PodValidator struct {
  corev1.Pod
}

func (v PodValidator) ValidateCreate(...) (...) {
  ...
}

That would at least provide an easy path for projects to shift their existing webhook code and avoid the API dependency issues.

If this works, we could update the Kubebuilder doc to suggest this instead.

edit: on second thought, I doubt this would work. It would require the controller-runtime internals to be unmarshaling JSON into the webhook types, which would be inefficient. They probably use type assertion, which wouldn't work here.

@troy0820
Copy link
Member

troy0820 commented Jan 4, 2024

I think with the PR that was linked from cluster-api, it should encourage the kubebuilder scaffold to behave differently when that time comes to migrate from the older scaffold. Albeit a change you are still importing apimachinery/runtime which changes versions with k8s releases as well but not coupled to CR.

I think that's a reasonable ask and maybe something we can settle on to not only provide that as a solution but also adhere to the best practices around api importing.

@vincepri
Copy link
Member

vincepri commented Jan 4, 2024

We can deprecate the current pattern as long as we have a clear better way to create webhooks, a deprecation in the interfaces and code would allow folks to migrate over time

@nathanperkins
Copy link
Author

Deprecating the pattern over time sounds good to me.

If you create a replacement interface, it could look something like this:

type Validator interface {
  ValidateCreate(context.Context, client.Object) (webhook.Warnings, error)
  ...
}
  • Migrating existing code is simple, with a pod := o.(*v1.Pod) at the top.
  • Having access to the context enables using client queries for validations.

@vincepri
Copy link
Member

vincepri commented Jan 4, 2024

We might be able to do a bit better with a generic interface?

@sbueringer
Copy link
Member

sbueringer commented Jan 4, 2024

We should check whether you can implement this pattern outside your API package by embedding the API type, like:
...
edit: on second thought, I doubt this would work. It would require the controller-runtime internals to be unmarshaling JSON into the webhook types, which would be inefficient. They probably use type assertion, which wouldn't work here.

I tried this before we introduced the Custom* interfaces. Didn't work (but I don't remember the details)

@nathanperkins
Copy link
Author

nathanperkins commented Jan 4, 2024

We might be able to do a bit better with a generic interface?

Even better :)

I mostly just want to register my request to have ctx included, since this webhook validation is part of a bounded request and can timeout :)

@sbueringer
Copy link
Member

sbueringer commented Jan 5, 2024

The CustomValidator seems pretty much already what you're asking for: https://github.com/kubernetes-sigs/controller-runtime/blob/main/pkg/webhook/admission/validator_custom.go#L32-L48

@nathanperkins
Copy link
Author

I see, looks good to me. Then, I think it's just a matter of deprecating (and eventually removing) the webhook.Validator interface and updating the Kubebuilder docs to use webook.CustomValidator instead.

@nathanperkins
Copy link
Author

nathanperkins commented Jan 13, 2024

As a note to future viewers, it's not enough to implement this new interface but still within the API packages. It results in the same upgrade issue.

The webhook implementation has to be moved to a separate package to ensure that the API packages don't have controller-runtime interfaces and types. This removes the coupling which causes the upgrade issue.

@camilamacedo86
Copy link
Member

Hi @vincepri, @alvaroaleman, @sbueringer,

In Kubernetes and kubebuilder, the norm is one webhook per API Kind, ensuring specific validation and logic for each Kind. The suggestion of a generic webhook validator for multiple Kinds seems to deviate from this standard, potentially introducing complexities and affecting principles like single responsibility. Could you please clarify how you envision decoupling the webhook from the API in this context?

Additionally, I'm curious about the implications of updating a project that imports another project using controller runtime, particularly if the latter hasn't been upgraded. Can we ensure compatibility in such cases?

Lastly, regarding the default webhook implementation in the api/ package of kubebuilder (see: Kubebuilder Webhook Implementation), how would you recommend addressing the deprecation/breaking change in controller-runtime? Should we consider decoupling webhooks into a separate package?

Looking forward to your guidance and insights.

Best regards,

@alvaroaleman
Copy link
Member

In Kubernetes and kubebuilder, the norm is one webhook per API Kind, ensuring specific validation and logic for each Kind.

This is still the case, there is one endpoint per type its just that multiple types might be served from the same binary, which is perfectly acceptable. Refer to

func generateMutatePath(gvk schema.GroupVersionKind) string {
return "/mutate-" + strings.ReplaceAll(gvk.Group, ".", "-") + "-" +
gvk.Version + "-" + strings.ToLower(gvk.Kind)
}
func generateValidatePath(gvk schema.GroupVersionKind) string {
return "/validate-" + strings.ReplaceAll(gvk.Group, ".", "-") + "-" +
gvk.Version + "-" + strings.ToLower(gvk.Kind)
}
to see how this is handled in the builder. There really is no new vision here, that capability has existed for a long time.

Additionally, I'm curious about the implications of updating a project that imports another project using controller runtime, particularly if the latter hasn't been upgraded. Can we ensure compatibility in such cases?

No one is envisioning entire projects importing other projects, that projects importing other projects API/CRD. In order to make this easy it is suggested to make the API/CRD always a dedicated go module that has no dependencies except to other APIs/CRD and in particular not to controller-runtime.

Lastly, regarding the default webhook implementation in the api/ package of kubebuilder (see: Kubebuilder Webhook Implementation), how would you recommend addressing the deprecation/breaking change in controller-runtime? Should we consider decoupling webhooks into a separate package?

I can't answer this question as I don't use kubebuilder, I think you are best positioned to do that given that you are a maintainer. The only thing I'd always recommend is to have a dedicated go module for the API/CRD, how you structure the rest isn't really that important.

@camilamacedo86
Copy link
Member

Hi @alvaroaleman,

Thank you for the answer. It seems to confirm my understanding regarding the change/motivation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/support Categorizes issue or PR as a support question.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants