Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Webhook provider #3063

Merged
merged 23 commits into from
Sep 25, 2023
Merged

Webhook provider #3063

merged 23 commits into from
Sep 25, 2023

Conversation

Raffo
Copy link
Contributor

@Raffo Raffo commented Oct 3, 2022

Description

This is the trivial implementation of a webhook provider. The whole idea of the webhook is so that we can add new providers without having to merge them in the core repository of external-dns and make it easy to maintain while lowering the burden for the maintainers. Like this, we can focus on the core functionality on the project and save countless hours spent in code review to review/maintain new providers.

As I said, this is a trivial implementation, there are probably a million corner cases to be discussed and it only implements a strict subset of the Provider interface, due to some details of how that can be implemented today. I post it early because I would love some feedback on this approach and idea, especially from @njuettner @szuecs @vinny-sabatini .

I tested this with an extracted aws provider, the code is in https://github.com/Raffo/external-dns-aws-plugin . The development approach was the following:

  • extracted aws.go to a different repo.
  • added a main that instantiated an http server.
  • configured ExternalDNS to make calls to it.
  • do all the testing from a local machine with access to a remove EKS cluster in AWS with full access to route53 from the local computer, to simplify the development.

I could achieve the following:

ExternalDNS log:

INFO[0000] Instantiating new Kubernetes client
INFO[0000] Using kubeConfig
[{"dnsName":"external.dns",,"recordType":"NS","recordTTL":172800},{"dnsName":"cname-externaldns-e2e.external.dns","targets":["\"heritage=external-dns,external-dns/owner=external.dns,external-dns/resource=service/default/kuard\""],"recordType":"TXT","recordTTL":300},{"dnsName":"externaldns-e2e.external.dns",,"recordType":"CNAME","recordTTL":300,"providerSpecific":[{"name":"aws/evaluate-target-health","value":"true"},{"name":"alias","value":"true"}]},{"dnsName":"externaldns-e2e.external.dns","targets":["\"heritage=external-dns,external-dns/owner=external.dns,external-dns/resource=service/default/kuard\""],"recordType":"TXT","recordTTL":300}]

Extracted plugin:

./external-dns-aws-plugin
get records
post applychanges
WARN[0005] Total changes for cname-externaldns-e2e.external.dns exceeds max batch size of 0, total changes: 1
WARN[0005] Total changes for externaldns-e2e.external.dns exceeds max batch size of 0, total changes: 2
WARN[0005] Total changes for cname-externaldns-e2e.external.dns exceeds max batch size of 0, total changes: 1
WARN[0005] Total changes for externaldns-e2e.external.dns exceeds max batch size of 0, total changes: 2
^C

Which proves the functionality of the webhook.

With this PR, I'm not really looking for a code review, rather for a general feedback on the idea. PR needs work and has conflict, but that's okay for now.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 3, 2022
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 3, 2022
provider/plugin/plugin.go Outdated Show resolved Hide resolved
provider/plugin/plugin.go Outdated Show resolved Hide resolved
@szuecs
Copy link
Contributor

szuecs commented Oct 5, 2022

@Raffo in general I like the idea. I added some small nits as comments.

@Raffo
Copy link
Contributor Author

Raffo commented Oct 12, 2022

Left to do:

  • address comments
  • remove debug lines
  • check how the other providers are using the baseProvider methods
  • write the docs on how this should be used

@linki
Copy link
Member

linki commented Oct 21, 2022

I like the idea and I think it's the only way to keep ExternalDNS maintainable.

However, I would take a different approach: Instead of using a custom HTTP interface between ExternalDNS and its Provider plugins, I would use a DNSRecord CRD as the interface. Assuming the CRD that we already support would satisfy our needs already (let's call it DNSRecord), the approach would become something like this:

  • Similar to this PR, create a "CRD" Provider that creates DNSRecord CRD objects from the Sources.
  • Similar to this PR, extract aws.go and crd.go into a separate project. It will read the CRD and create records in Route53.

Before actually separating anything the approach can be tested by running two ExternalDNS instances with the following flags:

  • Source/Ingestion instance: external-dns --source ingress,service --provider crd => Turns Kubernetes Resources into DNSRecord CRD objects
  • Provider-specific/Target instance: external-dns --source crd --provider aws => Reads the CRD objects and turns them into Route53 entries, similar for other providers/cloud platforms.

Then you can go ahead and extract just the code that you need to make external-dns --source crd --provider aws work into your external-dns-aws-plugin, similar for other providers/cloud platforms.

A similar thing you can then do on the Source side. For example, core ExternalDNS would process core Kubernetes resources (Ingress, Service) and turn it into CRD objects. External resources, such as RouteGroup for example would have their own binary that turns RouteGroup into the desired DNSRecord CRD entries. This makes sense, since those non-core resources often come with their own controller anyways, such as Zalando's Ingress controller for Route Groups.

At the heart of all the different projects would be the CRD, so it needs to be very well defined.

@Raffo
Copy link
Contributor Author

Raffo commented Oct 26, 2022

@linki thank you for commenting, it's an interesting idea. I think what you are proposing is a lot more work compared to this proposal and would require having people implementing a controller (with kubebuilder or similar, doesn't matter) which is frankly a ton more work than just implementing an http server that runs as a sidecar, without even getting to think about RBAC and so on. I agree that it would be more "kubernetes native", but I'm not sure what the benefit would really be. Nevertheless, I want to think more about it, but I am not sure it makes sense to push more CRDs where there is something simple that can be done with just an http server and a simple json marshal.

@szuecs
Copy link
Contributor

szuecs commented Oct 27, 2022

Let's discuss it on maintainers meeting. Maybe @linki can join?

@Raffo
Copy link
Contributor Author

Raffo commented Oct 29, 2022

@szuecs ok, I will send a message next week to schedule the meeting, I can't at the usual time and I know you have trouble at that time too.

@jobstoit
Copy link

Is this still being worked on? Anywhere we can follow the discussion on this? Others seem to be curious too like in #2921.
I love the idea that It'd be similar to how CertManager does it with a webhook while also having some core providers in it's own codebase. I think it'll do a lot for maintainability of the project.
I think however that this small plugin provider might be a good temporary solution which could be deprecated after overhauling the project with implementing CRD's.
I'd love to know a bit more about the status:

  • Is anyone already writing the CRD spec's?
  • Someone working on CRD controllers?

Thank you guys for the hard work

@waldner
Copy link

waldner commented Dec 25, 2022

Agreed. While I don't have a concrete proposal (although, like @jobstoit said, I'd like a cert-manager style webook system), it seems to me that the number of in-tree providers is too big for the project to maintain.

@jobstoit jobstoit mentioned this pull request Dec 28, 2022
2 tasks
@lgayatri
Copy link

lgayatri commented Jan 2, 2023

@asifdxtreme

@jobstoit
Copy link

jobstoit commented Jan 2, 2023

Happy new year guys. @szuecs @linki do you guys maybe have answers to my previous questions?

@Raffo
Copy link
Contributor Author

Raffo commented Jan 11, 2023

@jobstoit We have been discussing this today and decided the following:

  • we will spend a bit of time to understand if we want to keep the current json approach or go with grpc, the main problem being that we need to establish a contract with providers that will be implemented externally, so that they can react on changes to the types. We can do this with openapi or grpc, but it's something that we likely have to do. If we go with json/http now we can add the openapi spec later and already enabled some new providers to be implemented while learning from this. Grpc is a bit of a bigger lift as it was never used in this project.
  • we will still be considering the CRD approach, but we will favor the http one for now like it is implemented in this PR.
  • @linki kindly offered to review this PR so that we can move forward.

/cc @szuecs @njuettner @linki

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 15, 2023
@Raffo
Copy link
Contributor Author

Raffo commented Jan 15, 2023

@linki merged master, added headers and fixed the tests, with no logical changes with respect to the original PR.

@szuecs
Copy link
Contributor

szuecs commented Jan 23, 2023

@Raffo as I am more strong against grpc, because of random errors all over the place (Opentracing and all of this for example, but not limited to that case) that I had to figure out and this does not need to have efficiency in that sense (parsing json will be good enough), I would stick with a simple json API.

@Raffo
Copy link
Contributor Author

Raffo commented Jan 25, 2023

@linki do you think you can do a review of this one soon?

return nil, err
}

fmt.Println(string(b))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is a leftover, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, 100%, let me re-review this.

)

type PluginProvider struct {
provider.BaseProvider
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

BaseProvider has no domainfilter feature as far as I see GetDomainFilter. Do you intend that the 3rd party provider filters for us?
If it would be a generic feature we can add it here which frees the 3rd party providers from that code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I need to think about how this will be 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this, the issue is really that the provider interface has GetDomainFilter() endpoint.DomainFilterInterface which returns an object that does the filter operation and that wouldn't really make sense to do over HTTP.

The other problem, is also that we need to add the two other functions:

  • PropertyValuesEqual
  • AdjustEndpoints

My thinking is that we can add those two, but GetDomainFilter's logic needs to be implemented by the providers. WDYT @szuecs ?

Copy link
Contributor

@szuecs szuecs Feb 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in person:

  • PropertyValuesEqual
  • AdjustEndpoints
    will be part of the webhook interface.

DomainFilter somehow needs to be done by the provider itself and is out of scope right now.
Maybe we can think about a list of regexp domain filter in the plugin provider and expose config via cli or configmap, not sure if this would be sufficient. This can be also done as an enhancement later.

docs/tutorials/webhook-provider.md Outdated Show resolved Hide resolved
Signed-off-by: Raffaele Di Fazio <difazio.raffaele@gmail.com>
Signed-off-by: Raffaele Di Fazio <difazio.raffaele@gmail.com>
@Raffo
Copy link
Contributor Author

Raffo commented Sep 22, 2023

@johngmyers PTAL when you have time.

@johngmyers
Copy link
Contributor

/lgtm
Any last comments @mloiseleur ?

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 23, 2023
@mloiseleur
Copy link
Contributor

/lgtm

@Raffo
Copy link
Contributor Author

Raffo commented Sep 25, 2023

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Raffo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 25, 2023
@k8s-ci-robot k8s-ci-robot merged commit 8251b6d into master Sep 25, 2023
9 checks passed
@waldner
Copy link

waldner commented Sep 26, 2023

Thanks for the great work! Is some template repository in the style of https://github.com/cert-manager/webhook-example planned? I think it would be useful to get people started writing their own external-dns webhook.

@Raffo
Copy link
Contributor Author

Raffo commented Sep 27, 2023

@waldner I have a reference implementation that I have used for the initial testing in https://github.com/Raffo/external-dns-aws-plugin , but I still have to update that with the latest content of this PR. The folks at IONOS have done a good work at writing a very well documented provider: https://github.com/ionos-cloud/external-dns-ionos-webhook .

@akrieg-ionos
Copy link
Contributor

@Raffo Do you have an idea when this PR will be released ?

@Raffo
Copy link
Contributor Author

Raffo commented Sep 28, 2023

@akrieg-ionos We have made a release 3 weeks ago, haven't planned a new one yet. You can start using the images from https://gcr.io/k8s-staging-external-dns/external-dns if you don't want to build one, but I will need to consult the other maintainers to see what else we want/need to merge before cutting a release.

@muhlba91
Copy link
Contributor

@Raffo are there already any updates on when the next release is planned?

@linki
Copy link
Member

linki commented Nov 9, 2023

This was released as part of v0.14.0.

Great job @Raffo @johngmyers @mloiseleur 👏

mconfalonieri added a commit to mconfalonieri/external-dns that referenced this pull request Nov 10, 2023
Added reference to webhook provider, after looking at PR [kubernetes-sigs#3063](kubernetes-sigs#3063)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. new-provider size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. tide/merge-method-squash Denotes a PR that should be squashed by tide when it merges.
Projects
None yet
Development

Successfully merging this pull request may close these issues.