Fix deletion of DNS Records for VirtualServices when an Error occurs #3140

Merged
merged 1 commit into from
Nov 10, 2022

Conversation

ricoberger
Contributor

Description

This PR fixes a bug where external-dns deletes all DNS records that were created from VirtualServices when the Kubernetes API returns an error (#2858).

Before this change we saw the following in the external-dns logs:

2022-11-08T11:17:55.397663277Z stderr F {"file":"/sigs.k8s.io/external-dns/source/istio_virtualservice.go:207","func":"sigs.k8s.io/external-dns/source.(*virtualServiceSource).getGateway","level":"error","msg":"Failed retrieving gateway istio-system/istio-default-gateway referenced by VirtualService kobs/hub: Unauthorized","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.397675627Z stderr F {"file":"/sigs.k8s.io/external-dns/source/istio_virtualservice.go:167","func":"sigs.k8s.io/external-dns/source.(*virtualServiceSource).Endpoints","level":"debug","msg":"No endpoints could be generated from VirtualService kobs/hub","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.399294036Z stderr F {"file":"/sigs.k8s.io/external-dns/source/istio_virtualservice.go:207","func":"sigs.k8s.io/external-dns/source.(*virtualServiceSource).getGateway","level":"error","msg":"Failed retrieving gateway istio-system/istio-default-gateway referenced by VirtualService kobs/satellite: Unauthorized","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.399303892Z stderr F {"file":"/sigs.k8s.io/external-dns/source/istio_virtualservice.go:167","func":"sigs.k8s.io/external-dns/source.(*virtualServiceSource).Endpoints","level":"debug","msg":"No endpoints could be generated from VirtualService kobs/satellite","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.399380606Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:243","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).Zones","level":"debug","msg":"Refreshing zones list cache","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.634084053Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:289","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).Zones","level":"debug","msg":"Considering zone: /hostedzone/Z3QB1OZK4VUYV6 (domain: staffbase.dev.)","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.634131742Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:473","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).GetDomainFilter","level":"info","msg":"Applying provider record filter for domains: [staffbase.dev. .staffbase.dev.]","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.638343068Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:243","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).Zones","level":"debug","msg":"Refreshing zones list cache","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.851697866Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:289","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).Zones","level":"debug","msg":"Considering zone: /hostedzone/Z3QB1OZK4VUYV6 (domain: staffbase.dev.)","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.851720979Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:806","func":"sigs.k8s.io/external-dns/provider/aws.changesByZone","level":"debug","msg":"Adding kobssatellite-kobs.staffbase.dev. to zone staffbase.dev. [Id: /hostedzone/Z3QB1OZK4VUYV6]","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.851723747Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:806","func":"sigs.k8s.io/external-dns/provider/aws.changesByZone","level":"debug","msg":"Adding kobs.staffbase.dev. to zone staffbase.dev. [Id: /hostedzone/Z3QB1OZK4VUYV6]","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.851727723Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:806","func":"sigs.k8s.io/external-dns/provider/aws.changesByZone","level":"debug","msg":"Adding kobssatellite-kobs.staffbase.dev. to zone staffbase.dev. [Id: /hostedzone/Z3QB1OZK4VUYV6]","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.851730861Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:806","func":"sigs.k8s.io/external-dns/provider/aws.changesByZone","level":"debug","msg":"Adding a-kobssatellite-kobs.staffbase.dev. to zone staffbase.dev. [Id: /hostedzone/Z3QB1OZK4VUYV6]","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.851734414Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:806","func":"sigs.k8s.io/external-dns/provider/aws.changesByZone","level":"debug","msg":"Adding kobs.staffbase.dev. to zone staffbase.dev. [Id: /hostedzone/Z3QB1OZK4VUYV6]","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.851737499Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:806","func":"sigs.k8s.io/external-dns/provider/aws.changesByZone","level":"debug","msg":"Adding a-kobs.staffbase.dev. to zone staffbase.dev. [Id: /hostedzone/Z3QB1OZK4VUYV6]","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.851740678Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:516","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).submitChanges","level":"info","msg":"Desired change: DELETE a-kobs.staffbase.dev TXT [Id: /hostedzone/Z3QB1OZK4VUYV6]","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.851759692Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:516","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).submitChanges","level":"info","msg":"Desired change: DELETE a-kobssatellite-kobs.staffbase.dev TXT [Id: /hostedzone/Z3QB1OZK4VUYV6]","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.851762493Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:516","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).submitChanges","level":"info","msg":"Desired change: DELETE kobs.staffbase.dev A [Id: /hostedzone/Z3QB1OZK4VUYV6]","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.851764762Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:516","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).submitChanges","level":"info","msg":"Desired change: DELETE kobs.staffbase.dev TXT [Id: /hostedzone/Z3QB1OZK4VUYV6]","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.85176709Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:516","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).submitChanges","level":"info","msg":"Desired change: DELETE kobssatellite-kobs.staffbase.dev A [Id: /hostedzone/Z3QB1OZK4VUYV6]","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:55.851769362Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:516","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).submitChanges","level":"info","msg":"Desired change: DELETE kobssatellite-kobs.staffbase.dev TXT [Id: /hostedzone/Z3QB1OZK4VUYV6]","time":"2022-11-08T11:17:55Z"}
2022-11-08T11:17:56.130815951Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:533","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).submitChanges","level":"info","msg":"6 record(s) in zone staffbase.dev. [Id: /hostedzone/Z3QB1OZK4VUYV6] were successfully updated","time":"2022-11-08T11:17:56Z"}
2022-11-08T11:18:52.964113346Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:243","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).Zones","level":"debug","msg":"Refreshing zones list cache","time":"2022-11-08T11:18:52Z"}
2022-11-08T11:18:53.874412929Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:289","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).Zones","level":"debug","msg":"Considering zone: /hostedzone/Z3QB1OZK4VUYV6 (domain: staffbase.dev.)","time":"2022-11-08T11:18:53Z"}

With the proposed change, the error is returned from the function where it occurs (getGateway) instead of only being logged there, so the DNS records are not deleted.
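To illustrate the difference, here is a minimal, self-contained sketch. It is not the external-dns code: `lookupGateway`, `endpointsSwallowing`, `endpointsPropagating` and the `example.org` names are hypothetical stand-ins, and the only assumption carried over from this PR is that a failed gateway lookup used to be logged and skipped, whereas now it is handed back to the caller.

```go
package main

import (
	"errors"
	"fmt"
)

// errUnauthorized stands in for the error returned by the Kubernetes API.
var errUnauthorized = errors.New("Unauthorized")

// lookupGateway is a hypothetical stand-in for a gateway lookup that can fail.
func lookupGateway(name string, apiErr error) (string, error) {
	if apiErr != nil {
		return "", apiErr
	}
	return name + ".example.org", nil
}

// endpointsSwallowing mimics the old behaviour: a failed lookup is only
// logged, so the caller receives an empty list and no error.
func endpointsSwallowing(gateways []string, apiErr error) ([]string, error) {
	var targets []string
	for _, gw := range gateways {
		target, err := lookupGateway(gw, apiErr)
		if err != nil {
			fmt.Println("error (logged only):", err)
			continue
		}
		targets = append(targets, target)
	}
	return targets, nil
}

// endpointsPropagating mimics the fixed behaviour: the first failed lookup
// aborts the function and the error reaches the caller.
func endpointsPropagating(gateways []string, apiErr error) ([]string, error) {
	var targets []string
	for _, gw := range gateways {
		target, err := lookupGateway(gw, apiErr)
		if err != nil {
			return nil, err
		}
		targets = append(targets, target)
	}
	return targets, nil
}

func main() {
	gateways := []string{"istio-default-gateway"}

	targets, err := endpointsSwallowing(gateways, errUnauthorized)
	fmt.Printf("swallowing:  targets=%v err=%v\n", targets, err)

	targets, err = endpointsPropagating(gateways, errUnauthorized)
	fmt.Printf("propagating: targets=%v err=%v\n", targets, err)
}
```

In the swallowing variant the caller cannot tell "the API failed" apart from "there are no endpoints", which is the ambiguity this PR removes.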

With this change, the logs look as follows:

2022-11-08T13:21:53.211872478Z stderr F {"file":"/sigs.k8s.io/external-dns/source/istio_virtualservice.go:207","func":"sigs.k8s.io/external-dns/source.(*virtualServiceSource).getGateway","level":"error","msg":"Failed retrieving gateway istio-system/istio-default-gateway referenced by VirtualService kobs/hub: Unauthorized","time":"2022-11-08T13:21:53Z"}
2022-11-08T13:21:53.211883248Z stderr F {"file":"/sigs.k8s.io/external-dns/controller/controller.go:295","func":"sigs.k8s.io/external-dns/controller.(*Controller).Run","level":"error","msg":"Unauthorized","time":"2022-11-08T13:21:53Z"}
2022-11-08T13:22:52.236449148Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:243","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).Zones","level":"debug","msg":"Refreshing zones list cache","time":"2022-11-08T13:22:52Z"}
2022-11-08T13:22:53.118371381Z stderr F {"file":"/sigs.k8s.io/external-dns/provider/aws/aws.go:289","func":"sigs.k8s.io/external-dns/provider/aws.(*AWSProvider).Zones","level":"debug","msg":"Considering zone: /hostedzone/Z3QB1OZK4VUYV6 (domain: staffbase.dev.)","time":"2022-11-08T13:22:53Z"}
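
The two log excerpts differ in what happens after the gateway lookup fails: the first run goes on to submit six DELETE changes, the second logs the error in the controller and submits nothing. The following sketch shows why an error-free empty result is dangerous for a reconcile loop. It is a simplified model under stated assumptions, not the external-dns controller or plan code; `planDeletes` and `runOnce` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// errUnauthorized stands in for the error returned by the Kubernetes API.
var errUnauthorized = errors.New("Unauthorized")

// planDeletes returns every currently managed record that is absent from the
// desired set. An empty desired set therefore selects all records.
func planDeletes(current, desired []string) []string {
	want := make(map[string]bool, len(desired))
	for _, name := range desired {
		want[name] = true
	}
	var deletes []string
	for _, name := range current {
		if !want[name] {
			deletes = append(deletes, name)
		}
	}
	sort.Strings(deletes)
	return deletes
}

// runOnce is a hypothetical reconcile step: it asks the source for the
// desired records and stops before planning any change if the source fails.
func runOnce(current []string, source func() ([]string, error)) ([]string, error) {
	desired, err := source()
	if err != nil {
		return nil, err
	}
	return planDeletes(current, desired), nil
}

func main() {
	current := []string{"kobs.staffbase.dev", "kobssatellite-kobs.staffbase.dev"}

	// The source reports nothing and no error: every record is planned for deletion.
	deletes, err := runOnce(current, func() ([]string, error) { return nil, nil })
	fmt.Println("empty result:", deletes, err)

	// The source reports an error: the run stops and nothing is deleted.
	deletes, err = runOnce(current, func() ([]string, error) { return nil, errUnauthorized })
	fmt.Println("error result:", deletes, err)
}
```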

We tested this on our clusters for the following cases:

  • Create new DNS record for a VirtualService
  • Delete DNS record when a VirtualService is deleted
  • Do not delete DNS records in case of a node failure and an unreachable Kubernetes API

Fixes #2858

Checklist

  • Unit tests updated
  • End user documentation updated

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Nov 8, 2022
@k8s-ci-robot
Contributor

Welcome @ricoberger!

It looks like this is your first PR to kubernetes-sigs/external-dns 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/external-dns has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Nov 8, 2022
@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 8, 2022
- gateway := sc.getGateway(ctx, gateway, virtualService)
+ gateway, err := sc.getGateway(ctx, gateway, virtualService)
+ if err != nil {
+ 	return targets, err
Contributor

Should this be nil, err?

Contributor Author

I'm not sure about this. I used return targets, err because the code at L303 also returns the targets and the error.

tgs, err := sc.targetsFromGateway(gateway)

What do you think, should I change it?

Contributor

My rule of thumb is to return nil whenever you return an error.
So unless there is a particular reason to return targets here, please change it.
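
A short sketch of that rule of thumb, for illustration only: `resolve`, `errLookup` and the `example.org` names are hypothetical and not part of external-dns. The point is that returning nil next to the error keeps partially built results away from a caller that might forget to check the error.

```go
package main

import (
	"errors"
	"fmt"
)

// errLookup is a hypothetical error used only in this example.
var errLookup = errors.New("lookup failed")

// resolve builds a target for each host and fails when it reaches failOn.
// On failure it discards the partial result and returns nil with the error.
func resolve(hosts []string, failOn string) ([]string, error) {
	var targets []string
	for _, h := range hosts {
		if h == failOn {
			// Returning targets here would expose a half-filled slice.
			return nil, errLookup
		}
		targets = append(targets, h+".example.org")
	}
	return targets, nil
}

func main() {
	targets, err := resolve([]string{"a", "b"}, "b")
	fmt.Printf("targets=%v err=%v\n", targets, err)
}
```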

Contributor Author

I adjusted it to just return the error.

@szuecs
Contributor

szuecs commented Nov 10, 2022

/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: njuettner, ricoberger, szuecs

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@szuecs
Contributor

szuecs commented Nov 10, 2022

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 10, 2022
@k8s-ci-robot k8s-ci-robot merged commit a4ac1ce into kubernetes-sigs:master Nov 10, 2022
Development

Successfully merging this pull request may close these issues.

External-dns deletes all DNS records under management when k8s API call timeout occurs