aws: cache zones list #1704

Merged
merged 1 commit into from
Sep 10, 2020

Conversation

bpineau
Contributor

@bpineau bpineau commented Aug 4, 2020

When it syncs AWS DNS with k8s cluster content (at --interval), external-dns submits two distinct Route53 API calls:

  • to fetch available zones (e.g. for tag-based zone discovery, or when zones are created after external-dns started),
  • to fetch relevant zones' resource records.

Each call counts against the Route53 API call budget (a hard limit of 5 API calls per second per AWS account/region), increasing the probability of being throttled. Increasing the synchronisation interval would mitigate the impact of those calls, but at the cost of keeping stale records for longer.

For most practical use cases, the zones list isn't expected to change frequently, even less so when external-dns is given an explicit, static set of zones (--zone-id-filter rather than --aws-zone-tags).

Using a zones list cache halves the number of Route53 read API calls.
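
To make the idea concrete, here is a minimal sketch of a time-based zones list cache. It is not the code merged in this PR: the `zonesCache` type, its `get` method, and the `countZoneFetches` helper are hypothetical names, and the `fetch` callback only stands in for the real Route53 zones listing.

```go
package main

import (
	"fmt"
	"time"
)

// zonesCache is a hypothetical TTL cache for a zones list; it is an
// illustration, not the type introduced by this PR.
type zonesCache struct {
	ttl     time.Duration
	expires time.Time
	zones   []string
}

// get returns the cached zones while they are still fresh. Otherwise it calls
// fetch (a stand-in for the Route53 zones listing) and stores the result.
// A ttl of zero or less disables caching: every call goes to fetch.
func (c *zonesCache) get(now time.Time, fetch func() ([]string, error)) ([]string, error) {
	if c.ttl > 0 && c.zones != nil && now.Before(c.expires) {
		return c.zones, nil
	}
	zones, err := fetch()
	if err != nil {
		return nil, err
	}
	if c.ttl > 0 {
		c.zones = zones
		c.expires = now.Add(c.ttl)
	}
	return zones, nil
}

// countZoneFetches simulates `syncs` sync loops spaced `interval` apart and
// returns how many times the zones list was actually fetched.
func countZoneFetches(syncs int, interval, ttl time.Duration) int {
	cache := &zonesCache{ttl: ttl}
	fetches := 0
	fetch := func() ([]string, error) {
		fetches++
		return []string{"example.org."}, nil
	}
	now := time.Unix(0, 0)
	for i := 0; i < syncs; i++ {
		if _, err := cache.get(now, fetch); err != nil {
			panic(err)
		}
		now = now.Add(interval)
	}
	return fetches
}

func main() {
	fmt.Println(countZoneFetches(60, time.Minute, 0))         // cache disabled
	fmt.Println(countZoneFetches(60, time.Minute, time.Hour)) // one-hour cache
}
```

With a one-minute sync interval, the simulation fetches the zones list 60 times in an hour without a cache and once with a one-hour cache, so the records listing becomes the only recurring read call.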

Example

Route53 API calls before/after deploying that change at 08:50, on a single cluster:
[Figure: external-dns AWS API calls]

Route53 calls returning HTTP 400, after rolling the change out to several clusters at 10:44:
[Figure: external-dns Route53 calls/s, status=400]

Log entries containing "throttling", several clusters:
[Figure: external-dns AWS throttling]

Checklist

  • Update changelog in CHANGELOG.md, use section "Unreleased".

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 4, 2020
@k8s-ci-robot
Contributor

Welcome @bpineau!

It looks like this is your first PR to kubernetes-sigs/external-dns 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/external-dns has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Aug 4, 2020
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Aug 4, 2020
@bpineau
Contributor Author

bpineau commented Aug 5, 2020

/assign @Raffo

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 5, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 6, 2020
@seanmalloy
Member

/kind feature

@k8s-ci-robot k8s-ci-robot added kind/feature Categorizes issue or PR as related to a new feature. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Aug 19, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 19, 2020
Member

@njuettner njuettner left a comment

I think this is helpful @bpineau, thank you for creating this PR 👍.
In case I don't want caching, any chance I can avoid that?

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 19, 2020
@bpineau bpineau force-pushed the aws-cache-zones-list branch 2 times, most recently from 26e12d1 to 2c48eb7 Compare August 20, 2020 06:16
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 20, 2020
@bpineau
Contributor Author

bpineau commented Aug 20, 2020

Thanks @njuettner, --aws-zones-cache-duration=0s will now disable zones caching.
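
As an illustration of how such a duration flag behaves, the sketch below parses `--aws-zones-cache-duration` with Go's standard `flag` package and treats a zero value as "caching disabled". This is an assumption made for the example: external-dns's actual flag wiring may differ, and `parseZonesCacheDuration` and its default of 0 are hypothetical.

```go
package main

import (
	"flag"
	"fmt"
	"time"
)

// parseZonesCacheDuration parses a command line and returns the value given
// to --aws-zones-cache-duration. The flag wiring and the default of 0 are
// illustrative assumptions, not external-dns's real configuration code.
func parseZonesCacheDuration(args []string) (time.Duration, error) {
	fs := flag.NewFlagSet("external-dns", flag.ContinueOnError)
	d := fs.Duration("aws-zones-cache-duration", 0, "zones list cache duration (0s disables caching)")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *d, nil
}

func main() {
	for _, args := range [][]string{
		{"--aws-zones-cache-duration=0s"},
		{"--aws-zones-cache-duration=3h"},
	} {
		d, err := parseZonesCacheDuration(args)
		if err != nil {
			panic(err)
		}
		// A duration of zero means every sync lists the zones again.
		fmt.Printf("%v -> caching enabled: %t\n", args, d > 0)
	}
}
```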

@njuettner
Member

One small thing @bpineau: could you add one or two sentences to the docs? I'd like to have this documented somewhere so people understand how they can turn it off and what the impact is. I think this PR is pretty important for people who are dealing with throttling issues.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 20, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 20, 2020
@tariq1890
Contributor

@bpineau Thank you for this PR. This feature would help us as well.

I also think we should default to disabling the cache, because many users would be blindsided by this and wonder why they are getting stale data. We should be careful about changing default behaviours IMO.

@bpineau
Contributor Author

bpineau commented Aug 21, 2020

@tariq1890 ok, updated accordingly

@tariq1890
Contributor

@bpineau Looks like you need to rebase your PR. Sorry :(

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 27, 2020
@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. and removed needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Aug 28, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 31, 2020
@bpineau
Contributor Author

bpineau commented Aug 31, 2020

@njuettner would you mind taking another look? Thanks!

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 1, 2020
@Raffo
Contributor

Raffo commented Sep 2, 2020

@bpineau sorry for the changelog mess, please rebase once more.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 3, 2020
@Raffo
Contributor

Raffo commented Sep 9, 2020

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 9, 2020
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bpineau, Raffo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 9, 2020
@k8s-ci-robot k8s-ci-robot merged commit 8b81c10 into kubernetes-sigs:master Sep 10, 2020
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
6 participants