
Cache ListRecords result #178

Closed
ideahitme opened this issue Apr 27, 2017 · 15 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature.
lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@ideahitme

ideahitme commented Apr 27, 2017

Currently each iteration of the synchronisation loop requires external-dns to fetch the full list of records from the DNS provider. In the general case this can be avoided by caching the records in memory (a rough sketch follows at the end of this comment). We can define a cache with a lease period, which is refreshed in two scenarios:

  1. The lease period expires.
  2. Creating records fails due to a name overlap.

In the general case this should greatly reduce API usage, especially when the create API rarely fails (which never happens if the DNS provider is used solely by a single instance of external-dns). For a stable, moderately active cluster (with few ingresses/services being created or modified), external-dns would reduce its interaction with the DNS provider to the bare minimum: on average one request per lease period.

This behaviour could be made pluggable via a command-line flag, e.g. --enable-cache.
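A minimal sketch of the idea, assuming a hypothetical `CachedProvider` wrapper; the `Provider` interface and `Record` type below are illustrative stand-ins, not the actual external-dns types:

```go
// Hypothetical sketch only: the Provider interface and Record type are
// illustrative stand-ins, not the actual external-dns API.
package cache

import (
	"sync"
	"time"
)

// Record is a simplified stand-in for a DNS record.
type Record struct {
	Name, Type, Target string
}

// Provider is a stand-in for the real DNS provider abstraction.
type Provider interface {
	Records() ([]Record, error)
	ApplyChanges(changes []Record) error
}

// CachedProvider answers Records() from memory until the lease expires or a
// change submission fails (e.g. because of a name overlap).
type CachedProvider struct {
	Provider
	Lease time.Duration

	mu        sync.Mutex
	records   []Record
	refreshed time.Time
}

// Records returns the cached records while the lease is valid and only falls
// back to the wrapped provider when the cache is empty or expired.
func (c *CachedProvider) Records() ([]Record, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.records != nil && time.Since(c.refreshed) < c.Lease {
		return c.records, nil // cache hit: no request to the DNS provider
	}
	recs, err := c.Provider.Records()
	if err != nil {
		return nil, err
	}
	c.records, c.refreshed = recs, time.Now()
	return recs, nil
}

// ApplyChanges forwards the changes and, on failure (scenario 2 above), drops
// the cache so the next loop iteration re-lists from the provider.
func (c *CachedProvider) ApplyChanges(changes []Record) error {
	if err := c.Provider.ApplyChanges(changes); err != nil {
		c.mu.Lock()
		c.records = nil
		c.mu.Unlock()
		return err
	}
	return nil
}
```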

@ideahitme ideahitme added the kind/feature Categorizes issue or PR as related to a new feature. label Apr 27, 2017
@ideahitme
Author

/cc @linki @justinsb @iterion @hjacobs thoughts? I think this could potentially be useful in view of the AWS rate-limiting issues.

@linki
Member

linki commented Apr 27, 2017

In general 👍 But how does this differ from setting the --interval to a higher value?

@ideahitme
Author

Setting --interval to a higher value would mean users have to wait longer for their records to be created. With the cache enabled we would "try our luck" by creating (not upserting) the records, and if that fails, refresh the cache. With a lease period of 1 hour and an interval of 1 minute, each minute we would only need to send a ChangeRecords request to the DNS provider if something has changed according to the cache (without actually polling for records first). Compared to the current implementation, over one hour we could make as few as 1/60 of the requests.
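To make the arithmetic above concrete, here is a rough sketch of how the loop could use such a cache, reusing the hypothetical `CachedProvider` and `Record` from the sketch in the opening comment (the `plan` helper is a placeholder, not the real external-dns planning logic):

```go
// Rough sketch of the reconciliation loop with the cache enabled. With
// interval=1m and Lease=1h, the provider is listed at most once per hour
// unless an ApplyChanges call fails and invalidates the cache.
func syncLoop(p *CachedProvider, desired func() []Record, interval time.Duration) {
	for range time.Tick(interval) {
		current, err := p.Records() // usually served from the in-memory cache
		if err != nil {
			continue
		}
		missing := plan(desired(), current) // compute changes against cached state
		if len(missing) == 0 {
			continue // nothing to do: no request sent to the DNS provider
		}
		// Create (not upsert); a failure clears the cache so the next
		// iteration re-lists and re-plans against the provider's real state.
		_ = p.ApplyChanges(missing)
	}
}

// plan is a trivial placeholder diff: desired records not present in current.
func plan(desired, current []Record) []Record {
	seen := make(map[string]bool, len(current))
	for _, r := range current {
		seen[r.Name+"/"+r.Type] = true
	}
	var out []Record
	for _, r := range desired {
		if !seen[r.Name+"/"+r.Type] {
			out = append(out, r)
		}
	}
	return out
}
```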

@linki
Member

linki commented Apr 27, 2017

sgtm 👍

@hjacobs
Contributor

hjacobs commented Apr 27, 2017

@ideahitme what would you use as the default TTL for the cache? I think 1 hour is far too long, as it would mean that "manually" changed or deleted records would only be "restored" after one hour. IMHO the system should strive for correctness, i.e. Kubernetes state should reflect real DNS state. I guess something in the range of minutes is good enough, e.g. we could reduce the default interval to 30s and have a cache TTL of 300s (5 minutes):

  • the reduced default interval makes sure that any changes in Kubernetes (mostly triggered by users) are quickly synced to DNS ➡️ users get quick feedback
  • the 5 min cache makes sure that we stay within (AWS) API rate limits
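For illustration only, a hedged sketch of how these two knobs might be wired up; --interval exists today, --enable-cache is the flag proposed in this issue, and --cache-ttl is a hypothetical name for the lease, with the defaults suggested above:

```go
// Hypothetical flag wiring: --interval already exists in external-dns,
// --enable-cache is the flag proposed in this issue, and --cache-ttl is an
// illustrative name for the lease (defaults per this comment: 30s / 5m).
package main

import (
	"flag"
	"fmt"
	"time"
)

var (
	interval    = flag.Duration("interval", 30*time.Second, "time between sync-loop iterations")
	enableCache = flag.Bool("enable-cache", false, "serve provider records from an in-memory cache")
	cacheTTL    = flag.Duration("cache-ttl", 5*time.Minute, "how long cached records are treated as fresh")
)

func main() {
	flag.Parse()
	// With these defaults a quiet zone drops from ~120 list calls per hour
	// (one per 30s iteration) to ~12 (one per 5-minute lease).
	fmt.Println("interval:", *interval, "cache:", *enableCache, "ttl:", *cacheTTL)
}
```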

@ideahitme
Author

@hjacobs yes, 1 hour is just an example :D but even a TTL of 5 minutes would be a huge win: it minimises the number of potential clashes (after manual changes) and significantly reduces the number of AWS API requests.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 21, 2019
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 21, 2019
@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Contributor

@fejta-bot: Closing this issue.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@raravena80

raravena80 commented Sep 30, 2019

/reopen

@raxod502-plaid

The feature was never implemented, so it was inappropriate for this issue to be closed. ExternalDNS still badly needs a cache; with a large number of records in a hosted zone, it will max out Route 53 rate limits every time the sync loop runs.

@tehlers320

/reopen

does that work @raxod502-plaid ?

@k8s-ci-robot
Contributor

@tehlers320: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen

does that work @raxod502-plaid ?

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

lou-lan pushed a commit to lou-lan/external-dns that referenced this issue May 11, 2022
…-sigs` (kubernetes-sigs#178)

* Change import paths from GoogleContainerTools -> kubernetes-sigs

* Replace all remaining occurances GoogleContainerTools -> kubernetes-sigs

With this commit the krew-index is also switched to kubernetes-sigs
@raxod502-plaid

does that work @raxod502-plaid ?

Sorry, would you mind clarifying what you're asking?
