Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[etcd] Set reasonable cluster connection/sync settings by default #2860

Merged
merged 2 commits into from
Nov 9, 2020

Conversation

vdarulis
Copy link
Collaborator

@vdarulis vdarulis commented Nov 9, 2020

What this PR does / why we need it:

All of the services depend on timely data from KV and default etcd keepalive settings are far too relaxed, and can cause outages due to 5-10 minute latency, before dead connections are closed and retried.

Also, we are not setting DialTimeout (probably because pre-3.4 it would cause client to block, but now it's optional), and not using keepalives if no streams are currently active, which is a good idea in watch-happy code.

Special notes for your reviewer:

Does this PR introduce a user-facing and/or backwards incompatible change?:


Does this PR require updating code package or user-facing documentation?:


Copy link
Collaborator

@abliqo abliqo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vdarulis vdarulis merged commit 6536b7c into master Nov 9, 2020
@vdarulis vdarulis deleted the v/etcdclient_keepalive branch November 9, 2020 17:58
soundvibe added a commit that referenced this pull request Nov 11, 2020
* master: (28 commits)
  [dbnode] Add claims for index segments volume index (#2846)
  [dbnode] Remove namespaces from example config and integration tests (#2866)
  [dbnode] Resurrect flaky test skip (#2868)
  [aggregator] Fix checkCampaignStateLoop (#2867)
  [dbnode] implement deletion method in namespace kvadmin service (#2861)
  Replace closer with resource package (#2864)
  Add coding style guide (#2831)
  Add GOVERNANCE.md to describe governance (#2830)
  Add COMPATIBILITY.md to describe version compatibility (#2829)
  Refactor etcd config as discovery section with convenience types (#2843)
  Refactor x/lockfile into dbnode/server (#2862)
  [lint] Disable nlreturn linter (#2865)
  [m3cluster] Expose placement algorithm in placement service (#2858)
  [etcd] Set reasonable cluster connection/sync settings by default (#2860)
  [dbnode] Use bits.LeadingZeros64 to improve encoder performance (#2857)
  Cleanup m3nsch leftovers (#2856)
  Update ci-scripts to correct coverage tracking (#2854)
  [aggregator] Process campaign state without waiting for first campaign check interval (#2855)
  Bump go to 1.14 (#2853)
  [query] Remove single series error from M3
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants