Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate Leader for life based leader election #117

Open
varshaprasad96 opened this issue Jul 10, 2023 · 0 comments
Open

Deprecate Leader for life based leader election #117

varshaprasad96 opened this issue Jul 10, 2023 · 0 comments

Comments

@varshaprasad96
Copy link
Member

Feature Request

Is your feature request related to a problem? Please describe.
Currently, the repository contains a leader-for-life based election model that ensures that a single leader is elected for life during a HA state.

To briefly describe what leader for life and leader for lease approaches are:
(1) Leader for life: A leader pod is selected for life until its garbage collected. This ensures there is only one leader at any instant of time.
(2) Leader for lease: Leader election happens periodically at a defined interval which can be tweaked. When the current leader is not able to renew the lease, a new leader is elected.

More details on both of these approaches are available here.

Approach (2) is implemented upstream, in client-go and is scaffolded by default for the past 30+ releases of SDK, since it is implemented as a part of setting up the manager in controller-runtime. It guarantees faster election of leaders, less downtime, recovery from disconnected/frozen node failures. However, it does not eliminate the split brain scenario - where more than a single leader is available at an instant of time.

Approach (1) on the other hand, was developed long before we had a leader-for-lease implemented upstream. Though it solves the split brain scenario, it does not guarantee recovery from node failure nor faster recovery. We also have issues with integrating it to controller-runtime (#48). Neither is it being maintained nor used as widely as leader for lease.

Here is a detailed comment explaining the preference of (2) over the other, wherein users would prefer a faster recovery even though there is split brain scenario intermittently, rather than an implementation that does not guarantee faster recovery.

Describe the solution you'd like
Since leader for life approach is not being widely used, neither works with controller-runtime seamlessly, it is better to adopt a well tested upstream library than to depend on what is currently available as an option.

The solution for this is:

  1. To deprecate and remove leader for life in future releases of Operator SDK.
  2. To bring this up upstream (in controller-runtime), for easier integration. This had already been brought up upstream (FR: Add leader-for-life implementation for leader-election kubernetes-sigs/controller-runtime#1963) but there was no response on the same.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant