Chart & Operator support for replica configuration #47

Open · 2 tasks

mrajashree opened this issue Sep 30, 2020 · 2 comments

mrajashree (Contributor) commented Sep 30, 2020

Context: Today the BRO chart and operator aren't designed or intended to run with multiple replicas. This is a known aspect of the BRO design; for now it is intended to run as a "singleton" pod.

#41 (comment)

To consider:

  • Manage a leader-election lease when more than one operator pod (controller) is running (see the leader-election sketch later in this thread)
  • Adjust the chart to allow the replica count to be configured (see the values sketch below)
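
For the chart half, a minimal sketch of what a configurable replica count could look like, assuming a conventional Helm layout; the `replicaCount` key and file names are hypothetical and not taken from the actual chart:

```yaml
# values.yaml (hypothetical key)
replicaCount: 1

# templates/deployment.yaml (excerpt)
spec:
  replicas: {{ .Values.replicaCount | default 1 }}
```

Bumping the replica count alone would be unsafe until the operator handles leader election, since multiple active controllers would race on the same backup/restore resources.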
MKlimuszka commented

This is a ticket to track tech debt that was pointed out in a review of an old PR that had been merged.

@zube zube bot added this to the v2.x - Backlog milestone Nov 6, 2023
@MKlimuszka MKlimuszka removed this from the v2.x - Backlog milestone Aug 21, 2024
@mallardduck mallardduck changed the title Manage lease when more than one operator pods(controllers) are running [RFE] Chart & Operator support for replica configuration Nov 1, 2024
@mallardduck mallardduck changed the title [RFE] Chart & Operator support for replica configuration Chart & Operator support for replica configuration Dec 5, 2024
alexandreLamarre (Contributor) commented Dec 5, 2024

It seems the consideration here is to have a failover/redundancy mechanism for BRO.

If we aim to accomplish redundancy, then adding leader election to the BRO operator is a three-line code change (sketched below), plus a trivial CRD change to deploy the desired number of replicas. However, I don't see this as useful.
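
For concreteness, here is a minimal sketch of the kind of change being described, using controller-runtime's built-in leader election (mentioned later in this comment); the Lease name and namespace are hypothetical, and BRO's actual entrypoint and controller framework may differ:

```go
package main

import (
	ctrl "sigs.k8s.io/controller-runtime"
)

func main() {
	// Enabling leader election is essentially these three option fields;
	// non-leader replicas block in Start() until they acquire the Lease,
	// acting as warm standbys for failover.
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{
		LeaderElection:          true,
		LeaderElectionID:        "backup-restore-operator-lock", // hypothetical Lease name
		LeaderElectionNamespace: "cattle-resources-system",      // hypothetical; defaults to the pod's namespace in-cluster
	})
	if err != nil {
		panic(err)
	}

	// ... register backup/restore controllers with mgr here ...

	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}
```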

Although this would be a nice improvement, we haven't seen any of these issues in production. The scenarios where redundancy is useful as a failure mechanism are when the pod fails critically due to a sporadic software bug (in which case the bug should be fixed, and redundancy only speeds up recovery) or when resource saturation is reached.

When resource saturation is reached, either the cluster itself is having problems, which is outside the scope of what BRO can fix, or the resource limits are improperly configured, which is a user configuration issue. One genuinely useful application of redundancy is surviving node failures, e.g. by deploying BRO as a DaemonSet; but even a singleton will still be rescheduled onto healthy nodes, so the only gain there is speeding up failure recovery.

If the idea here is instead to scale up the BRO operator (i.e. optimize throughput), then redundancy isn't useful unless we shard the backup and restore workloads, and even that isn't especially useful since the workloads are IO (and thus CPU) bound. The only feasible speedup I see from sharding the operator's workloads is sharding across node resources, but that requires more sophisticated techniques than Kubernetes leases, controller-runtime leader election, or any kind of network load balancing.

I believe sharding will be the last resort for optimizing these types of workloads anyway, so even in this case I don't see it being useful.
