chore: 🐛 use reconcile to reduce thrash in pod watch and service watch #364

cmwylie19 · 2024-04-23T15:05:41Z

Description

We are watching for changes to the Kubernetes service and Neuvector Jobs in order to update network policy. Each change triggers a cascade of events, which likely causes thrash in the kube-apiserver. If we put the events into a queue, they will be processed one at a time in the order in which they came in, cutting down the load on the API server.
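For illustration, the queueing idea can be sketched as below. This is a minimal sketch and not the code in this PR: `SerialQueue`, `sleep`, `demo`, and the event names are hypothetical, and a real implementation would hand events over from the watch callback rather than from a hard-coded list.

```typescript
// Hypothetical helper: runs async handlers one at a time, in arrival order.
class SerialQueue<T> {
  // The tail of the promise chain; each new item waits for it to settle.
  private tail: Promise<void> = Promise.resolve();

  constructor(private readonly handler: (item: T) => Promise<void>) {}

  // Chain the new item behind whatever is already queued.
  enqueue(item: T): Promise<void> {
    const run = this.tail.then(() => this.handler(item));
    // Swallow the error on the chain only, so one failed handler
    // does not block later items. The caller still sees it via `run`.
    this.tail = run.catch(() => {});
    return run;
  }
}

const sleep = (ms: number) =>
  new Promise<void>(resolve => setTimeout(resolve, ms));

// Simulated events with different processing times. Without the queue,
// the fastest handler would finish first; with it, arrival order is kept.
async function demo(): Promise<string[]> {
  const processed: string[] = [];
  const queue = new SerialQueue<{ name: string; delayMs: number }>(
    async evt => {
      await sleep(evt.delayMs);
      processed.push(evt.name);
    },
  );

  await Promise.all([
    queue.enqueue({ name: "svc-update-1", delayMs: 30 }),
    queue.enqueue({ name: "svc-update-2", delayMs: 10 }),
    queue.enqueue({ name: "svc-update-3", delayMs: 0 }),
  ]);
  return processed;
}
```

Because each handler starts only after the previous one settles, at most one handler's API calls are in flight at a time, which is the load reduction described above.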

Visual Proof

Look at the wild spikes in CPU on the watcher. With this ordered processing, CPU usage appears to be throttled, and the watcher is no longer hitting that strange frozen state.

See the issue for additional screenshots

52 mins no errors

First failure after 78 mins (memory seems to be growing)

Related Issue

Fixes - Hopefully 🤞 363

Relates to #363 defenseunicorns/pepr#745

Type of change

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Other (security config, docs update, etc)

Checklist before merging

Test, docs, adr added or updated as needed
Contributor Guide Steps (https://github.com/defenseunicorns/uds-template-capability/blob/main/CONTRIBUTING.md#submitting-a-pull-request) followed

Signed-off-by: Case Wylie <cmwylie19@defenseunicorns.com>

mjnagel · 2024-04-23T16:50:40Z

I opened a parallel PR to add reconcile for the service watch but not the pod one - #362. Might let this one simmer for a bit and see if it seems to help before pressing forward. Thinking about the case here for our job termination: unless we are seeing duplicate attempts to terminate the sidecar, I'm not sure sequential/queue-based processing is needed here.

cmwylie19 · 2024-04-23T22:31:43Z

no problem, we can close this. thanks! Close as you see fit

cmwylie19 added 2 commits April 23, 2024 10:04

chore: throw service changes into queue for order processing

ced2c5d

Signed-off-by: Case Wylie <cmwylie19@defenseunicorns.com>

chore: run in queue

176b747

Signed-off-by: Case Wylie <cmwylie19@defenseunicorns.com>

cmwylie19 requested a review from mjnagel April 23, 2024 15:05

cmwylie19 changed the title ~~🐛 : use reconcile to reduce thrash in pod watch and service watch~~ chore: 🐛 use reconcile to reduce thrash in pod watch and service watch Apr 23, 2024

mjnagel closed this Apr 24, 2024
