Controller: Allow reconciling resources in parallel #346

Closed
nightkr opened this issue Dec 12, 2020 · 2 comments · Fixed by #347
Labels
docs (unclear documentation), question (direction unclear; possibly a bug, possibly could be improved), runtime (controller runtime related)

Comments

@nightkr
Member

nightkr commented Dec 12, 2020

This isn't really documented properly ATM, but my mental rules for the controller are roughly the following:

  1. A given controller MUST NEVER run two reconcilers for the same object in parallel
  2. A controller MAY reconcile separate objects in parallel (which could be anything between 1, a static limit, or all of them; this is implementation defined)
  3. Multiple controllers MAY reconcile the same object in parallel (there should be no difference between running them in-process vs separate processes)

Currently we uphold these rules by effectively running each controller "single-tasked", but that makes kube-runtime pretty much unusable for some kinds of operators. For example, a single Rook orchestration can easily take multiple minutes (most of which is spent waiting for the managed services).

However, there are some things that would have to be resolved when tackling this:

  1. Avoid head-of-line blocking where other objects are kept in the queue rather than being scheduled because the current head of the queue is busy
  2. Ensure that each object can still only occupy one queue slot
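To make the two requirements above concrete, here is a minimal std-only sketch of a queue that gives each object at most one slot and lets the caller skip entries that are currently busy. `DedupQueue`, `push`, and `pop_excluding` are hypothetical names chosen for illustration; they are not kube-runtime APIs, and object keys are simplified to plain strings.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical FIFO queue of object keys in which each key holds at most one slot.
#[derive(Default)]
pub struct DedupQueue {
    order: VecDeque<String>,  // keys in arrival order
    queued: HashSet<String>,  // same keys, for O(1) membership checks
}

impl DedupQueue {
    /// Enqueue `key`. Returns false if it was already waiting,
    /// in which case no second slot is taken.
    pub fn push(&mut self, key: &str) -> bool {
        if !self.queued.insert(key.to_string()) {
            return false;
        }
        self.order.push_back(key.to_string());
        true
    }

    /// Pop the oldest key that is not in `busy`. Busy keys are skipped but
    /// stay queued, so a busy head does not block the entries behind it.
    pub fn pop_excluding(&mut self, busy: &HashSet<String>) -> Option<String> {
        let idx = self.order.iter().position(|k| !busy.contains(k))?;
        let key = self.order.remove(idx)?;
        self.queued.remove(&key);
        Some(key)
    }

    /// Number of keys currently waiting.
    pub fn len(&self) -> usize {
        self.order.len()
    }
}
```

The linear scan in `pop_excluding` is the simplest thing that works; a real scheduler would probably want something cheaper when many queued objects are busy.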
@nightkr nightkr added docs unclear documentation question Direction unclear; possibly a bug, possibly could be improved. runtime controller runtime related labels Dec 12, 2020
@clux
Member

clux commented Dec 13, 2020

Yeah, that matches my rules as well.

Architecture-wise, async ought to mean rule 2 for us, and we ought to strive towards that property.

But the problem case is rule 1, so if we were to just use buffer_unordered, it could lead to a pretty racy controller.

I can see a potential way this could be done by having three queues:

  • one for tracking what reconciles are in_progress
  • one for incoming events plus schedule_tx sends (probably just the stream we already have)
  • main pending queue used to pull from using buffer_unordered that actually calls reconcile.

It might be a bit hairy getting the scheduler to "rebalance" the queues on every event depending on what's in progress, but we could potentially break up the big controller stream flow into two flows. One that handles input delegation and scheduling, and one that handles output (in parallel).
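As a rough illustration of that split, the sketch below models only the bookkeeping: an input side that files events into `pending`, and an output side that starts reconciles while respecting `in_progress` and a static parallelism limit. Everything here (`Scheduler`, `on_event`, `start_next`, `finish`, `max_parallel`) is a hypothetical name, not the kube-runtime design, and the actual async stream plumbing is left out.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical scheduler state; none of these names come from kube-runtime.
pub struct Scheduler {
    max_parallel: usize,          // static limit on concurrent reconciles (rule 2)
    pending: VecDeque<String>,    // keys waiting to be reconciled, oldest first
    in_progress: HashSet<String>, // keys whose reconciler is currently running
}

impl Scheduler {
    pub fn new(max_parallel: usize) -> Self {
        Scheduler {
            max_parallel,
            pending: VecDeque::new(),
            in_progress: HashSet::new(),
        }
    }

    /// Input side: an incoming event or reschedule request for `key`.
    /// A key that is already pending does not get a second slot.
    pub fn on_event(&mut self, key: &str) {
        if !self.pending.iter().any(|k| k == key) {
            self.pending.push_back(key.to_string());
        }
    }

    /// Output side: pick the next key to reconcile, if any is eligible.
    /// Keys that are already running are skipped, which upholds rule 1.
    pub fn start_next(&mut self) -> Option<String> {
        if self.in_progress.len() >= self.max_parallel {
            return None;
        }
        let idx = self
            .pending
            .iter()
            .position(|k| !self.in_progress.contains(k))?;
        let key = self.pending.remove(idx)?;
        self.in_progress.insert(key.clone());
        Some(key)
    }

    /// Called when the reconciler for `key` has returned.
    pub fn finish(&mut self, key: &str) {
        self.in_progress.remove(key);
    }
}
```

An event that arrives while its object is being reconciled simply waits in `pending` until `finish` is called, so the object is reconciled again afterwards rather than concurrently.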

@nightkr
Member Author

nightkr commented Dec 13, 2020

> I can see a potential way this could be done by having three queues:
>
> * one for tracking what reconciles are `in_progress`
> * one for incoming events plus `schedule_tx` sends (probably just the stream we already have)
> * main `pending` queue used to pull from using `buffer_unordered` that actually calls `reconcile`.

Yup, that's what I was thinking too. Though pending would probably have to be integrated into the scheduler rather than the applier (/Controller).

And I'm not really sure what a good design would be for asking something that is still supposed to be a stream for "give me anything except for these objects".

nightkr added a commit to Appva/kube-rs that referenced this issue Dec 17, 2020
This is a first step towards allowing parallel reconcilers, while respecting
the laws laid out in kube-rs#346.
nightkr added a commit to Appva/kube-rs that referenced this issue Dec 19, 2020
@clux clux closed this as completed in #347 Dec 23, 2020