Batch events #175

Merged: 6 commits merged into main from feature/batch-events on Aug 16, 2022
Conversation

@pleshakov (Contributor) commented:
Proposed changes

Previously, the Gateway would handle one event (upserting or deleting a
Kubernetes resource) at a time.

This PR introduces event batching: multiple events (a batch) are
handled at once.

Batching is needed because handling an event (or multiple events at once)
typically results in reloading NGINX, which is the operation we want to
minimize, for the following reasons:
(1) A reload takes time: at least 200ms. The time depends on the size of
the configuration (including the number of TLS certs) and the available
CPU cycles.
(2) A reload can have side effects for the data plane traffic.

Now, when a new event comes in, there are two cases (see the sketch below):

  • If no events are currently being handled, the new event is handled
    immediately.
  • Otherwise, the new event is saved for later handling. All saved
    events are handled after the handling of the current event(s)
    finishes, at once in one batch.

Additional implementation notes:

(a) The EventLoop was split into two parts:
(1) The EventHandler, which is only responsible for handling a batch of
events, without dealing with concurrency.
(2) The stripped-down EventLoop, which is responsible for batching events
and propagating batches to the EventHandler in a dedicated goroutine.

(b) The ChangeProcessor was fixed so that when multiple changes from a
single batch are captured, with changes that cause data plane
reconfiguration followed by changes that do not, the data plane is still
reconfigured. Without the fix, the latter changes would prevent the data
plane from reconfiguring. Note that the bug only surfaced once batching
was added. A sketch of the fix follows below.
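
A minimal sketch of the fix, in Go. The method names (CaptureUpsertChange,
Process) and the requiresReconfig predicate are assumptions loosely based
on the internal/state/change_processor.go path above, for illustration only:

```go
package state

import "sync"

type ChangeProcessor struct {
	lock    sync.Mutex
	changed bool // true if any captured change requires data plane reconfiguration
	// ... captured cluster state omitted ...
}

// CaptureUpsertChange records an upserted resource (deletes are handled similarly).
func (cp *ChangeProcessor) CaptureUpsertChange(obj interface{}) {
	cp.lock.Lock()
	defer cp.lock.Unlock()

	// Buggy version: cp.changed = requiresReconfig(obj)
	// A later change that does not require reconfiguration would overwrite a
	// previously set flag, so the whole batch would be treated as a no-op.
	//
	// Fixed version: only ever set the flag, never clear it mid-batch.
	if requiresReconfig(obj) {
		cp.changed = true
	}
}

// Process reports whether the captured changes require reconfiguration
// and resets the flag for the next batch.
func (cp *ChangeProcessor) Process() bool {
	cp.lock.Lock()
	defer cp.lock.Unlock()

	changed := cp.changed
	cp.changed = false
	return changed
}

// requiresReconfig is a hypothetical predicate for whether a change
// affects the data plane configuration.
func requiresReconfig(obj interface{}) bool {
	// ... details omitted ...
	return true
}
```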

Review threads (resolved) on:
  • internal/events/handler.go
  • internal/events/loop.go
  • internal/state/change_processor.go
@kate-osborn (Contributor) left a comment:

🚀

@pleshakov merged commit 7f831b5 into main on Aug 16, 2022
@pleshakov deleted the feature/batch-events branch on Aug 16, 2022 at 17:21
@lucacome added the "enhancement" (New feature or request) label on Aug 19, 2022
Labels: enhancement (New feature or request)

4 participants