buffer, limit: use tokio-util's PollSemaphore #556

Merged
Conversation
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
This reproduces an issue in previous versions of `tower::buffer`, where multiple calls to `poll_ready` when the buffer is already ready would acquire additional channel capacity that was never used. This essentially leaked channel capacity. This issue doesn't exist with the current implementation, but this test should help ensure it isn't reintroduced when changing the buffer internals (such as switching to `tokio-util`'s pollable channel). Signed-off-by: Eliza Weisman <eliza@buoyant.io>
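The behavior this commit's test guards against can be modeled in a few lines of std-only Rust. The `Capacity` and `BufferHandle` types below are hypothetical stand-ins, not tower's real `Buffer` or tokio's channel; the sketch only shows the invariant: repeated `poll_ready` calls while the service is already ready must not consume further capacity.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// A toy counting semaphore standing in for the buffer's channel capacity.
#[derive(Clone)]
struct Capacity(Rc<Cell<usize>>);

impl Capacity {
    fn new(permits: usize) -> Self {
        Capacity(Rc::new(Cell::new(permits)))
    }
    fn try_acquire(&self) -> bool {
        let n = self.0.get();
        if n > 0 {
            self.0.set(n - 1);
            true
        } else {
            false
        }
    }
    fn available(&self) -> usize {
        self.0.get()
    }
}

/// A toy buffer handle: `poll_ready` must acquire at most one permit,
/// no matter how many times it is called before `call` consumes it.
struct BufferHandle {
    capacity: Capacity,
    permit: Option<()>,
}

impl BufferHandle {
    fn poll_ready(&mut self) -> bool {
        if self.permit.is_some() {
            return true; // already ready: do NOT acquire again
        }
        if self.capacity.try_acquire() {
            self.permit = Some(());
            true
        } else {
            false
        }
    }
    fn call(&mut self) {
        // Consume the permit that poll_ready acquired.
        self.permit.take().expect("poll_ready must succeed before call");
    }
}

fn main() {
    let cap = Capacity::new(2);
    let mut svc = BufferHandle { capacity: cap.clone(), permit: None };
    assert!(svc.poll_ready());
    // Repeated poll_ready calls must not leak additional capacity.
    assert!(svc.poll_ready());
    assert!(svc.poll_ready());
    assert_eq!(cap.available(), 1);
    svc.call();
}
```

If `poll_ready` acquired a permit unconditionally instead of caching one, the three calls above would drain the capacity to zero and starve other clones of the handle, which is exactly the leak the test reproduces.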
jonhoo approved these changes on Feb 10, 2021
hawkw added a commit to linkerd/linkerd2-proxy that referenced this pull request on Feb 17, 2021
…922) The proxy currently has its own implementation of a `tower` `Service` that makes an inner service `Clone`able by driving it in a spawned task and buffering requests on a channel. This also exists upstream, as `tower::buffer`.

We implemented our own version for a couple of reasons: to avoid an upstream issue where memory was leaked when a buffered request was cancelled, and to implement an idle timeout when the buffered service has been unready for too long. However, it's no longer necessary to reimplement our own buffer service for these reasons: the upstream bug was fixed in `tower` 0.4 (see tower-rs/tower#476, tower-rs/tower#480, and tower-rs/tower#556); and we no longer actually use the buffer idle timeout (instead, we idle out unresponsive services with the separate `Failfast` middleware; note that `push_spawn_buffer_with_idle_timeout` is never actually used).

Therefore, we can remove our _sui generis_ implementation in favour of `tower::buffer` from upstream. This eliminates dead code for the idle timeout, which we never actually use, and reduces duplication (since `tonic` uses `tower::buffer` internally, its code is already compiled into the proxy). It also reduces the amount of code I'm personally responsible for maintaining in two separate places ;)

Since the `linkerd-buffer` crate erases the type of the buffered service, while `tower::buffer` does not, I've changed the `push_spawn_buffer`/`spawn_buffer` helpers to also include a `BoxService` layer. This required adding a `BoxServiceLayer` type, since `BoxService::layer` returns a `LayerFn` with an unnameable type.

Also, this change ran into issues due to a compiler bug where generators (async blocks) sometimes forget concrete lifetimes, rust-lang/rust#64552. In order to resolve this, I had to remove the outermost `async` blocks from the OpenCensus and identity daemon tasks. These async blocks were used only for emitting a tracing event when the task is started, so it wasn't a big deal to remove them; I moved the trace events into the actual daemon task functions, and used a `tracing` span to propagate the remote addresses, which aren't known inside the daemon task functions.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
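The "unnameable type" motivation for `BoxServiceLayer` can be sketched with simplified, synchronous stand-ins for tower's `Service` and `Layer` traits. Everything below is a hypothetical std-only model (the real traits are polling-based and generic over the request type): the point is that a closure-based layer has a type you cannot write in a field or signature, while a concrete struct can be named anywhere.

```rust
/// Minimal, synchronous stand-in for tower's `Service` trait.
trait Service {
    fn call(&mut self, req: String) -> String;
}

/// Minimal stand-in for tower's `Layer` trait.
trait Layer<S> {
    type Service;
    fn layer(&self, inner: S) -> Self::Service;
}

/// Type-erased service, playing the role of `tower::util::BoxService`.
struct BoxService(Box<dyn Service>);

impl Service for BoxService {
    fn call(&mut self, req: String) -> String {
        self.0.call(req)
    }
}

/// A *nameable* layer that erases the inner service's type. A closure-based
/// layer (like one returned from a `layer_fn` helper) has an unnameable type,
/// which is why a concrete struct like this is useful in struct fields and
/// where-clauses.
struct BoxServiceLayer;

impl<S: Service + 'static> Layer<S> for BoxServiceLayer {
    type Service = BoxService;
    fn layer(&self, inner: S) -> BoxService {
        BoxService(Box::new(inner))
    }
}

/// An example inner service.
struct Echo;
impl Service for Echo {
    fn call(&mut self, req: String) -> String {
        format!("echo: {req}")
    }
}

fn main() {
    // The erased type is the same no matter what the inner service was,
    // so different stacks can share one concrete service type.
    let mut svc: BoxService = BoxServiceLayer.layer(Echo);
    assert_eq!(svc.call("hi".to_string()), "echo: hi");
}
```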
Labels
A-buffer
Area: The tower "buffer" middleware
A-tower
Area: the tower crate itself
C-enhancement
Category: A PR with an enhancement or a proposed one in an issue.
The `Semaphore` implementation in `tokio::sync` doesn't expose a `poll_acquire` method to acquire owned semaphore permits in a non-async/await context. Currently, the `limit::concurrency` and `buffer` middleware use our own pollable wrapper for `tokio::sync::Semaphore`. This works by creating a new `Semaphore::acquire_owned` future and boxing it every time the semaphore starts acquiring a new permit.
Recently, the `tokio_util` crate introduced its own `PollSemaphore` wrapper type. This provides the same functionality, a pollable version of `tokio::sync`'s `Semaphore`, just like our semaphore wrapper. However, `tokio_util`'s version is significantly more efficient: rather than allocating a new `Box` for each `acquire_owned` future, it uses the `ReusableBoxFuture` type to reuse a single allocation every time a new `acquire_owned` future is needed. This means that rather than allocating every time a `Buffer` or `ConcurrencyLimit` service starts acquiring a new permit, there's a single allocation for each clone of the service. Unless services are cloned per-request, this moves the allocation out of the request path in most cases.
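The allocation difference can be made visible in a std-only sketch: wrap the system allocator in a counter, then compare boxing on every "acquire" against reusing a single heap slot. The `CountingAlloc`, `per_call_allocs`, and `reused_allocs` names are hypothetical, and this models only the allocation pattern, not `ReusableBoxFuture` itself (which additionally uses unsafe code to swap futures of matching layout into the same allocation).

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

// Wrap the system allocator so we can count heap allocations.
struct CountingAlloc;
static ALLOCATIONS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCATIONS.fetch_add(1, Ordering::Relaxed);
        System.alloc(layout)
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout)
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Run `f` and return how many heap allocations it performed.
fn count_allocs(f: impl FnOnce()) -> usize {
    let before = ALLOCATIONS.load(Ordering::Relaxed);
    f();
    ALLOCATIONS.load(Ordering::Relaxed) - before
}

/// One fresh `Box` per acquisition, like boxing each `acquire_owned` future.
fn per_call_allocs(n: u64) -> usize {
    count_allocs(|| {
        for i in 0..n {
            let fut = Box::new(i); // new heap allocation every time
            std::hint::black_box(&fut);
        }
    })
}

/// A single slot reused for every acquisition, like `ReusableBoxFuture`.
fn reused_allocs(n: u64) -> usize {
    count_allocs(|| {
        let mut slot = Box::new(0u64); // one allocation up front
        for i in 0..n {
            *slot = i; // overwrite in place; no new allocation
            std::hint::black_box(&slot);
        }
    })
}

fn main() {
    assert_eq!(per_call_allocs(100), 100);
    assert_eq!(reused_allocs(100), 1);
}
```

In the real middleware, the "reused slot" lives inside each clone of the service, which is why the cost becomes one allocation per clone rather than one per permit acquisition.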
I had originally considered an approach similar to this, but I didn't think the reduced allocations were worth adding a bunch of unsafe code in `tower` (which presently contains no unsafe code). However, this type fits in perfectly in `tokio-util`, and now that there's an upstream implementation, we should use it.
This introduces a dependency on `tokio-util` when the "limit" or "buffer" features are enabled.
Additionally, I've added a new test for `Buffer` asserting that once an individual `Buffer` service has been driven to readiness but not called, additional `poll_ready` calls won't acquire additional buffer capacity. This reproduces a bug that existed in earlier versions of `tower::buffer`, which could result in starvation of buffer waiters. The bug doesn't exist in 0.4, but I wanted to ensure that changing the buffer internals here didn't introduce any new regressions.