buffer, limit: use tokio-util's PollSemaphore #556

Merged: hawkw merged 5 commits into master from eliza/buffer-fix on Feb 10, 2021

Conversation

@hawkw (Member) commented on Feb 10, 2021

The Semaphore implementation in tokio::sync doesn't expose a
poll_acquire method to acquire owned semaphore permits in a
non-async-await context. Currently, the limit::concurrency and
buffer middleware use our own pollable wrapper for
tokio::sync::Semaphore. This works by creating a new
Semaphore::acquire_owned future and boxing it every time the semaphore
starts acquiring a new permit.
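
For illustration, a minimal sketch of that per-acquisition allocation pattern (not tower's exact code; `start_acquire` is a hypothetical helper):

```rust
use std::{future::Future, pin::Pin, sync::Arc};
use tokio::sync::{AcquireError, OwnedSemaphorePermit, Semaphore};

type BoxAcquire =
    Pin<Box<dyn Future<Output = Result<OwnedSemaphorePermit, AcquireError>> + Send>>;

// Each time a service begins acquiring a permit, a fresh `acquire_owned`
// future is created and boxed: one heap allocation per acquisition.
fn start_acquire(semaphore: &Arc<Semaphore>) -> BoxAcquire {
    Box::pin(Arc::clone(semaphore).acquire_owned())
}
```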

Recently, the tokio_util crate introduced its own PollSemaphore
wrapper type. This provides the same functionality as our semaphore
wrapper: a pollable version of tokio::sync's Semaphore.
However, tokio_util's version is significantly more efficient: rather
than allocating a new Box for each acquire_owned future, it uses
the ReusableBoxFuture type to reuse a single allocation every
time a new acquire_owned future is needed. This means that rather than
allocating every time a Buffer or ConcurrencyLimit service starts
acquiring a new permit, there's a single allocation for each clone of
the service. Unless services are cloned per-request, this means that the
allocation is moved out of the request path in most cases.
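
As a rough sketch of the poll-based acquisition this enables (hypothetical `Limited` type and `poll_permit` method, assuming the `tokio-util` "sync" feature; this is not the actual `Buffer`/`ConcurrencyLimit` internals):

```rust
use std::sync::Arc;
use std::task::{Context, Poll};

use tokio::sync::{OwnedSemaphorePermit, Semaphore};
use tokio_util::sync::PollSemaphore;

struct Limited {
    semaphore: PollSemaphore,
    // The permit held for the next call, once acquired.
    permit: Option<OwnedSemaphorePermit>,
}

impl Limited {
    fn new(max_in_flight: usize) -> Self {
        Self {
            semaphore: PollSemaphore::new(Arc::new(Semaphore::new(max_in_flight))),
            permit: None,
        }
    }

    // Analogous to `poll_ready`: the semaphore is only polled while no
    // permit is held, and `PollSemaphore` reuses a single allocation
    // internally rather than boxing a new future per acquisition.
    fn poll_permit(&mut self, cx: &mut Context<'_>) -> Poll<()> {
        if self.permit.is_none() {
            match self.semaphore.poll_acquire(cx) {
                Poll::Ready(Some(permit)) => self.permit = Some(permit),
                // `None` means the semaphore was closed; a real
                // middleware would surface an error to the caller.
                Poll::Ready(None) => panic!("semaphore closed"),
                Poll::Pending => return Poll::Pending,
            }
        }
        Poll::Ready(())
    }
}
```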

I had originally considered an approach similar to this, but I didn't
think the reduced allocations were worth adding a bunch of unsafe code
in tower (which presently contains no unsafe code). However, this type
fits in perfectly in tokio-util, and now that there's an upstream
implementation, we should use it.

This introduces a dependency on tokio-util when the "limit" or "buffer"
features are enabled.

Additionally, I've added a new test for Buffer asserting that once an
individual Buffer service has been driven to readiness but not called,
additional poll_ready calls won't acquire additional buffer capacity.
This reproduces a bug that existed in earlier versions of
tower::buffer, which could result in starvation of buffer waiters.
This bug doesn't exist in 0.4, but I wanted to ensure that changing the
buffer internals here didn't introduce any new regressions.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
This reproduces an issue in previous versions of `tower::buffer`, where
multiple calls to `poll_ready` when the buffer is already ready would
acquire additional channel capacity that was never used. This
essentially leaked channel capacity.

This issue doesn't exist with the current implementation, but this test
should help ensure it isn't reintroduced when changing the buffer
internals (such as switching to `tokio-util`'s pollable channel).

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
Signed-off-by: Eliza Weisman <eliza@buoyant.io>
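
To illustrate the invariant that new test checks (a hedged sketch, not the actual test added in this PR; it reuses the hypothetical `Limited` type from the sketch above):

```rust
use std::task::Context;

use futures::task::noop_waker_ref;

#[test]
fn repeated_polls_consume_one_permit() {
    // Two permits available, but only one should ever be taken here.
    let mut svc = Limited::new(2);
    let mut cx = Context::from_waker(noop_waker_ref());

    // The first poll acquires a permit...
    assert!(svc.poll_permit(&mut cx).is_ready());
    assert_eq!(svc.semaphore.available_permits(), 1);

    // ...and polling again while already ready must not take another.
    // Acquiring more here is the capacity leak described above.
    assert!(svc.poll_permit(&mut cx).is_ready());
    assert!(svc.poll_permit(&mut cx).is_ready());
    assert_eq!(svc.semaphore.available_permits(), 1);
}
```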
@hawkw added the C-enhancement (Category: A PR with an enhancement or a proposed one in an issue), A-buffer (Area: The tower "buffer" middleware), and A-tower (Area: the tower crate itself) labels on Feb 10, 2021
@hawkw requested a review from a team on February 10, 2021 19:02
@hawkw self-assigned this on Feb 10, 2021
@hawkw merged commit ccfaffc into master on Feb 10, 2021
@hawkw deleted the eliza/buffer-fix branch on February 10, 2021 19:52
hawkw added a commit to linkerd/linkerd2-proxy that referenced this pull request Feb 17, 2021
…922)

The proxy currently has its own implementation of a `tower` `Service`
that makes an inner service `Clone`able by driving it in a spawned task
and buffering requests on a channel. This also exists upstream, as
`tower::buffer`.
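
For context, a minimal (hypothetical) use of the upstream middleware; the service and types here are placeholders rather than the proxy's actual stack:

```rust
use tower::{buffer::Buffer, ServiceExt};

#[tokio::main]
async fn main() -> Result<(), tower::BoxError> {
    // A trivial inner service standing in for the proxy's inner stack.
    let svc = tower::service_fn(|name: String| async move {
        Ok::<_, tower::BoxError>(format!("hello, {}", name))
    });

    // `Buffer` spawns a worker task that drives `svc`, and hands out
    // cloneable handles that enqueue requests on a bounded channel.
    let buffered = Buffer::new(svc, 32);

    let rsp = buffered.clone().oneshot("tower".to_string()).await?;
    assert_eq!(rsp, "hello, tower");
    Ok(())
}
```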

We implemented our own version for a couple of reasons: to avoid an
upstream issue where memory was leaked when a buffered request was
cancelled, and to implement an idle timeout when the buffered service
has been unready for too long. However, it's no longer necessary to
reimplement our own buffer service for these reasons: the upstream bug
was fixed in `tower` 0.4 (see tower-rs/tower#476, tower-rs/tower#480,
and tower-rs/tower#556); and we no longer actually use the buffer idle
timeout (instead, we idle out unresponsive services with the separate
`Failfast` middleware; note that `push_spawn_buffer_with_idle_timeout`
is never actually used).

Therefore, we can remove our _sui generis_ implementation in favour of
`tower::buffer` from upstream. This eliminates dead code for the idle
timeout, which we never actually use, and reduces duplication (since
`tonic` uses `tower::buffer` internally, its code is already compiled
into the proxy). It also reduces the amount of code I'm personally
responsible for maintaining in two separate places ;)

Since the `linkerd-buffer` crate erases the type of the buffered
service, while `tower::buffer` does not, I've changed the
`push_spawn_buffer`/`spawn_buffer` helpers to also include a
`BoxService` layer. This required adding a `BoxServiceLayer` type, since
`BoxService::layer` returns a `LayerFn` with an unnameable type.
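
A rough sketch of what such a nameable layer can look like (hypothetical code; the real `BoxServiceLayer` lives in linkerd2-proxy and may differ in its bounds and fields):

```rust
use std::marker::PhantomData;

use tower::{util::BoxService, Layer, Service};

// A `Layer` with a nameable type that erases the wrapped service behind
// `BoxService`, unlike `BoxService::layer()`, whose `LayerFn<_>` return
// type cannot be written out in struct fields or type aliases.
struct BoxServiceLayer<R> {
    _marker: PhantomData<fn(R)>,
}

impl<R> BoxServiceLayer<R> {
    fn new() -> Self {
        Self { _marker: PhantomData }
    }
}

impl<S, R> Layer<S> for BoxServiceLayer<R>
where
    S: Service<R> + Send + 'static,
    S::Future: Send + 'static,
{
    type Service = BoxService<R, S::Response, S::Error>;

    fn layer(&self, inner: S) -> Self::Service {
        BoxService::new(inner)
    }
}
```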

Also, this change ran into issues due to a compiler bug where generators
(async blocks) sometimes forget concrete lifetimes,
rust-lang/rust#64552. In order to resolve this, I had to remove the
outermost `async` blocks from the OpenCensus and identity daemon tasks.
These async blocks were used only for emitting a tracing event when the
task is started, so it wasn't a big deal to remove them; I moved the
trace events into the actual daemon task functions, and used a `tracing`
span to propagate the remote addresses which aren't known inside the
daemon task functions.

Signed-off-by: Eliza Weisman <eliza@buoyant.io>
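
The shape of that workaround, as a hedged sketch with hypothetical names (the real daemon tasks take more arguments):

```rust
use std::net::SocketAddr;

use tracing::Instrument;

// The startup event now lives inside the daemon function itself rather
// than in an outer `async { ... }` block wrapping it...
async fn identity_daemon() {
    tracing::info!("daemon task started");
    // ... actual daemon work ...
}

// ...and the remote address the outer block used to capture is attached
// via a `tracing` span instead.
fn spawn_identity_daemon(addr: SocketAddr) {
    tokio::spawn(identity_daemon().instrument(tracing::info_span!("identity", %addr)));
}
```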