
rt: add optional throttling to BasicScheduler #2016

Closed
fengalin wants to merge 4 commits from the fengalin/throttling branch

Conversation

fengalin

Motivation

This PR is a continuation of #1887.

gst-plugin-threadshare is a framework and a collection of GStreamer
elements which are developed in the Rust programming language and use the
gstreamer-rs bindings to the GStreamer C-based libraries.

In the regular GStreamer model, pipeline elements spawn their own threads to
deal with asynchronous processing. Most of the time, these threads wait for
something to happen, which causes multiple overheads. The gst-plugin-threadshare
model allows elements to share threads, which reduces context switches and system
calls, thus lowering CPU load.

The framework executor uses tokio. Until release 0.2.0, tokio featured
functions that gave control over the iterations of the executor:

  • turn would execute one iteration and indicate whether tasks were handled by
    the executor.
  • The user could request the executor to park & unpark.

Thanks to these features, the framework could benefit from the tokio executor
while implementing a throttling strategy. The throttling strategy consists of
grouping tasks, timers and I/O handling by forcing the thread to sleep for a
short duration. For use cases where a large number of events must be handled at
random times (e.g. 2,000+ live multimedia streams), this results in even more
efficient CPU use and higher bandwidth.

Since the throttling strategy introduces pauses in the execution thread, timers
need special treatment so that they are fired close enough to their target
instant. To avoid timers being triggered too late, gst-plugin-threadshare
implemented its own timer management.

See this blog post for a more in-depth explanation of the throttling strategy.

Two solutions are considered for gst-plugin-threadshare to keep up with
new tokio versions (see this thread):

  1. Introduce the throttling strategy as an option in tokio's executor.
  2. Re-introduce an API in tokio's executor so that gst-plugin-threadshare
    can keep its throttling strategy.

Solution (1) presents the following benefits:

  • Other projects could activate the throttling strategy depending on their
    use cases.
  • The special handling of timers with regard to throttling would be implemented
    in tokio, meaning gst-plugin-threadshare would not need to implement its own
    timer management.

This PR is an implementation of solution (1). Follow this link
for an evaluation of the resulting behavior for the BasicScheduler.

Solution

This PR introduces a new optional duration parameter max_throttling. If the
user provides a non-zero duration and selects the basic scheduler, the
throttling strategy is activated. max_throttling is the maximum duration
of the thread pauses and the minimum duration between two consecutive timer &
I/O polls.
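
As a usage sketch (assuming the option is exposed as a max_throttling method
on the 0.2 runtime Builder; the exact builder API is not spelled out above),
activating the throttling strategy could look like this:

```rust
use std::time::Duration;
use tokio::runtime;

fn main() -> std::io::Result<()> {
    let mut rt = runtime::Builder::new()
        .basic_scheduler()
        .enable_all()
        // Never poll timers & I/O more often than every 10 ms; the thread
        // may sleep up to 10 ms between two consecutive polls.
        // Hypothetical builder method added by this PR.
        .max_throttling(Duration::from_millis(10))
        .build()?;

    rt.block_on(async {
        // application code
    });

    Ok(())
}
```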

When throttling is activated, the BasicScheduler loops on the new
tick_throttling function instead of tick. The tick_throttling function
uses a state machine to control the descent to the state where the thread is
actually paused when max_throttling hasn't elapsed yet since the instant the
timers & I/O were last polled.

The purpose of the thread pause is for the thread not to be woken before the
intended duration has elapsed, so the actual pause doesn't rely on a driver's
park implementation but on a call to std::thread::sleep.
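
The following is a simplified, free-standing sketch of that loop shape;
run_ready_tasks and poll_timers_and_io are illustrative placeholders, not the
actual scheduler internals:

```rust
use std::time::{Duration, Instant};

// Illustrative placeholders for the scheduler internals.
fn run_ready_tasks() {}
fn poll_timers_and_io() {}

// Sketch of the throttled loop: ready tasks are drained, then, if less than
// `max_throttling` has elapsed since the last timers & I/O poll, the thread
// sleeps for the remainder with std::thread::sleep (instead of parking on the
// driver, which could be woken early). Timers & I/O are then polled and a new
// time frame starts.
fn throttled_loop(max_throttling: Duration) {
    let mut last_poll = Instant::now();
    loop {
        run_ready_tasks();

        let elapsed = last_poll.elapsed();
        if elapsed < max_throttling {
            std::thread::sleep(max_throttling - elapsed);
        }

        poll_timers_and_io();
        last_poll = Instant::now();
    }
}
```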

Since timers & I/O are never polled more often than every max_throttling ms,
timers need special care. The idea is to consider time frames which start at a
polling instant. Consider a time frame starting now: since the current time
frame will last at least max_throttling ms, we need to fire the timers that
will elapse within the first half of the current time frame, otherwise they
would fire further from their intended instant and always too late.
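
A minimal sketch of this half-frame rule, with illustrative names rather than
the actual time::Driver code:

```rust
use std::time::{Duration, Instant};

// Half-frame rule sketch: a timer whose deadline falls within the first half
// of the current time frame is fired at the start of the frame. Waiting for
// the next poll (a full `max_throttling` later) would make it fire later than
// firing it slightly early now.
fn fire_in_current_frame(
    deadline: Instant,
    frame_start: Instant,
    max_throttling: Duration,
) -> bool {
    deadline <= frame_start + max_throttling / 2
}
```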

In order to implement this strategy with minimal impact on existing code, it was
decided to add a new mode to the time::Driver. When this mode is activated,
the park_timeout method fires the timers as described above, polls the I/O
via the inner driver and returns immediately. This way the scheduler can still
trigger tasks that depend on the timers before going to sleep.

Tests

In order to ensure behavior on par with the regular tick algorithm,
basic_scheduler_throttling was added to rt_common. A new test file,
rt_basic_throttling.rs, takes care of checking the time frame strategy.

Tests were also conducted as part of this evaluation,
which covers a tokio-only benchmark as well as a comparison of the
implementations used for the gst-plugin-threadshare framework.

@fengalin fengalin changed the title Fengalin/throttling Optional BasicScheduler throttling Dec 22, 2019
@fengalin
Author

Failure is due to elapsed duration checks on macOS. I know elapsed duration checks are not ideal in CI, but I thought they were important here. I already relaxed the boundary compared to the limit I tested on my box, but it doesn't seem to be enough. Unfortunately, I don't have a macOS machine to ascertain this is not a bug in my implementation.

Will try to figure out a solution. Any suggestion welcome! :)

@fengalin
Author

fengalin commented Dec 23, 2019

Still no luck with CI on macOS. It's the only platform which shows inconsistent
elapsed durations for the rt_basic_throttling tests:

Test                         Elapsed (ms)   Max. boundary (ms)
at_root_one_time_frame (X)   214.6          200
in_spawn_one_time_frame      191.9          200
in_spawn_two_time_frames     191.4          300
at_root_two_time_frames      191.9          300

Note: it's not always the same test in error, and it seems very suspicious to me
that they all show a similar elapsed duration when the two_time_frames
variants should take around twice as long as the one_time_frame tests.

Windows & Linux show expected behavior:

Test                         Elapsed (ms)   Max. boundary (ms)
at_root_one_time_frame       100.2          200
in_spawn_one_time_frame      100.1          200
in_spawn_two_time_frames     200.3          300
at_root_two_time_frames      200.4          300

These results were extracted from this CI execution, which was triggered by
1178fb9 to investigate them.

@fengalin fengalin force-pushed the fengalin/throttling branch 6 times, most recently from fc1e764 to 23e4dc2 Compare December 24, 2019 12:06
@fengalin
Copy link
Author

I added a commit to account for the dispersion in timing for CI
on macOS, using the rt_common timer-based tests.

Commit: 23e4dc2, CI run: https://dev.azure.com/tokio-rs/Tokio/_build/results?buildId=4297&view=results

Test                                          Linux (ms)   macOS (ms)   Windows (ms)
basic_scheduler::delay_at_root                51.2         51.4         61.2
basic_scheduler::delay_in_spawn               51.3         160.3        61.4
basic_scheduler_throttling::delay_at_root     51.0         55.3         60.5
basic_scheduler_throttling::delay_in_spawn    51.4         53.3         61.9
threaded_scheduler::delay_at_root             51.4         78.3         59.2
threaded_scheduler::delay_in_spawn            51.3         73.3         61.3

I chose these tests because they were not added by this PR
and some of them don't make use of the feature introduced in
this PR. With CI on Linux and Windows, the timing remains in
the same range. With CI on macOS, the timing rises up to
160.3 ms where it should be close to 50 ms.

My conclusion is that we can't check timing bounds with CI on
macOS. As a consequence, I decided to discard the checks
on this OS for the new tests in rt_basic_throttling.

@fengalin fengalin force-pushed the fengalin/throttling branch 4 times, most recently from 99c7b5a to a05771c Compare December 25, 2019 10:06
@fengalin fengalin changed the title Optional BasicScheduler throttling rt: add optional throttling to BasicScheduler Dec 26, 2019
@fengalin fengalin force-pushed the fengalin/throttling branch 2 times, most recently from 73852c3 to e759c81 Compare January 8, 2020 16:09
@ghost

ghost commented Jan 10, 2020

Could anybody tell which issue is blocking the PR? Is there any chance that it can be merged in an upcoming release?

@sdroege
Contributor

sdroege commented Jan 10, 2020

Could anybody tell which issue is blocking the PR?

It needs one of the maintainers to actually review it and first of all consider the idea good and something they'd like to have as part of tokio.

@fengalin fengalin requested a review from carllerche January 22, 2020 13:58
@fengalin fengalin force-pushed the fengalin/throttling branch 2 times, most recently from aad2846 to 3136c44 Compare February 28, 2020 15:58
@fengalin fengalin force-pushed the fengalin/throttling branch from 4ef1a66 to 5f95c38 Compare April 11, 2020 17:58
@vorot93 vorot93 added A-tokio Area: The main tokio crate C-enhancement Category: A PR with an enhancement or bugfix. M-runtime Module: tokio/runtime labels Apr 24, 2020
With some tasks, we don't want to take the risk of waiting for the
max_throttling time to elapse before they are handled. This is the case
when a user starts many tasks in a serialized way, waiting for an
acknowledgment that each task is indeed spawned. In this case, we
want the task to be handled immediately so that we don't risk
waiting max_throttling time for each task.
@fengalin fengalin force-pushed the fengalin/throttling branch from 5f95c38 to 37cb931 Compare April 24, 2020 12:55
@carllerche
Member

Sorry for the radio silence.

I have read your description a few times and glanced at the source.

I am not disputing that batching/throttling is a useful optimization technique. What I am not following is why batching/throttling needs to be implemented in the Tokio runtime rather than in a layer between Tokio and your application. Could you clarify this?

@fengalin
Author

We used to implement this as a layer, but since the removal of turn and user-callable park & unpark, we couldn't do so any more (see #1887).

We currently use a fork and plan on moving to a dedicated runtime. Sorry for accidentally pushing to this branch and reviving this PR.

Closing it now.

@fengalin fengalin closed this Apr 28, 2020
@carllerche
Member

It's fine.

However, what I am trying to understand is why this capability requires any specific runtime support. Batching / throttling of actions can happen purely at the futures level. I could be missing something, so I am trying to understand better and make sure I am not overlooking anything.

@carllerche
Member

For example, you could implement throttled_spawn(impl Future) that throttles execution of the future without any specific runtime support. Would this handle your case?

@carllerche
Member

It is also possible to implement a fully self-contained executor that implements throttling. For example, LocalSet is a self-contained scheduler: https://github.com/tokio-rs/tokio/blob/master/tokio/src/task/local.rs

@sdroege
Contributor

sdroege commented Apr 29, 2020

While throttling can be implemented on the futures level, this has the problem that the reactor is still woken up whenever e.g. a new UDP packet is received. The main goal here is to throttle the reactor to (as a result) throttle the overall rate of wakeups and batch them all together.

A self-contained executor/reactor (both combined) might be an option; I guess we missed that tokio 0.2 exposes enough API for implementing that from the outside. That might be worth trying. However, as @fengalin mentioned above, we're currently looking into moving to a custom, minimal runtime instead. This issue (and generally quite custom behaviour) is of course one big reason for that, but overall we need more direct control over the runtime behaviour than what tokio can provide, and we also only really need maybe 10% of the features tokio offers :) It's a very specific use case and it seems unrealistic that a general-purpose runtime would be fully adaptable to it, or to expect that niche features like this one could be merged into tokio (and increase the maintenance burden for everyone).
