Removed turn() API on the Runtime from 0.2.0-alpha.6 to 0.2.0 #1887
Comments
Interesting. Re 1., it would be useful to get some benchmarks together on tokio 0.2 to demonstrate this.
I can prepare something, but that's nothing tokio really can mitigate without throttling calls to the I/O driver. I'm unsure however how to prepare such a benchmark. I can show you that tokio 0.2.0-alpha.6 with throttling has a lot more throughput than tokio 0.2.0 stable without throttling, but that's apples and oranges :) I can also show tokio 0.2.0-alpha.6 with vs. without throttling.
Just a little app that demonstrates a case where throttling is helpful. That would be something to experiment with.
After docs, I plan on setting up a benchmark suite... stuff like ^^ that demonstrates "real world" patterns would be helpful to add.
See my blog post I linked above, but I can prepare a new version of that just on top of tokio without other dependencies. That probably helps, and that throttling improves the situation could then be shown by simply adding a `thread::sleep()`. I'll take a look at that later today or tomorrow.
I have a small example, will clean up and put it up somewhere later. But 1000 UDP sockets, with one packet every 20ms on each, give about 22% CPU with a single basic runtime and about 23% with two basic runtimes in separate threads. With throttling of the calls to the I/O driver it is considerably lower. Compared to my benchmarks 1.5 years ago, with tokio 0.1 and additional overhead from GStreamer, these are very similar results. With even more sockets the effect will be more visible; I'll create a table with various results later. Note: for both cases I changed the
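For reference, a rough sketch of what such a benchmark could look like on tokio 0.2 follows. This is not the actual code behind the numbers above; the ports, buffer size and the missing sender side are illustrative assumptions (tokio 0.2 with the net/time features plus futures 0.3 assumed as dependencies):

```rust
// A rough sketch, not the exact benchmark: 1000 UDP receiver tasks on a
// single-threaded ("basic") scheduler. A real run also needs senders
// pacing one packet per 20ms per socket.
use std::error::Error;

fn main() -> Result<(), Box<dyn Error>> {
    let mut rt = tokio::runtime::Builder::new()
        .basic_scheduler()
        .enable_all()
        .build()?;

    rt.block_on(async {
        let mut tasks = Vec::new();
        for i in 0..1000u16 {
            let mut socket =
                tokio::net::UdpSocket::bind(("127.0.0.1", 5000 + i)).await?;
            tasks.push(tokio::spawn(async move {
                let mut buf = [0u8; 1500];
                loop {
                    // Every packet wakes the reactor; without throttling
                    // that is one epoll() wakeup per packet per socket.
                    let _ = socket.recv_from(&mut buf).await;
                }
            }));
        }
        // Runs until interrupted; measure CPU usage externally (e.g. top).
        futures::future::join_all(tasks).await;
        Ok::<_, std::io::Error>(())
    })?;
    Ok(())
}
```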
I agree that there is probably some strategy we could use to throttle calls to the I/O driver.
Code can be found here. My measurements before were slightly wrong; I had implemented the throttling wrong. The patch can be found at the bottom.
The X is with the default runtime, which creates 4 threads here. The patch for throttling is below; it does not consider the throttling for the timers (but should):

```diff
diff --git a/tokio/src/runtime/basic_scheduler.rs b/tokio/src/runtime/basic_scheduler.rs
index c674b961..c5427f20 100644
--- a/tokio/src/runtime/basic_scheduler.rs
+++ b/tokio/src/runtime/basic_scheduler.rs
@@ -71,6 +71,8 @@ struct LocalState<P> {
     /// Thread park handle
     park: P,
+
+    last_tick: Option<std::time::Instant>,
 }
 
 #[derive(Debug)]
@@ -110,7 +112,7 @@ where
                 pending_drop: task::TransferStack::new(),
                 unpark: Box::new(unpark),
             }),
-            local: LocalState { tick: 0, park },
+            local: LocalState { tick: 0, park, last_tick: None },
         }
     }
@@ -218,7 +220,7 @@ impl Spawner {
 
 impl SchedulerPriv {
     fn tick(&self, local: &mut LocalState<impl Park>) {
-        for _ in 0..MAX_TASKS_PER_TICK {
+        loop {
             // Get the current tick
             let tick = local.tick;
@@ -227,10 +229,7 @@ impl SchedulerPriv {
             let task = match self.next_task(tick) {
                 Some(task) => task,
-                None => {
-                    local.park.park().ok().expect("failed to park");
-                    return;
-                }
+                None => break,
             };
 
             if let Some(task) = task.run(&mut || Some(self.into())) {
@@ -240,9 +239,21 @@ impl SchedulerPriv {
             }
         }
 
+        if let Some(last_tick) = local.last_tick {
+            use std::thread;
+
+            let now = std::time::Instant::now();
+            let diff = now - last_tick;
+            const WAIT: std::time::Duration = std::time::Duration::from_millis(20);
+            if diff < WAIT {
+                thread::sleep(WAIT - diff);
+            }
+        }
+        local.last_tick = Some(std::time::Instant::now());
+
         local
             .park
-            .park_timeout(Duration::from_millis(0))
+            .park()
             .ok()
             .expect("failed to park");
     }
```
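For clarity, the throttling logic the patch adds boils down to the following standalone sketch (an illustration, not tokio API; the 20ms interval matches the one hardcoded above):

```rust
use std::time::{Duration, Instant};

// Ensure consecutive scheduler ticks are at least `interval` apart by
// sleeping on the scheduler thread for the remainder of the interval.
fn throttle(last_tick: &mut Option<Instant>, interval: Duration) {
    if let Some(last) = *last_tick {
        let elapsed = last.elapsed();
        if elapsed < interval {
            std::thread::sleep(interval - elapsed);
        }
    }
    *last_tick = Some(Instant::now());
}

fn main() {
    let mut last_tick = None;
    let start = Instant::now();
    for _ in 0..5 {
        // Stand-in for draining all ready tasks in one tick.
        throttle(&mut last_tick, Duration::from_millis(20));
        println!("tick at {:?}", start.elapsed());
    }
}
```

Note that the sleep happens on the scheduler thread itself, so no tasks run during the wait; combined with the `loop` that drains all ready tasks first, the scheduler processes work in 20ms batches.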
Or alternatively, it would be great if there were an API that allowed replacing the runtime or parts of it with a custom runtime, like there was before :)
I'm not against it. The permutation details need to be figured out. |
Closing due to inactivity. For future reference, I am not necessarily against adding the ability to configure throttling, but I would like to see it demonstrated that doing it at the Tokio level provides measurable benefit over implementing batching logic in userland. |
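For illustration, batching in userland could look roughly like this on tokio 0.2 (a sketch; `drain_pending_work` is a made-up placeholder for application logic, and the 20ms interval is arbitrary):

```rust
use std::time::Duration;

// Hypothetical userland batching: wake on a fixed interval and drain
// whatever work has accumulated, instead of reacting to every event
// immediately.
async fn batched_loop() {
    let mut tick = tokio::time::interval(Duration::from_millis(20));
    for _ in 0..10u32 {
        tick.tick().await;
        drain_pending_work();
    }
}

fn drain_pending_work() {
    // Process all buffered packets/messages in one batch here.
}

fn main() {
    let mut rt = tokio::runtime::Runtime::new().unwrap();
    rt.block_on(batched_loop());
}
```

Note that this batches only at the task level: the reactor underneath still wakes up once per I/O event, which is exactly the per-packet overhead that the reactor-level throttling discussed in this issue avoids.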
In tokio 0.2.0-alpha.6 it was still possible to construct a "Runtime" yourself by taking `tokio-reactor`, `tokio-timer` and `tokio-current-thread` and putting them together. Then it could be `run()` or `turn()`ed.

With tokio 0.2.0 merging everything into a single crate and reorganizing everything, this is not possible anymore, which makes it hard to port one of our projects from the alpha version to the stable version.
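For context, the composition described above looked roughly like this with the tokio 0.1-era crates (a sketch based on the 0.1 APIs; 0.2.0-alpha.6 was similar in spirit but not identical, and the reactor/timer handle wiring is omitted for brevity):

```rust
use std::time::Duration;

// Reactor at the bottom, timer wrapping it, executor on top; each layer
// implements Park and drives the layer below when parked.
fn main() -> Result<(), Box<dyn std::error::Error>> {
    let reactor = tokio_reactor::Reactor::new()?;
    let timer = tokio_timer::timer::Timer::new(reactor);
    let mut executor = tokio_current_thread::CurrentThread::new_with_park(timer);

    loop {
        // Drive the executor (and, via the park chain, the timer and
        // reactor) for one turn, waiting at most 20ms for events.
        let _ = executor.turn(Some(Duration::from_millis(20)));
        // Throttling can be implemented here, e.g. by sleeping before
        // the next turn() call.
    }
}
```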
I should probably start by giving some context. The project in question is a GStreamer plugin, `gst-plugin-threadshare`, that allows using tokio as a scheduler so that fewer kernel threads are needed, for lower resource usage and higher throughput. You can also find a blog post by me with some numbers and more details.

Now, the reasons for putting together our own runtime here were the following:
1. We throttle the calls to `epoll()` or similar, i.e. the reactor. By doing so we reduce the number of wakeups (and thus context switches), which considerably reduces the CPU usage and increases the throughput a lot. See my blog post for some details. Maybe this is a feature that would also be useful in tokio?

1.1 Because of the throttling it was necessary to implement our own timer infrastructure, as tokio's timers don't know anything about the throttling and would usually be triggered much later than needed. By knowing the throttling interval, our own timers trigger at most half of the interval too early or too late, instead of on average half the interval too late (see the sketch below this list). Also, back in tokio 0.1 it seemed like the tokio interval timers were actually drifting when throttled.
1.2 The custom timers implementation had to be wrapped around the calls to `turn()` and also needed a way to `unpark()` the reactor whenever the list of timers changed in a way that the next wakeup would be earlier.

2. We run everything single-threaded on the current thread, i.e. on a current-thread executor.
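To make the timer argument in 1.1 concrete, here is a small illustrative sketch of the "round to the nearest wakeup" idea (not the actual gst-plugin-threadshare code; the numbers are examples):

```rust
use std::time::Duration;

// With wakeups every `interval`, a naive timer fires at the first
// wakeup *after* its deadline: late by up to a full interval, half an
// interval on average. Rounding the deadline to the nearest wakeup
// bounds the error to at most half an interval, early or late.
fn wakeups_until_fire(deadline: Duration, interval: Duration) -> u32 {
    let deadline_ms = deadline.as_millis() as u64;
    let interval_ms = interval.as_millis() as u64;
    // Integer rounding to the nearest multiple of the interval.
    ((deadline_ms + interval_ms / 2) / interval_ms) as u32
}

fn main() {
    let interval = Duration::from_millis(20);
    // A 25ms timer fires on the 1st wakeup (20ms, 5ms early) instead of
    // the 2nd (40ms, 15ms late) that a naive timer would pick.
    assert_eq!(wakeups_until_fire(Duration::from_millis(25), interval), 1);
    assert_eq!(wakeups_until_fire(Duration::from_millis(35), interval), 2);
}
```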
From what I can see, 2. is not necessary anymore nowadays with the `basic_scheduler()` feature of the runtime `Builder`. 1. is still necessary.

What would you suggest for moving forward with this? Adding such a throttling feature to tokio directly? Exposing ways to hook into the runtime behaviour for implementing this outside tokio again somehow? Anything else? :)