completed task shows warning for having lost its waker #345
Comments
I think the issue is that the task's span is (incorrectly) being kept alive by the child task's span. We'll probably want to change the |
I think I've encountered this issue. I also want to highlight that the task that has completed (I've I also don't think that this issue requires using |
Another option that could be worth considering is giving the task spans some way of explicitly indicating that the task has terminated (e.g. it could
|
I'm sorry for the off-topic discussion, but searching for this warning brought me here. I'm new to async in Rust and saw this warning while debugging an issue. Can anyone point me towards a resource that explains why this warning occurs (i.e. what pattern in async Rust leads to such a scenario and how to fix it)? I'm trying to determine whether it's a console bug as indicated in this issue, or a real bug in my own code. |
I'm seeing this when using actix-web. tokio-console reports lost wakers in:

~/.cargo/registry/src/github.com-1ecc6299db9ec823/actix-rt-2.7.0/src/runtime.rs:80:20

which look like regular uses of

I also get one in:

~/.cargo/registry/src/github.com-1ecc6299db9ec823/hyper-0.14.23/src/common/exec.rs:49:21

and another reported leak is in tokio itself:

~/.cargo/registry/src/github.com-1ecc6299db9ec823/tokio-1.23.0/src/task/local.rs:586:35

Is this something I should be worried about? My server leaks about 20 wakers almost instantly, and it usually goes up to 50 before it becomes quite unresponsive. I don't know if that's the cause, or a symptom of some other issue, or just something unrelated. |
I'm seeing this when holding on to a

    use std::sync::{Arc, Mutex};
    use tokio::{sync::mpsc, task::JoinSet};

    async fn handler(holder: Arc<Mutex<Vec<mpsc::WeakSender<()>>>>) {
        let (handler_tx, _) = mpsc::channel::<()>(1);
        let mut holder = holder.lock().unwrap();
        holder.push(handler_tx.downgrade()); // Push `WeakSender` to Vec.
        drop(holder);
        panic!(); // Not required to reproduce.
    }

    #[tokio::main(flavor = "current_thread")]
    async fn main() {
        console_subscriber::init();
        let mut set = JoinSet::new();
        let holder = Arc::new(Mutex::new(Vec::new()));
        for _ in 0..16 {
            set.spawn(handler(holder.clone()));
        }
        // Properly await all exited tasks.
        while let Some(_) = set.join_next().await {}
        // Keep the runtime alive so the tasks can be inspected in tokio-console.
        std::future::pending::<()>().await;
    }

When run, I would expect this to leak the 16 |
In Tokio, tasks are optionally instrumented with tracing spans to allow analysis of the runtime behavior to be performed with tools like tokio-console. The span that is created for each task currently follows the default tracing behavior and has a contextual parent attached to it, based on the span that is current when `tokio::spawn` or similar is called.

However, in tracing, a span will remain "alive" until all its child spans are closed. This doesn't match how spawned tasks work. A task may outlive the context in which it was spawned (and frequently does). This causes tasks which spawn other - longer living - tasks to appear in `tokio-console` as having lost their waker when instead they should be shown as completed (tokio-rs/console#345). It can also cause undesired behavior for unrelated tracing spans if a subscriber is receiving both the other spans as well as Tokio's instrumentation.

To fix this mismatch in behavior, the task span has `parent: None` set on it, making it an explicit root - it has no parent. The same was already done for all spans representing resources in #6107. This change is made within the scope of #5792.

Due to a defect in the currently available `tracing-mock` crate, it is not possible to test this change at a tracing level (tokio-rs/tracing#2440). Instead, a test for the `console-subscriber` has been written which shows that this change fixes the defect as observed in `tokio-console` (tokio-rs/console#490).
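To make the lifetime mismatch concrete, here is a minimal std-only sketch. It is a toy model, not the `tracing` implementation: the `Registry` and `SpanData` types and the `outer_closed_after_exit` helper are hypothetical names invented for illustration. The only rule it encodes is the one described above: a span closes once its own handle is dropped and all of its children have closed.

```rust
use std::collections::HashMap;

/// Toy span registry (hypothetical, NOT tracing's real implementation).
#[derive(Default)]
struct Registry {
    next_id: u64,
    spans: HashMap<u64, SpanData>,
}

struct SpanData {
    parent: Option<u64>,
    open_children: usize,
    handle_dropped: bool,
    closed: bool,
}

impl Registry {
    /// Create a span; `parent: None` makes it an explicit root.
    fn new_span(&mut self, parent: Option<u64>) -> u64 {
        let id = self.next_id;
        self.next_id += 1;
        if let Some(p) = parent {
            self.spans.get_mut(&p).expect("parent must exist").open_children += 1;
        }
        self.spans.insert(
            id,
            SpanData { parent, open_children: 0, handle_dropped: false, closed: false },
        );
        id
    }

    /// The owner of the span (e.g. a task) has finished and dropped it.
    fn drop_handle(&mut self, id: u64) {
        self.spans.get_mut(&id).expect("span must exist").handle_dropped = true;
        self.try_close(id);
    }

    /// A span closes only when its handle is dropped AND no child is open.
    fn try_close(&mut self, id: u64) {
        let span = self.spans.get_mut(&id).expect("span must exist");
        if span.closed || !span.handle_dropped || span.open_children > 0 {
            return;
        }
        span.closed = true;
        let parent = span.parent;
        if let Some(p) = parent {
            self.spans.get_mut(&p).expect("parent must exist").open_children -= 1;
            self.try_close(p);
        }
    }

    fn is_closed(&self, id: u64) -> bool {
        self.spans[&id].closed
    }
}

/// An outer task spawns an inner task and then exits while the inner task
/// keeps running. Returns whether the outer task's span got closed.
fn outer_closed_after_exit(explicit_root: bool) -> bool {
    let mut registry = Registry::default();
    let outer = registry.new_span(None);
    let inner_parent = if explicit_root { None } else { Some(outer) };
    let _inner = registry.new_span(inner_parent); // never dropped: still running
    registry.drop_handle(outer);
    registry.is_closed(outer)
}

fn main() {
    println!("contextual parent: outer closed = {}", outer_closed_after_exit(false));
    println!("explicit root:     outer closed = {}", outer_closed_after_exit(true));
}
```

In this model the outer span stays open with a contextual parent and closes immediately when the inner span is an explicit root, which mirrors the before and after behavior the commit message describes.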
In the Tokio instrumentation, a tracing span is created for each task which is spawned. Since the new span is created within the context of where `tokio::spawn()` (or similar) is called from, it gets a contextual parent attached.

In tracing, when a span has a child span (either because the child was created in the context of the parent, or because the parent was set explicitly) then that span will not be closed until the child has closed. The result in the console subscriber is that a task which spawns another task won't have a `dropped_at` time set until the spawned task exits, even if the parent task exits much earlier. This causes Tokio Console to show an incorrect lost waker warning (#345). It also affects other spans that are entered when a task is spawned (#412).

The solution is to modify the instrumentation in Tokio so that task spans are explicit roots (`parent: None`). This will be done as part of enriching the Tokio instrumentation (tokio-rs/tokio#5792).

This change adds functionality to the test framework within `console-subscriber` so that the state of a task can be set as an expectation. The state is calculated based on 4 values:

* `console_api::tasks::Stats::dropped_at`
* `console_api::tasks::Stats::last_wake`
* `console_api::PollStats::last_poll_started`
* `console_api::PollStats::last_poll_ended`

It can then be tested that a task that spawns another task and then ends actually goes to the `Completed` state, even if the spawned task is still running. As of Tokio 1.34.0, this test fails, but the PR tokio-rs/tokio#6158 fixes this and the test should pass from Tokio 1.35 onwards.
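As a rough illustration of how a state could be derived from those four values, here is a std-only sketch. It is an assumption about the general shape, not the code in `console-subscriber`: the `TaskStats` struct, the plain `u64` tick timestamps, the `task_state` function, the check order, and every state name other than `Completed` are hypothetical.

```rust
/// Hypothetical stand-in for the four timestamps listed above.
/// Ticks increase monotonically; `None` means "has not happened yet".
#[derive(Debug, Clone, Copy, Default)]
struct TaskStats {
    dropped_at: Option<u64>,
    last_wake: Option<u64>,
    last_poll_started: Option<u64>,
    last_poll_ended: Option<u64>,
}

/// Only `Completed` is named in the text above; the rest are assumed names.
#[derive(Debug, PartialEq, Eq)]
enum TaskState {
    Completed,
    Running,
    Scheduled,
    Idle,
}

/// Assumed derivation. Relies on `Option` ordering, where `None < Some(_)`.
fn task_state(stats: &TaskStats) -> TaskState {
    if stats.dropped_at.is_some() {
        // The task span has closed, so the task is finished.
        return TaskState::Completed;
    }
    if stats.last_poll_started > stats.last_poll_ended {
        // A poll has started that has not ended yet.
        return TaskState::Running;
    }
    if stats.last_wake > stats.last_poll_started {
        // Woken since the last poll started, but not polled again yet.
        return TaskState::Scheduled;
    }
    TaskState::Idle
}

fn main() {
    // A task that has finished polling but whose span is kept open by a
    // child span: `dropped_at` is never set, so it is not seen as completed.
    let kept_open = TaskStats {
        dropped_at: None,
        last_wake: Some(1),
        last_poll_started: Some(2),
        last_poll_ended: Some(3),
    };
    println!("span kept open by child: {:?}", task_state(&kept_open));

    // The same task once its span is an explicit root and closes on exit.
    let closed = TaskStats { dropped_at: Some(4), ..kept_open };
    println!("span closed on exit:     {:?}", task_state(&closed));
}
```

Under these assumptions, the first task is reported as `Idle` and the second as `Completed`, which is the difference the new test expectation is meant to catch.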
This issue is now fixed in the Tokio instrumentation on |
I think this can be closed now, the new tokio version includes the fix |
* first iteration: add `tracing` feature of tokio, and the console-subscriber
* update on tokio version, based on: tokio-rs/console#345 (comment)
* use `tracing` library, instead of log.
* remove unnecessary logs; add necessary logs; add timeout in `read_message_from_stellar` because it gets stuck
* cleanup zombie task
* Update README.md add documentation of the `tokio-console`.
* remove duplicate `Parachain Block Listener`
* https://github.com/pendulum-chain/spacewalk/pull/517/files#r1600267305
* #517 (comment), #517 (comment),
* https://github.com/pendulum-chain/spacewalk/actions/runs/9094123182/job/24994504193?pr=517
* remoce https://github.com/pendulum-chain/spacewalk/actions/runs/9095519314/job/24998905803?pr=517
* https://github.com/pendulum-chain/spacewalk/actions/runs/9096987912/job/25003758078?pr=517
* https://github.com/pendulum-chain/spacewalk/actions/runs/9098087121/job/25007476694?pr=517
* https://github.com/pendulum-chain/spacewalk/actions/runs/9108826563/job/25040418069?pr=517
When instrumenting resources in Tokio, a span is created for each resource. Previously, all resources inherited the currently active span as their parent (tracing default). However, this would keep that parent span alive until the resource (and its span) were dropped. This is often not correct, as a resource may be created in a task and then sent elsewhere, while the originating task ends.

This artificial extension of the parent span's lifetime would make it look like that task was still alive (but idle) in any system reading the tracing instrumentation in Tokio, for example Tokio Console as reported in tokio-rs/console#345.

In #6107, most of the existing resource spans were updated to make them explicit roots, so they have no contextual parent. However, two were missed:

- `Sleep`
- `BatchSemaphore`

This change alters the resource spans for those two resources to also make them explicit roots (so that the span doesn't have a parent). This is necessary because otherwise a resource that is created inside a task (or some other span) and then sent elsewhere will keep that contextual parent span open (but not necessarily active) for the lifetime of the resource itself.
…ts (#6727)

When instrumenting resources in Tokio, a span is created for each resource. Previously, all resources inherited the currently active span as their parent (tracing default). However, this would keep that parent span alive until the resource (and its span) were dropped. This is often not correct, as a resource may be created in a task and then sent elsewhere, while the originating task ends.

This artificial extension of the parent span's lifetime would make it look like that task was still alive (but idle) in any system reading the tracing instrumentation in Tokio, for example Tokio Console as reported in tokio-rs/console#345.

In #6107, most of the existing resource spans were updated to make them explicit roots, so they have no contextual parent. However, two were missed:

- `Sleep`
- `BatchSemaphore`

This change alters the resource spans for those two resources to also make them explicit roots.
With Tokio version 1.39.2 and console-subscriber v0.4.0, I still see such warnings. It seems to be an incorrect warning. |
@gftea Could you please provide a screenshot showing the tasks that are showing the warnings? And also describe why you believe that it is incorrect? |
According to tokio console, that task has been idle for over an hour and has never been woken, which would imply that it awaited at line 62 ( Do you believe the task has actually finished? |
Yes, this is the task that waits for TERM/KILL signals, so it is a long-waiting task that runs until the service is shut down by the user. I tested it: after this was reported, I sent a kill signal and the task handled the shutdown signal correctly. |
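For readers wondering what the warning discussed in this thread is checking, the following std-only sketch shows one plausible shape of the condition. It is an assumption, not code copied from tokio-console: the `TaskView` struct, its field names, and `has_lost_waker` are hypothetical, and the real check may differ in detail.

```rust
/// Hypothetical snapshot of what the console knows about a task.
struct TaskView {
    /// The console has seen the task's span close.
    completed: bool,
    /// The task is currently being polled.
    running: bool,
    /// The task has been woken since it was last polled.
    awakened: bool,
    /// Number of live waker clones the console is tracking for the task.
    waker_count: u64,
}

/// Assumed condition: nothing can wake the task again, yet it has not
/// finished, is not being polled, and has no pending wake-up.
fn has_lost_waker(task: &TaskView) -> bool {
    !task.completed && task.waker_count == 0 && !task.running && !task.awakened
}

fn main() {
    // The false positive from this issue: the task has really finished, but
    // its span is kept open by a child span, so `completed` is still false.
    let finished_but_span_open =
        TaskView { completed: false, running: false, awakened: false, waker_count: 0 };
    println!("span still open: warn = {}", has_lost_waker(&finished_but_span_open));

    // Once the span closes when the task exits, the warning goes away.
    let finished = TaskView { completed: true, ..finished_but_span_open };
    println!("span closed:     warn = {}", has_lost_waker(&finished));
}
```

Under this assumed condition, a finished task whose span never closes cannot be told apart from a task that is genuinely stuck, which is why the fix is in how the spans are parented and not in the warning itself.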
What crate(s) in this repo are involved in the problem?
tokio-console
What is the issue?
tokio-console shows a warning for a task having lost its waker, but the task should be finished.
I think it's related to the task spawning another task.
How can the bug be reproduced?
The problem can be reproduced with the following sample application.
Connect to 127.0.0.1:5555 and it will spawn 2 tasks. The first one should be completed after 1 second, but in tokio-console it is shown as having lost its waker.
Logs, error output, etc
Versions
console crates versions:
tokio-console is built from git, version:
Possible solution
No response
Additional context
No response
Would you like to work on fixing this bug?
maybe