Panics when using tokio_taskdump
#6051
Comments
Thanks for reporting this. I think I know what causes it.
Previously, we attempted to trace tasks that were already notified, leading to panics. This PR modifies `transition_to_notified_for_tracing` (and its callers) to be fallible, thus avoiding this issue. Fixes tokio-rs#6051
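For readers following along, the idea of a fallible transition can be sketched with a toy state word. This is only an illustration under stated assumptions, not tokio's actual code: the `State` type, the `NOTIFIED` bit layout, and the `trace_task` helper are all hypothetical, and tokio's real state word carries more fields (reference counts, lifecycle bits). The point is that the caller learns whether the task was already notified and can skip it rather than hit an assertion.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical flag bit; tokio's real state word packs more fields.
const NOTIFIED: usize = 0b01;

/// Toy stand-in for a task's atomic state (not tokio's actual type).
pub struct State {
    bits: AtomicUsize,
}

impl State {
    pub fn new() -> Self {
        State { bits: AtomicUsize::new(0) }
    }

    /// Infallible style: asserts the task was not yet notified, so a race
    /// with another notifier turns into a panic.
    pub fn transition_to_notified_or_panic(&self) {
        let prev = self.bits.fetch_or(NOTIFIED, Ordering::AcqRel);
        assert!(prev & NOTIFIED == 0, "task was already notified");
    }

    /// Fallible style: returns `true` only if this call set the flag.
    pub fn try_transition_to_notified(&self) -> bool {
        self.bits
            .fetch_update(Ordering::AcqRel, Ordering::Acquire, |cur| {
                if cur & NOTIFIED != 0 {
                    None // already notified: leave the state untouched
                } else {
                    Some(cur | NOTIFIED)
                }
            })
            .is_ok()
    }
}

/// Hypothetical caller: trace a task only when the transition succeeded.
pub fn trace_task(state: &State) -> Option<&'static str> {
    if state.try_transition_to_notified() {
        Some("traced")
    } else {
        None // skip the task instead of panicking
    }
}
```

In this model, calling `trace_task` twice on the same `State` traces the task the first time and skips it the second time, whereas calling `transition_to_notified_or_panic` twice would panic.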
Tremendous apologies for the delay in fixing this. We should have a fix merged soon, but if you'd like to test it out in the meantime, check out #6194!
I thought this was fixed with your change above, but I saw a similar error today with tokio 1.35.0 in a proprietary app.
Looks like the task dump did complete; I guess it was worker threads that panicked rather than the caller of the dump code. Unfortunately this was a rare production thing, and I don't know how to reproduce it. Any suggestions for what info might be useful in understanding the problem, either retroactively (anything I can dig out of those dumps that might be helpful?) or something we can log for next time?
This may not be relevant at all, but the dump in question fired because we had some sort of yet-undiagnosed memory leak. The process then started to bog down with paging in the program binary and failed container health checks, and we have it configured to fire the dump on SIGTERM. I'm planning to set up heap profiling to figure out the memory leak. So far I know nothing: the memory leak could be something totally unrelated in my app or a third-party C++ library, or it could also be a tokio bug. Who knows.
The full backtrace would be useful if you have it. Could you file a new bug saying that this assert can still be triggered on 1.35.0? And you are sure that this happened in relation to a taskdump? It's not some unrelated thing going on?
#6343, with a full stack trace in the attachment.
Version
Tested in:
Platform
Linux komp 5.4.0-163-generic #180-Ubuntu SMP Tue Sep 5 13:21:23 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux
Description
Program panics when using `tokio_taskdump` and `tokio::time::sleep()` here: https://github.com/tokio-rs/tokio/blob/tokio-1.32.0/tokio/src/runtime/task/state.rs#L118
Details
I used the `examples/dump.rs` file as a template:
Ctrl+C to make it dump the tasks in a loop
Command to run the example:
It panics after a few iterations: