Tokio Util LongRunningTaskDetector. (https://github.com/tokio-rs/console/issues/150) #6256

zolyfarkas · 2023-12-30T16:48:31Z

Motivation

The motivation for this pull request is to provide a solution for the problem described at: tokio-rs/console#150

Blocking calls in the async workers lead to scalability issues, being able to detect them so that code can be fixed is very important for the scalability of an async-io service.

Solution

LongRunningTaskDetector uses a sampling black box approach to detect when processing in the tokio runtime in blocked for more than a configurable amount.

When blocking is detected the stack traces of the runtime worker threads can be dumped (action is plugable), to allow us to identify what is happening in the worker threads.

On detection, details can contain precise info about the blocking culprit and will look like:

Detected worker blocking, signaling SIGUSR1 worker threads: [123145546047488]
Stack trace for thread "test_blocking_detection_current":123145546047488
   0: std::backtrace_rs::backtrace::libunwind::trace
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/std/src/../../backtrace/src/backtrace/libunwind.rs:93:5
   1: std::backtrace_rs::backtrace::trace_unsynchronized
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2: std::backtrace::Backtrace::create
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/std/src/backtrace.rs:331:13
   3: std::backtrace::Backtrace::force_capture
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/std/src/backtrace.rs:313:9
   4: tokio_util::lrtd::lrtd::signal_handler
             at ./src/lrtd/lrtd.rs:221:21
   5: __sigtramp
   6: ___semwait_signal
   7: <unknown>
   8: std::sys::unix::thread::Thread::sleep
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/std/src/sys/unix/thread.rs:241:20
   9: std::thread::sleep
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/std/src/thread/mod.rs:872:5
  10: lrtd::run_blocking_stuff::{{closure}}
             at ./tests/lrtd.rs:12:5
  11: lrtd::test_blocking_detection_current::{{closure}}
             at ./tests/lrtd.rs:57:30
  12: <core::pin::Pin<P> as core::future::future::Future>::poll
             at /rustc/a28077b28a02b92985b3a3faecf92813155f1ea1/library/core/src/future/future.rs:125:9
  13: tokio::runtime::scheduler::current_thread::CoreGuard::block_on::{{closure}}::{{closure}}::{{closure}}
....

This is the output from the unit tests (run_blocking_stuff).

Not sure I put this in the right place, and there is likely some polishing left to do, but this should be good enough for some initial feedback.

… TOKIO-LRTD

tokio-util/src/lrtd/lrtd.rs

Darksonn · 2023-12-30T17:22:51Z

Before you implement a significant large new feature, it is really important that you talk to us first. This is crucial to ensure that you don't waste a lot of time on something that can't be merged.

The proposal here appears to have significant overlap with the taskdump proposal. However, it is special-cased to also include a detection mechanism for blocking tasks, and to trigger the task dump only when that detection triggers. I think that for something like this, it makes more sense to separate the task dump functionality from the detection of blocking tasks, so that the task dumps can be used in other cases as well.

The specific way that this is implemented will not work for the reasons that @sfackler points out. However, when @jswrenn investigated the different ways that task dumps can be implemented, he outlined one approach that is similar to this. However, for the reasons outlined in this document, we used a different implementation in the end. That said, I think it makes sense to add something similar to this in the future to handle "very stuck tasks", as is necessary for getting stack traces into tasks that are blocking the thread like you are doing here. Though, I'm not sure if we are ready to do that right now. At least, there are other improvements in the taskdump area that would be higher priority to me.

You may want to open a thread on our discord channel, or a new issue, for discussion of the best way forward here.

zolyfarkas · 2023-12-30T18:20:34Z

Thanks for your feedback, and for the extra info.

This is a "RFC PR" of some fun I had on a flight today. Even if this will not be merged, I would not consider it a waste of time :-).

I agree with separating the thread stack dump from the detection mechanism. I was not able to quickly find a good way to dump stack traces in rust for a bunch of threads, but if anyone can point me to a solution I would be happy to update the implementation. (or merge a PR against this PR)

Darksonn · 2023-12-30T19:14:35Z

This is the proposal that @jswrenn suggested for that:

Worker Pausing

Finally, I will implement the ability to trace currently-polling tasks. To do this, I will introduce a mechanism for pausing Tokio's worker threads, Handle::in_pause, that consumes a closure and runs it while Tokio's worker threads are paused. I have thought of two possible mechanisms for achieving this.

With ptrace

Linux's ptrace API provides a simple and reliable means for pausing a process's threads.

Upon invoking Handle::in_pause, Tokio will spawn-and-join a thread that:

locks the Shared structure at handle.blocking_spawner.inner.shared

forks the process

child:

uses ptrace to pause every PID in shared.worker_threads

notifies parent that all threads are paused

waits to receive notification from parent that tracing is complete

terminates

parent:

waits for child's notification that all workers are paused

traces the internal state all workers

notifies the child that tracing is complete

waits for the child to terminate

returns the trace

Special configuration during deployment is required for the ptrace API to be available. You must do either of the following:

globally allow applications to ptrace their parents by writing 1 to /proc/sys/kernel/yama/ptrace_scope

set the CAP_SYS_PTRACE capability on your application's binary, to allow it to ptrace arbitrary processes

A prototype implementation of this approach is viewable here: commit b999c71

With Signal Interrupts

Alternatively, we can pause Tokio's workers by registering a signal handler that blocks until tracing is complete. Upon invoking Handle::in_pause, Tokio will spawn-and-join a thread that sends some particular signal to each of Tokio's worker threads, traces the runtime, then notifies the signal handlers to unblock and return.

The primary downside of this approach is the risk of choosing a signal kind that's already handled by the user's application. Special care will need to be taken to avoid deadlocks, too.

This was taken from the latest edit on Feb 15 of the top-level post in #5457. (It has since been edited.)

Darksonn

If you want to move forward with the taskdump component of this, then I think that the best way to do that would be to modify the taskdumps in the main Tokio crate to have special handling for the case where there is a very stuck task that isn't returning from poll. In that case, it would use signals or ptrace to get a stacktrace into that task, instead of the usual mechanism for tasks that are well-behaved.

tokio-util/Cargo.toml

… TOKIO-LRTD

Darksonn · 2024-01-01T19:21:10Z

It looks like you moved the long-running task detector into the Tokio crate, and removed the code for capturing taskdumps of blocking tasks.

My previous comment was mostly referring to taskdumps of blocking tasks. I don't think the long-running task detector is a good fit for the main Tokio crate. We generally try to keep debugging features outside of Tokio when that is possible, and in this case, that is possible. Perhaps it would be a better fit for tokio-metrics, or an entirely new crate instead?

zolyfarkas · 2024-01-01T20:50:14Z

My previous comment was mostly referring to taskdumps of blocking tasks.

I am still experimenting with the task dump code, but based on what I am seeing I am not sure it is practical for LRTD due to the fact that LRTD does detect "very stuck tasks" (and it is important that it does) and from what I see at that is not something supported?

I think a "traditional stack trace" obtained via a productionized version of what I use in one of the unit tests is the way to go. (This can happen at a later stage)

We generally try to keep debugging features outside of Tokio when that is possible, and in this case, that is possible. Perhaps it would be a better fit for tokio-metrics, or an entirely new crate instead?

I agree, lrtd is an independent component, I can easily move it to a separate crate, does not really fit into a "metrics" crate.
One minor issue is that I will not be able to use FastRand then, and I will likely have to copy something like FastRand instead of just referencing it.

Darksonn · 2024-01-01T20:56:45Z

Right, we do not support detection of very stuck tasks right now, and as the article points out, it isn't as critical as you can debug them with gdb. But I do think that it would be an interesting feature to have (with one of the implementations outlined in #6256 (comment)). If nothing else, we probably need to do something more intelligent than just hanging the task dump forever like we do today.

One minor issue is that I will not be able to use FastRand then, and I will likely have to copy something like FastRand instead of just referencing it.

If it's not in the main Tokio crate, then I don't object to the use of the rand crate. Or, of course, you can just copy its implementation.

zolyfarkas · 2024-01-01T22:40:11Z

@Darksonn I am thinking of best location for now is tokio-util as a lrtd feature. Not sure it is worth creating a separate crate for this... What do you think?

The implementation allows a custom handler to be invoked on detection, so any future utility/approach can be easily used.

Darksonn · 2024-01-01T23:27:41Z

I think that the reality is that none of our crates have anything similar to your lrtd utility, and there's no great place to put it. I don't think tokio-util is a good fit. It's a crate with utilities for writing applications, not debugging tools.

Your original motivation was to fix tokio-rs/console#150. Perhaps it makes more sense to pursue a lint in tokio-console? A lint in that tool would be much less "ad hoc" and more easily discoverable, since anyone using tokio-console would get it automatically. Furthermore, tokio-console is already my standard recommendation for debugging blocking tasks.

Alternatively, the tokio-metrics crate provides something called task metrics, and it would be useful to have a task metric for whether a specific task is blocking. Of course, I acknowledge that this would require an entirely different implementation, and that you have already spent time on the current one ... but something using a different implementation based on task metrics would be a good fit for tokio-metrics.

As it stands right now, I don't think I can accept the current implementation strategy for detecting long running tasks anywhere in Tokio. If you want to publish it, you will need to make your own crate for it. Therefore, I will close this PR.

If you want to pursue an implementation of capturing taskdumps for blocking tasks, then please create a new PR for that. An implementation of that is welcome and I would be happy to review it.

zolyfarkas · 2024-01-02T12:01:53Z

It's a crate with utilities for writing applications, not debugging tools.

I will provide some extra context here on how we have been using LRTD (for python and c++) for the past few years. We are using it continuously in production, and some of the decisions (on top of emitting debug detail) one might take on blocking detection is to report the node unhealthy so that the container can be recycled sooner. The emitted stack trace details enable owners to quickly identify the issue and push out a fix, further enhancing the scalability and responsiveness of their services. The blocking event counts are used to test the health of the canary tier, etc... (I see here your point about tokio-metrics)

Lrtd is designed to have very small(configurable) overhead so that it can be safely run in production, it is a little bit more than a debug tool.

@Darksonn Right now I am thinking tokio-metrics, let me know what you think...

zolyfarkas · 2024-01-02T18:20:10Z

Will continue this in: tokio-rs/tokio-metrics#57

carllerche · 2024-01-02T18:51:56Z

Thanks for the proof of concept. I do think solving the "detect blocking tasks" feature would be nice to get.

I looked through this, and if I understand correctly, it will detect if all runtime threads are 100% blocked, but if a single thread is blocked, it won't. It also doesn't detect tasks that "block" for short periods (say 100ms). Ideally, we will develop a strategy that handles both of these cases, and I believe it will require work in the Tokio runtime itself.

At a high level, an epoch-based strategy will work best. Each worker thread has a monotonically increasing number that represents its epoch. The worker thread will increment its epoch every so often ¹. There will then be a "monitoring" thread that periodically checks that each worker's epoch increases regularly. This strategy would at least detect that there are tasks with slow poll times with minimal overhead. Identifying the exact ones that are slow would require some additional work.

However, your utility to detect a blocked runtime is generally useful. It would be better to release it as your crate instead of pulling it in tokio-util or tokio-metrics, as we have limited maintenance bandwidth and want a more comprehensive solution in the long term. If you'd like to work on the runtime to detect slow polls, please try pinging me on Discord, and we can talk more about that.

probably some combination of every N polled tasks & checking elapsed duration. ↩

zolyfarkas added 13 commits December 28, 2023 12:02

First "working" POC.

1ea1bb3

First "working" POC to implement tokio-rs/console#150

03f3bcc

Merge branch 'TOKIO-LRTD' of https://github.com/zolyfarkas/tokio into…

9dcefab

… TOKIO-LRTD

[cleanup] remove unused method.

77c61f5

Add ability to hook actions on detection.

e6c9c6c

[add] document action traits + minor improvement.

144cfff

[add] More documentation + test.

6f00e2e

[add] one more comment.

fb4cc02

[cleanup] fix all warnings + code cleanup.

5247795

Fix tests + remove unsafe stuff.

73c8bc5

FIX formatting

6e77b5e

Cleanup unnecessary drop.

df172e8

[fix] doctest

86bbf4c

sfackler reviewed Dec 30, 2023

View reviewed changes

tokio-util/src/lrtd/lrtd.rs Outdated Show resolved Hide resolved

Darksonn added A-tokio-util Area: The tokio-util crate M-taskdump --cfg tokio_taskdump labels Dec 30, 2023

Merge branch 'master' into TOKIO-LRTD

f5e1ab0

Fix formatting issue

f41f470

Darksonn reviewed Dec 30, 2023

View reviewed changes

tokio-util/Cargo.toml Outdated Show resolved Hide resolved

tokio-util/Cargo.toml Outdated Show resolved Hide resolved

zolyfarkas added 8 commits December 31, 2023 05:05

remove nix dependency

e978a2c

Separate out blocking detection from thread state dump.

ab43c31

Move lrtd into tokio (out of tokio-util)

97572a9

Move lrtd into tokio (out of tokio-util)

78deaf8

Merge branch 'TOKIO-LRTD' of https://github.com/zolyfarkas/tokio into…

8a66f5b

… TOKIO-LRTD

Fix some merge issue

067b0c4

Fix lint issues.

a9f009b

Cleanup lint issues.

22b4df3

zolyfarkas added 3 commits January 1, 2024 09:44

FIX more clippy lints

3de7422

undo some unintended changes

a9caaa9

make is_current_threaded public

3561be9

Darksonn closed this Jan 1, 2024

zolyfarkas added a commit to zolyfarkas/tokio-metrics that referenced this pull request Jan 2, 2024

Moved LRTD from tokio-rs/tokio#6256

01bf082

zolyfarkas mentioned this pull request Jan 2, 2024

Tokio LongRunningTaskDetector. (https://github.com/tokio-rs/console/issues/150) tokio-rs/tokio-metrics#57

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Tokio Util LongRunningTaskDetector. (https://github.com/tokio-rs/console/issues/150) #6256

Tokio Util LongRunningTaskDetector. (https://github.com/tokio-rs/console/issues/150) #6256

zolyfarkas commented Dec 30, 2023 •

edited

Loading

Darksonn commented Dec 30, 2023

zolyfarkas commented Dec 30, 2023

Darksonn commented Dec 30, 2023

Worker Pausing

With ptrace

With Signal Interrupts

Darksonn left a comment

Darksonn commented Jan 1, 2024

zolyfarkas commented Jan 1, 2024

Darksonn commented Jan 1, 2024

zolyfarkas commented Jan 1, 2024

Darksonn commented Jan 1, 2024

zolyfarkas commented Jan 2, 2024 •

edited

Loading

zolyfarkas commented Jan 2, 2024

carllerche commented Jan 2, 2024

Tokio Util LongRunningTaskDetector. (https://github.com/tokio-rs/console/issues/150) #6256

Tokio Util LongRunningTaskDetector. (https://github.com/tokio-rs/console/issues/150) #6256

Conversation

zolyfarkas commented Dec 30, 2023 • edited Loading

Motivation

Solution

Darksonn commented Dec 30, 2023

zolyfarkas commented Dec 30, 2023

Darksonn commented Dec 30, 2023

Worker Pausing

With ptrace

With Signal Interrupts

Darksonn left a comment

Choose a reason for hiding this comment

Darksonn commented Jan 1, 2024

zolyfarkas commented Jan 1, 2024

Darksonn commented Jan 1, 2024

zolyfarkas commented Jan 1, 2024

Darksonn commented Jan 1, 2024

zolyfarkas commented Jan 2, 2024 • edited Loading

zolyfarkas commented Jan 2, 2024

carllerche commented Jan 2, 2024

Footnotes

zolyfarkas commented Dec 30, 2023 •

edited

Loading

zolyfarkas commented Jan 2, 2024 •

edited

Loading