Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Panic with tokio::time on shutdown #2789

Closed
Marwes opened this issue Aug 24, 2020 · 16 comments · Fixed by #3228
Closed

Panic with tokio::time on shutdown #2789

Marwes opened this issue Aug 24, 2020 · 16 comments · Fixed by #3228
Labels
A-tokio Area: The main tokio crate C-bug Category: This is a bug. M-time Module: tokio/time

Comments

@Marwes
Copy link
Contributor

Marwes commented Aug 24, 2020

Version

Current master 9d58b70

Platform

Linux markus-XPS-15-9570 4.15.0-112-generic #113-Ubuntu SMP Thu Jul 9 23:41:39 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

Description

Were trying to create a minimal reproduction of an intermittent hang during shutdown that I am seeing. But instead I found this panic. This may or may not be related to my real issue but I imagine it ought to be fixed as well.

#[tokio::test(threaded_scheduler)]

async fn hang_on_shutdown() {
    let (sync_tx, sync_rx) = std::sync::mpsc::channel::<()>();
    tokio::spawn(async move {
        tokio::task::block_in_place(|| sync_rx.recv().ok());
    });

    tokio::spawn(async {
        tokio::time::delay_for(std::time::Duration::from_secs(5)).await;
        drop(sync_tx);
    });
    tokio::time::delay_for(std::time::Duration::from_secs(1)).await;
}
running 1 test
thread 'tokio-runtime-worker' panicked at 'elapsed=18446744073709551615; when=1003', tokio/src/time/wheel/mod.rs:223:9
stack backtrace:
   0: rust_begin_unwind
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/std/src/panicking.rs:475
   1: std::panicking::begin_panic_fmt
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/std/src/panicking.rs:429
   2: tokio::time::wheel::Wheel<T>::set_elapsed
             at ./src/time/wheel/mod.rs:223
   3: tokio::time::wheel::Wheel<T>::poll
             at ./src/time/wheel/mod.rs:160
   4: tokio::time::driver::Driver<T>::process
             at ./src/time/driver/mod.rs:168
   5: <tokio::time::driver::Driver<T> as tokio::park::Park>::park
             at ./src/time/driver/mod.rs:275
   6: <tokio::park::either::Either<A,B> as tokio::park::Park>::park
             at ./src/park/either.rs:28
   7: tokio::runtime::park::Inner::park_driver
             at ./src/runtime/park.rs:205
   8: tokio::runtime::park::Inner::park
             at ./src/runtime/park.rs:137
   9: <tokio::runtime::park::Parker as tokio::park::Park>::park
             at ./src/runtime/park.rs:93
  10: tokio::runtime::thread_pool::worker::Core::shutdown
             at ./src/runtime/thread_pool/worker.rs:570
  11: tokio::runtime::thread_pool::worker::Shared::shutdown
             at ./src/runtime/thread_pool/worker.rs:794
  12: tokio::runtime::thread_pool::worker::Context::run
             at ./src/runtime/thread_pool/worker.rs:335
  13: tokio::runtime::thread_pool::worker::run::{{closure}}
             at ./src/runtime/thread_pool/worker.rs:305
  14: tokio::macros::scoped_tls::ScopedKey<T>::set
             at ./src/macros/scoped_tls.rs:63
  15: tokio::runtime::thread_pool::worker::run
             at ./src/runtime/thread_pool/worker.rs:302
  16: tokio::runtime::thread_pool::worker::Launch::launch::{{closure}}
             at ./src/runtime/thread_pool/worker.rs:281
  17: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
             at ./src/runtime/blocking/task.rs:41
  18: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at ./src/runtime/task/core.rs:173
  19: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at ./src/loom/std/unsafe_cell.rs:14
  20: tokio::runtime::task::core::Core<T,S>::poll
             at ./src/runtime/task/core.rs:158
  21: tokio::runtime::task::harness::Harness<T,S>::poll::{{closure}}
             at ./src/runtime/task/harness.rs:107
  22: core::ops::function::FnOnce::call_once
             at /home/markus/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:233
  23: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /home/markus/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:318
  24: std::panicking::try::do_call
             at /home/markus/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:373
  25: __rust_try
  26: std::panicking::try
             at /home/markus/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:337
  27: std::panic::catch_unwind
             at /home/markus/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:394
  28: tokio::runtime::task::harness::Harness<T,S>::poll
             at ./src/runtime/task/harness.rs:89
  29: tokio::runtime::task::raw::poll
             at ./src/runtime/task/raw.rs:104
  30: tokio::runtime::task::raw::RawTask::poll
             at ./src/runtime/task/raw.rs:66
  31: tokio::runtime::task::Notified<S>::run
             at ./src/runtime/task/mod.rs:169
  32: tokio::runtime::blocking::pool::Inner::run
             at ./src/runtime/blocking/pool.rs:250
  33: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}::{{closure}}
             at ./src/runtime/blocking/pool.rs:230
  34: tokio::runtime::context::enter
             at ./src/runtime/context.rs:72
  35: tokio::runtime::handle::Handle::enter
             at ./src/runtime/handle.rs:76
  36: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
             at ./src/runtime/blocking/pool.rs:229
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
test hang_on_shutdown ... test hang_on_shutdown has been running for over 60 seconds
@Marwes Marwes added A-tokio Area: The main tokio crate C-bug Category: This is a bug. labels Aug 24, 2020
@Darksonn Darksonn added the M-time Module: tokio/time label Aug 24, 2020
@blasrodri
Copy link
Contributor

I checked and

-#[tokio::test(threaded_scheduler)]
+#[tokio::test(basic_scheduler)]

makes it work

@blasrodri
Copy link
Contributor

blasrodri commented Aug 26, 2020

I've gathered some results. Unfortunately, they do not give me any clarity on where the bug(s) is.

Note: I add some debugs:

diff --git a/tokio/src/time/wheel/mod.rs b/tokio/src/time/wheel/mod.rs
index a2ef27fc..dcfd500e 100644
--- a/tokio/src/time/wheel/mod.rs
+++ b/tokio/src/time/wheel/mod.rs
@@ -221,7 +221,7 @@ where
 
     fn set_elapsed(&mut self, when: u64) {
         assert!(
-            self.elapsed <= when,
+            dbg!(self.elapsed) <= dbg!(when),
             "elapsed={:?}; when={:?}",
             self.elapsed,
             when

Test I'm running:

#[tokio::test(threaded_scheduler)]
// #[tokio::test(basic_scheduler)]
async fn hang_on_shutdown() {
    let (sync_tx, sync_rx) = std::sync::mpsc::channel::<()>();
    tokio::spawn(async move {
        tokio::task::block_in_place(|| sync_rx.recv().ok());
    });

    tokio::spawn(async {
        tokio::time::delay_for(std::time::Duration::from_secs(5)).await;
        drop(sync_tx);
    });
    tokio::time::delay_for(std::time::Duration::from_secs(1)).await;
}

Output on different versions

0ccc09a

Panics!

    Finished test [unoptimized + debuginfo] target(s) in 0.03s
     Running target/debug/deps/tokio_test-6db8d7b23010a853

running 1 test
thread 'tokio-runtime-worker' panicked at 'elapsed=18446744073709551615; when=1006', /Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:223:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    Finished test [unoptimized + debuginfo] target(s) in 0.03s
     Running target/debug/deps/tokio_test-6db8d7b23010a853

running 1 test
thread 'tokio-runtime-worker' panicked at 'elapsed=18446744073709551615; when=1003', /Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:223:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
   Compiling tokio v0.3.0 (/Users/bruger/projects/foss/tokio/tokio)
   Compiling tokio-test v0.1.0 (/Users/bruger/projects/foss/tokio-test)
    Finished test [unoptimized + debuginfo] target(s) in 3.11s
     Running target/debug/deps/tokio_test-6db8d7b23010a853

running 1 test
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 965
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 965
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1002
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 1002
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 4096
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 4096
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 4992
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 4992
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 5001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 5001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 18446744073709551615
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 18446744073709551615
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1003
thread 'tokio-runtime-worker' panicked at 'elapsed=18446744073709551615; when=1003', /Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:223:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
    Finished test [unoptimized + debuginfo] target(s) in 0.03s
     Running target/debug/deps/tokio_test-6db8d7b23010a853

running 1 test
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1002
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 1002
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 4096
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 4096
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 4992
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 4992
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 5001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 5001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 18446744073709551615
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 18446744073709551615
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1003
thread 'tokio-runtime-worker' panicked at 'elapsed=18446744073709551615; when=1003', /Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:223:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

21f7260 (v0.22.2)

doesn't panic, but hangs.

   Compiling tokio-macros v0.2.5 (/Users/bruger/projects/foss/tokio/tokio-macros)
   Compiling tokio v0.2.22 (/Users/bruger/projects/foss/tokio/tokio)
   Compiling tokio-test v0.1.0 (/Users/bruger/projects/foss/tokio-test)
    Finished test [unoptimized + debuginfo] target(s) in 5.26s
     Running target/debug/deps/tokio_test-7ef8d298bcf6e80a

running 1 test
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 1
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 961
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 961
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1002
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 1002
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1002
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 1002
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 4096
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 4096
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 4100
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 4100
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 4992
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 4992
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 4997
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 4997
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 5001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 5001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 5001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 5001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 5002
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 5002
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 5002
    Finished test [unoptimized + debuginfo] target(s) in 0.03s
     Running target/debug/deps/tokio_test-7ef8d298bcf6e80a

running 1 test
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 965
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 965
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1002
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 1002
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1002
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 1002
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 18446744073709551615
test hang_on_shutdown ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

    Finished test [unoptimized + debuginfo] target(s) in 0.03s
     Running target/debug/deps/tokio_test-7ef8d298bcf6e80a

running 1 test
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 965
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 965
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1006
test hang_on_shutdown ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

    Finished test [unoptimized + debuginfo] target(s) in 0.02s
     Running target/debug/deps/tokio_test-7ef8d298bcf6e80a

running 1 test
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 961
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 961
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] self.elapsed = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:224] when = 1006
test hang_on_shutdown ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured; 0 filtered out

@Marwes
Copy link
Contributor Author

Marwes commented Sep 15, 2020

Looked into this a bit. It appears that when the timer wheel shuts down it is supposed to be completely drained by adding a long "elapsed" time

let mut poll = wheel::Poll::new(u64::MAX);
. This then gets set in set_elapsed
fn set_elapsed(&mut self, when: u64) {
assert!(
self.elapsed <= when,
"elapsed={:?}; when={:?}",
self.elapsed,
when
);
if when > self.elapsed {
self.elapsed = when;
}
}
.

However, in the threaded scheduler there is a race such that this "poll" function gets entered again after this, but this time it receives the normal and the assertion is triggered (not sure why all this happens though).

     Running target/debug/deps/rt_threaded-a20c4a90995b564f

running 1 test
[tokio/src/time/wheel/mod.rs:223] (self.elapsed, when) = (
    0,
    1,
)
[tokio/src/time/wheel/mod.rs:223] (self.elapsed, when) = (
    1,
    1,
)
[tokio/src/time/wheel/mod.rs:223] (self.elapsed, when) = (
    1,
    1,
)
[tokio/src/time/wheel/mod.rs:223] (self.elapsed, when) = (
    1,
    960,
)
[tokio/src/time/wheel/mod.rs:223] (self.elapsed, when) = (
    960,
    961,
)
[tokio/src/time/wheel/mod.rs:223] (self.elapsed, when) = (
    961,
    1002,
)
[tokio/src/time/wheel/mod.rs:223] (self.elapsed, when) = (
    1002,
    1003,
)
[tokio/src/time/wheel/mod.rs:223] (self.elapsed, when) = (
    1003,
    1003,
)
[tokio/src/time/wheel/mod.rs:223] (self.elapsed, when) = (
    1003,
    4096,
)
[tokio/src/time/wheel/mod.rs:223] (self.elapsed, when) = (
    4096,
    4992,
)
[tokio/src/time/wheel/mod.rs:223] (self.elapsed, when) = (
    4992,
    5002,
)
[tokio/src/time/wheel/mod.rs:223] (self.elapsed, when) = (
    5002,
    18446744073709551615,
)
[tokio/src/time/wheel/mod.rs:223] (self.elapsed, when) = (
    18446744073709551615,
    1004,
)
thread 'tokio-runtime-worker' panicked at 'elapsed=18446744073709551615; when=1004', tokio/src/time/wheel/mod.rs:224:9
stack backtrace:
   0: rust_begin_unwind
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/std/src/panicking.rs:475
   1: std::panicking::begin_panic_fmt
             at /rustc/7e6d6e5f535321c2223f044caba16f97b825009c/library/std/src/panicking.rs:429
   2: tokio::time::wheel::Wheel<T>::set_elapsed
             at ./src/time/wheel/mod.rs:224
   3: tokio::time::wheel::Wheel<T>::poll
             at ./src/time/wheel/mod.rs:160
   4: tokio::time::driver::Driver<T>::process
             at ./src/time/driver/mod.rs:168
   5: <tokio::time::driver::Driver<T> as tokio::park::Park>::park
             at ./src/time/driver/mod.rs:275
   6: <tokio::park::either::Either<A,B> as tokio::park::Park>::park
             at ./src/park/either.rs:28
   7: tokio::runtime::park::Inner::park_driver
             at ./src/runtime/park.rs:205
   8: tokio::runtime::park::Inner::park
             at ./src/runtime/park.rs:137
   9: <tokio::runtime::park::Parker as tokio::park::Park>::park
             at ./src/runtime/park.rs:93
  10: tokio::runtime::thread_pool::worker::Core::shutdown
             at ./src/runtime/thread_pool/worker.rs:570
  11: tokio::runtime::thread_pool::worker::Shared::shutdown
             at ./src/runtime/thread_pool/worker.rs:794
  12: tokio::runtime::thread_pool::worker::Context::run
             at ./src/runtime/thread_pool/worker.rs:335
  13: tokio::runtime::thread_pool::worker::run::{{closure}}
             at ./src/runtime/thread_pool/worker.rs:305
  14: tokio::macros::scoped_tls::ScopedKey<T>::set
             at ./src/macros/scoped_tls.rs:63
  15: tokio::runtime::thread_pool::worker::run
             at ./src/runtime/thread_pool/worker.rs:302
  16: tokio::runtime::thread_pool::worker::Launch::launch::{{closure}}
             at ./src/runtime/thread_pool/worker.rs:281
  17: <tokio::runtime::blocking::task::BlockingTask<T> as core::future::future::Future>::poll
             at ./src/runtime/blocking/task.rs:41
  18: tokio::runtime::task::core::Core<T,S>::poll::{{closure}}
             at ./src/runtime/task/core.rs:173
  19: tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
             at ./src/loom/std/unsafe_cell.rs:14
  20: tokio::runtime::task::core::Core<T,S>::poll
             at ./src/runtime/task/core.rs:158
  21: tokio::runtime::task::harness::Harness<T,S>::poll::{{closure}}
             at ./src/runtime/task/harness.rs:107
  22: core::ops::function::FnOnce::call_once
             at /home/markus/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:233
  23: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /home/markus/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:318
  24: std::panicking::try::do_call
             at /home/markus/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:373
  25: __rust_try
  26: std::panicking::try
             at /home/markus/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:337
  27: std::panic::catch_unwind
             at /home/markus/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:394
  28: tokio::runtime::task::harness::Harness<T,S>::poll
             at ./src/runtime/task/harness.rs:89
  29: tokio::runtime::task::raw::poll
             at ./src/runtime/task/raw.rs:104
  30: tokio::runtime::task::raw::RawTask::poll
             at ./src/runtime/task/raw.rs:66
  31: tokio::runtime::task::Notified<S>::run
             at ./src/runtime/task/mod.rs:169
  32: tokio::runtime::blocking::pool::Inner::run
             at ./src/runtime/blocking/pool.rs:260
  33: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}::{{closure}}
             at ./src/runtime/blocking/pool.rs:240
  34: tokio::runtime::context::enter
             at ./src/runtime/context.rs:74
  35: tokio::runtime::handle::Handle::enter
             at ./src/runtime/handle.rs:34
  36: tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}
             at ./src/runtime/blocking/pool.rs:239
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
^[[5~

@Marwes
Copy link
Contributor Author

Marwes commented Sep 16, 2020

This doesn't happen if spawn_blocking is used instead of block_in_place so that is a workaround at least.

@blasrodri
Copy link
Contributor

This doesn't happen if spawn_blocking is used instead of block_in_place so that is a workaround at least.

Agree. This works fine:

#[cfg(test)]
#[tokio::test(threaded_scheduler)]
async fn hang_on_shutdown() {
    let (sync_tx, sync_rx) = std::sync::mpsc::channel::<()>();
    tokio::task::spawn_blocking(move || sync_rx.recv().ok());

    tokio::spawn(async {
        tokio::time::delay_for(std::time::Duration::from_secs(1)).await;
        drop(sync_tx);

    });

    tokio::time::delay_for(std::time::Duration::from_secs(2)).await;
}

@jeromegn
Copy link

I'm getting this specific panic too. Not on shutdown though. It happens for no apparent reason.

panic: elapsed=180943502; when=180943498

0   std                             0x55f3e059553c      std::sys_common::backtrace::__rust_end_short_backtrace (backtrace.rs:153)
1   <unknown>                       0x55f3e0597719      rust_begin_unwind (panicking.rs:475)
2   std                             0x55f3e05976cb      std::panicking::begin_panic_fmt (panicking.rs:429)
3   tokio                           0x55f3e048e457      tokio::time::wheel::Wheel<T>::poll
4   tokio                           0x55f3e048783e      tokio::time::driver::Driver<T>::process (mod.rs:164)
5   tokio                           0x55f3e0485a10      [inlined] tokio::time::driver::Driver<T>::park (mod.rs:271)
6   tokio                           0x55f3e0485a10      tokio::park::either::Either<T>::park (either.rs:28)
7   tokio                           0x55f3e048f6ab      [inlined] tokio::runtime::park::Inner::park_driver (park.rs:201)
8   tokio                           0x55f3e048f6ab      [inlined] tokio::runtime::park::Inner::park (park.rs:133)
9   tokio                           0x55f3e048f6ab      tokio::runtime::park::Parker::park (park.rs:93)
10  tokio                           0x55f3e04a1aa9      tokio::runtime::thread_pool::worker::Context::park_timeout (worker.rs:426)
11  tokio                           0x55f3e04a14af      [inlined] tokio::runtime::thread_pool::worker::Context::park (worker.rs:402)
12  tokio                           0x55f3e04a14af      tokio::runtime::thread_pool::worker::Context::run (worker.rs:334)
13  tokio                           0x55f3e0491e53      [inlined] tokio::runtime::thread_pool::worker::run::{{closure}} (worker.rs:309)
14  tokio                           0x55f3e0491e53      tokio::macros::scoped_tls::ScopedKey<T>::set (scoped_tls.rs:63)
15  tokio                           0x55f3e04a0be6      tokio::runtime::thread_pool::worker::run (worker.rs:306)
16  tokio                           0x55f3e049d96b      [inlined] tokio::runtime::thread_pool::worker::Launch::launch::{{closure}} (worker.rs:285)
17  tokio                           0x55f3e049d96b      [inlined] tokio::runtime::blocking::task::BlockingTask<T>::poll (task.rs:41)
18  tokio                           0x55f3e049d96b      [inlined] tokio::runtime::task::core::Core<T>::poll::{{closure}} (core.rs:173)
19  tokio                           0x55f3e049d96b      [inlined] tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut (unsafe_cell.rs:14)
20  tokio                           0x55f3e049d96b      tokio::runtime::task::core::Core<T>::poll (core.rs:158)
21  tokio                           0x55f3e04a6833      [inlined] tokio::runtime::task::harness::Harness<T>::poll::{{closure}} (harness.rs:107)
22  core                            0x55f3e04a6833      [inlined] core::ops::function::FnOnce::call_once (function.rs:227)
23  std                             0x55f3e04a6833      std::panic::AssertUnwindSafe<T>::call_once (panic.rs:308)
24  std                             0x55f3e0494a13      [inlined] std::panicking::try::do_call (panicking.rs:373)
25  std                             0x55f3e0494a13      [inlined] std::panicking::try (panicking.rs:337)
26  std                             0x55f3e0494a13      [inlined] std::panic::catch_unwind (panic.rs:379)
27  tokio                           0x55f3e0494a13      tokio::runtime::task::harness::Harness<T>::poll (harness.rs:89)
28  tokio                           0x55f3e0486d0e      [inlined] tokio::runtime::task::raw::RawTask::poll (raw.rs:66)
29  tokio                           0x55f3e0486d0e      [inlined] tokio::runtime::task::Notified<T>::run (mod.rs:169)
30  tokio                           0x55f3e0486d0e      tokio::runtime::blocking::pool::Inner::run (pool.rs:230)
31  tokio                           0x55f3e0492a1d      [inlined] tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}}::{{closure}} (pool.rs:210)
32  tokio                           0x55f3e0492a1d      tokio::runtime::context::enter (context.rs:72)
33  tokio                           0x55f3e0483196      [inlined] tokio::runtime::handle::Handle::enter (handle.rs:76)
34  tokio                           0x55f3e0483196      [inlined] tokio::runtime::blocking::pool::Spawner::spawn_thread::{{closure}} (pool.rs:209)
35  std                             0x55f3e0483196      std::sys_common::backtrace::__rust_begin_short_backtrace (backtrace.rs:137)
36  std                             0x55f3e0497e8b      [inlined] std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}} (mod.rs:458)
37  std                             0x55f3e0497e8b      [inlined] std::panic::AssertUnwindSafe<T>::call_once (panic.rs:308)
38  std                             0x55f3e0497e8b      [inlined] std::panicking::try::do_call (panicking.rs:373)
39  std                             0x55f3e0497e8b      [inlined] std::panicking::try (panicking.rs:337)
40  std                             0x55f3e0497e8b      [inlined] std::panic::catch_unwind (panic.rs:379)
41  std                             0x55f3e0497e8b      [inlined] std::thread::Builder::spawn_unchecked::{{closure}} (mod.rs:457)
42  core                            0x55f3e0497e8b      core::ops::function::FnOnce::call_once{{vtable.shim}} (function.rs:227)
43  alloc                           0x55f3e059be6a      [inlined] alloc::boxed::Box<T>::call_once (boxed.rs:1042)
44  alloc                           0x55f3e059be6a      [inlined] alloc::boxed::Box<T>::call_once (boxed.rs:1042)
45  std                             0x55f3e059be6a      std::sys::unix::thread::Thread::new::thread_start (thread.rs:87)
46  <unknown>                       0x7ff4686716db      start_thread
47  <unknown>                       0x7ff467d76a3f      __clone
48  <unknown>                       0x0                 <unknown>

We're not using any block_in_place, but we're using a few spawn_blocking, sometimes in combination with block_on like so:

tokio::task::spawn_blocking(move || {
  tokio::runtime::Handle::current().block_on(async {
    //...
  })
})

but I'm not sure this is the cause.

@lovasoa
Copy link

lovasoa commented Nov 23, 2020

Hello,
I just got a crash report from an user of dezoomify-rs who seems to be affected by this bug : lovasoa/dezoomify-rs#91

Here is the crash report generated by human-panic :

name = 'dezoomify-rs'
operating_system = 'unix:Debian'
crate_version = '2.6.1'
explanation = '''
Panic occurred in file '/home/runner/.cargo/registry/src/git.luolix.top-1ecc6299db9ec823/tokio-0.2.23/src/time/wheel/mod.rs' at line 223
'''
cause = 'elapsed=18446744073709551615; when=56977'
method = 'Panic'
backtrace = '''

   0: 0x56086d6bea77 - tokio::time::wheel::Wheel<T>::poll::h2116cfbd2fa0846e
   1: 0x56086d6c757a - tokio::time::driver::Driver<T>::process::h3a3a3fe415090c44
   2: 0x56086d6c7d4d - <tokio::time::driver::Driver<T> as tokio::park::Park>::park::hda1fea46742c1431
   3: 0x56086d6c1092 - <tokio::runtime::park::Parker as tokio::park::Park>::park::h4d2a30281af6a33f
   4: 0x56086d6b7e86 - tokio::runtime::thread_pool::worker::Context::run::h1cbcfea6abdf1996
   5: 0x56086d6c5dd3 - tokio::macros::scoped_tls::ScopedKey<T>::set::hef45728da1fd4508
   6: 0x56086d6b7b76 - tokio::runtime::thread_pool::worker::run::h4b98dc581b7ed45e
   7: 0x56086d2beca3 - tokio::runtime::task::core::Core<T,S>::poll::hd0da54221e1d7050
   8: 0x56086d2d37f2 - <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once::hd6d9269f37e02876
   9: 0x56086d2ae486 - tokio::runtime::task::harness::Harness<T,S>::poll::h1517c781cc3f9f2c
  10: 0x56086d6b5830 - tokio::runtime::blocking::pool::Inner::run::hfd468efbdbf0f354
  11: 0x56086d6bffa0 - tokio::runtime::context::enter::ha89c39b700754910
  12: 0x56086d6c9316 - std::sys_common::backtrace::__rust_begin_short_backtrace::h4d77fd4853978be2
  13: 0x56086d6c4b4b - core::ops::function::FnOnce::call_once{{vtable.shim}}::hf771a3de431ab82d
  14: 0x56086d85393a - <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h670c50864ac2cb92
                at /rustc/18bf6b4f01a6feaf7259ba7cdae58031af1b7b39/library/alloc/src/boxed.rs:1042
                 - <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once::h2511952749086d81
                at /rustc/18bf6b4f01a6feaf7259ba7cdae58031af1b7b39/library/alloc/src/boxed.rs:1042
                 - std::sys::unix::thread::Thread::new::thread_start::h5ad4ddffe24373a8
                at /rustc/18bf6b4f01a6feaf7259ba7cdae58031af1b7b39/library/std/src/sys/unix/thread.rs:87
  15: 0x7f31400e2fa3 - start_thread
  16: 0x7f313fff74cf - clone
  17:        0x0 - <unresolved>'''

It also seems to be happening during shutdown.

@Darksonn
Copy link
Contributor

I don't really have context on this issue, but we recently had #3080 merged which may or may not fix it. Can someone familiar with the issue test it again?

@blasrodri
Copy link
Contributor

blasrodri commented Nov 28, 2020

@Darksonn

Using: tokio = { version = "0.3.4", features = ["full"] }

The following hangs and panics sometimes with:

running 1 test
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] when = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] self.elapsed = 0
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] when = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] self.elapsed = 960
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] when = 962
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] self.elapsed = 962
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] when = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] self.elapsed = 1001
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] when = 1984
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] self.elapsed = 1984
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] when = 18446744073709551615
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] self.elapsed = 18446744073709551615
[/Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:214] when = 1001
thread 'tokio-runtime-worker' panicked at 'elapsed=18446744073709551615; when=1001', /Users/bruger/projects/foss/tokio/tokio/src/time/wheel/mod.rs:213:9
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
#[tokio::test(flavor = "multi_thread")]
async fn hang_on_shutdown() {
    let (sync_tx, sync_rx) = std::sync::mpsc::channel::<()>();
    tokio::spawn(async move {
        tokio::task::block_in_place(|| sync_rx.recv().ok());
    });

    tokio::spawn(async {
        tokio::time::sleep(std::time::Duration::from_secs(2)).await;
        drop(sync_tx);
    });
    tokio::time::sleep(std::time::Duration::from_secs(1)).await;
}

The following always completes:

#[tokio::test]
async fn hang_on_shutdown() {
    let (sync_tx, sync_rx) = std::sync::mpsc::channel::<()>();
    tokio::spawn(async move {
        tokio::task::block_in_place(|| sync_rx.recv().ok());
    });

    tokio::spawn(async {
        tokio::time::sleep(std::time::Duration::from_secs(2)).await;
        drop(sync_tx);
    });
    tokio::time::sleep(std::time::Duration::from_secs(1)).await;
}

@Darksonn
Copy link
Contributor

I don't think the PR has been released yet. Can you use a git dependency?

@blasrodri
Copy link
Contributor

I don't think the PR has been released yet. Can you use a git dependency?

Same issue using ae67851f11b7cc1f577de8ce21767ce3e2c7aff9

teor2345 added a commit to teor2345/zebra that referenced this issue Dec 4, 2020
See tokio-rs/tokio#2789 for details. We were seeing this panic during
normal operation, not just at shutdown.
teor2345 added a commit to ZcashFoundation/zebra that referenced this issue Dec 4, 2020
See tokio-rs/tokio#2789 for details. We were seeing this panic during
normal operation, not just at shutdown.
@teor2345
Copy link
Contributor

teor2345 commented Dec 6, 2020

I don't really have context on this issue, but we recently had #3080 merged which may or may not fix it. Can someone familiar with the issue test it again?

#3080 appears to make the panic much more frequent. On some machines, we see it every ~20 minutes using tokio 0.3.5 (with #3080), but every ~72 hours using tokio 0.3.4 (without #3080).

For more details, see ZcashFoundation/zebra#1452.

Unfortunately, the panic is right in the middle of some complex async code using tower services. So it's hard to track down the exact cause.

@carllerche
Copy link
Member

@bdonlan this still happens maybe you have some thoughts?

@bdonlan
Copy link
Contributor

bdonlan commented Dec 7, 2020

This panic indicates something is trying to roll back the last-evaluated time of the timer wheel. It looks like the expiration deadline is being set to a crazy high value somehow, and... then we're not rejecting this in the poll() function?

I'll take a more detailed look tomorrow.

@carllerche
Copy link
Member

@teor2345 I believe your case is actually different than this one. This issue is a shutdown issue.

@carllerche
Copy link
Member

@teor2345 could you create a new issue w/ info containing features you have enabled and backtrace at least.

bdonlan pushed a commit to bdonlan/tokio that referenced this issue Dec 7, 2020
All core threads share the same park, and invoke park() followed by shutdown -
in parallel.  This is quite possibly a bug in the worker (in that it invokes
shutdown when it's not done with the parker), but for now work around it by
bypassing the timer parker once shutdown is called.

Fixes: tokio-rs#2789
bdonlan pushed a commit to bdonlan/tokio that referenced this issue Dec 7, 2020
Previously, the runtime shutdown logic would first hand control over all cores
to a single thread, which would sequentially shut down all tasks on the core
and then wait for them to complete.

This could deadlock when one task is waiting for a later core's task to
complete. For example, in the newly added test, we have a `block_in_place` task
that is waiting for another task to be dropped. If the latter task adds its
core to the shutdown list later than the former, we end up waiting forever for
the `block_in_place` task to complete.

Additionally, there also was a bug wherein we'd attempt to park on the parker
after shutting it down that was fixed as part of the refactors above.

This change restructures the code to bring all tasks to a halt (and do any
parking needed) before we collapse to a single thread to avoid this deadlock.

There was also an issue in which cancelled tasks would not unpark the
originating thread, due to what appears to be some sort of optimization gone
wrong. This has been fixed to be much more conservative in selecting when not
to unpark the source thread (this may be too conservative; please take a look
at the changes to `release()`).

Fixes: tokio-rs#2789
bdonlan pushed a commit to bdonlan/tokio that referenced this issue Dec 7, 2020
Previously, the runtime shutdown logic would first hand control over all cores
to a single thread, which would sequentially shut down all tasks on the core
and then wait for them to complete.

This could deadlock when one task is waiting for a later core's task to
complete. For example, in the newly added test, we have a `block_in_place` task
that is waiting for another task to be dropped. If the latter task adds its
core to the shutdown list later than the former, we end up waiting forever for
the `block_in_place` task to complete.

Additionally, there also was a bug wherein we'd attempt to park on the parker
after shutting it down that was fixed as part of the refactors above.

This change restructures the code to bring all tasks to a halt (and do any
parking needed) before we collapse to a single thread to avoid this deadlock.

There was also an issue in which cancelled tasks would not unpark the
originating thread, due to what appears to be some sort of optimization gone
wrong. This has been fixed to be much more conservative in selecting when not
to unpark the source thread (this may be too conservative; please take a look
at the changes to `release()`).

Fixes: tokio-rs#2789
carllerche pushed a commit that referenced this issue Dec 8, 2020
Previously, the runtime shutdown logic would first-hand control over all cores
to a single thread, which would sequentially shut down all tasks on the core
and then wait for them to complete.

This could deadlock when one task is waiting for a later core's task to
complete. For example, in the newly added test, we have a `block_in_place` task
that is waiting for another task to be dropped. If the latter task adds its
core to the shutdown list later than the former, we end up waiting forever for
the `block_in_place` task to complete.

Additionally, there also was a bug wherein we'd attempt to park on the parker
after shutting it down which was fixed as part of the refactors above.

This change restructures the code to bring all tasks to a halt (and do any
parking needed) before we collapse to a single thread to avoid this deadlock.

There was also an issue in which canceled tasks would not unpark the
originating thread, due to what appears to be some sort of optimization gone
wrong. This has been fixed to be much more conservative in selecting when not
to unpark the source thread (this may be too conservative; please take a look
at the changes to `release()`).

Fixes: #2789
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-tokio Area: The main tokio crate C-bug Category: This is a bug. M-time Module: tokio/time
Projects
None yet
8 participants