
Investigate "no callback found" #1763

Closed
RalfJung opened this issue Apr 8, 2021 · 10 comments · Fixed by #1771
Comments

@RalfJung
Member

RalfJung commented Apr 8, 2021

#1761 somehow managed to trigger an ICE in Miri:

normalized stderr:
thread 'rustc' panicked at 'no callback found', src/thread.rs:714:55
stack backtrace:
   0: _rust_begin_unwind
   1: core::panicking::panic_fmt
   2: core::option::expect_failed
   3: core::option::Option<T>::expect
             at /Users/runner/.rustup/toolchains/master/lib/rustlib/src/rust/library/core/src/option.rs:349:21
   4: miri::thread::EvalContextExt::run_timeout_callback
             at ./src/thread.rs:714:13
   5: miri::eval::eval_main::{{closure}}
             at ./src/eval.rs:230:21
   6: miri::eval::eval_main
             at ./src/eval.rs:218:38
   7: <miri::MiriCompilerCalls as rustc_driver::Callbacks>::after_analysis::{{closure}}
             at ./src/bin/miri.rs:53:40
   8: rustc_interface::passes::QueryContext::enter::{{closure}}
             at /Users/runner/.rustup/toolchains/master/lib/rustlib/src/rust/compiler/rustc_interface/src/passes.rs:755:42
   9: rustc_middle::ty::context::tls::enter_context::{{closure}}
             at /Users/runner/.rustup/toolchains/master/lib/rustlib/src/rust/compiler/rustc_middle/src/ty/context.rs:1726:50
  10: rustc_middle::ty::context::tls::set_tlv
             at /Users/runner/.rustup/toolchains/master/lib/rustlib/src/rust/compiler/rustc_middle/src/ty/context.rs:1710:9
  11: rustc_middle::ty::context::tls::enter_context
             at /Users/runner/.rustup/toolchains/master/lib/rustlib/src/rust/compiler/rustc_middle/src/ty/context.rs:1726:9
  12: rustc_interface::passes::QueryContext::enter
             at /Users/runner/.rustup/toolchains/master/lib/rustlib/src/rust/compiler/rustc_interface/src/passes.rs:755:9
  13: <miri::MiriCompilerCalls as rustc_driver::Callbacks>::after_analysis
             at ./src/bin/miri.rs:35:9
  14: rustc_interface::queries::<impl rustc_interface::interface::Compiler>::enter
  15: rustc_span::with_source_map
  16: rustc_interface::interface::create_compiler_and_run
  17: scoped_tls::ScopedKey<T>::set
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.

error: internal compiler error: unexpected panic

note: the compiler unexpectedly panicked. this is a bug.

note: we would appreciate a bug report: https://github.com/rust-lang/rust/issues/new?labels=C-bug%2C+I-ICE%2C+T-compiler&template=ice.md

note: rustc 1.53.0-nightly (9b0edb7fd 2021-03-27) running on x86_64-apple-darwin

note: compiler flags: -Z miri-disable-isolation -C prefer-dynamic

query stack during panic:
end of query stack



The actual stderr differed from the expected stderr.
Actual stderr saved to /var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/compiletestOPSWb6/concurrency/libc_pthread_cond.stage-id.stderr
To update references, rerun the tests and pass the `--bless` flag
To only update this specific test, also pass `--test-args concurrency/libc_pthread_cond.rs`

error: 1 errors occurred comparing output.
status: exit status: 101
command: "/Users/runner/work/miri/miri/./target/x86_64-apple-darwin/release/miri" "tests/run-pass/concurrency/libc_pthread_cond.rs" "-L" "/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/compiletestOPSWb6" "--target=mips64-unknown-linux-gnuabi64" "--error-format" "json" "-C" "prefer-dynamic" "-o" "/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/compiletestOPSWb6/concurrency/libc_pthread_cond.stage-id" "-A" "unused" "--edition" "2018" "-Dwarnings" "-Dunused" "--sysroot" "/Users/runner/Library/Caches/org.rust-lang.miri" "-Zmiri-disable-isolation" "-L" "/var/folders/24/8k48jl6d249_n_qfxwsl6xvm0000gn/T/compiletestOPSWb6/concurrency/libc_pthread_cond.stage-id.aux"
stdout:
------------------------------------------

------------------------------------------
stderr:
------------------------------------------
(identical to the normalized stderr shown above)
------------------------------------------
test [ui] run-pass/concurrency/libc_pthread_cond.rs ... FAILED

The panic ultimately triggered here:

this.machine.threads.get_ready_callback().expect("no callback found");

I have not seen this before, so it looks like this is unlikely but possible. It would be good to at least come up with a plausible theory for the chain of events that leads here. Cc @vakaras

@vakaras
Contributor

vakaras commented Apr 8, 2021

The only idea that comes to my mind is that the host's clock was shifted back at just the right moment…

@RalfJung
Member Author

For the system clock that would actually be "legal".

@RalfJung
Member Author

But yeah, I think the only way this can happen is if schedule sees some timeout with a wait duration of 0, but then get_ready_callback doesn't find any.

(Btw, do we compute the wait time on each iteration through the main interpreter loop? That seems like a lot, TBH.)

@vakaras
Contributor

vakaras commented Apr 10, 2021

For the system clock that would actually be "legal".

I guess in that case instead of unwrapping we should go back to the scheduler without doing anything:

let (thread, callback) = if let Some((thread, callback)) = this.machine.threads.get_ready_callback() {
    (thread, callback)
} else {
    // get_ready_callback can return None if the computer's clock was
    // shifted after calling the scheduler and before the call
    // to get_ready_callback. In this case, just do nothing, which
    // effectively just returns to the scheduler.
    return Ok(());
};

Btw, do we compute the wait time on each iteration through the main interpreter loop?

Yes.

@RalfJung
Member Author

I guess in that case instead of unwrapping we should go back to the scheduler without doing anything:

Yeah, that sounds reasonable.

Yes.

Hm, now I wonder if it's a problem (in particular perf-wise) to keep hitting the clock API so rapidly. At least we only do it when there are timeouts waiting to happen, but then we do it once for each timeout.

@vakaras
Contributor

vakaras commented Apr 10, 2021

Hm, now I wonder if it's a problem (in particular perf-wise) to keep hitting the clock API so rapidly. At least we only do it when there are timeouts waiting to happen, but then we do it once for each timeout.

I guess the scheduler could, in general, be called every 10th step or so. I remember that at some point there was an idea to change the scheduler to pick a different thread after some number of steps. However, that was probably never implemented, right?

@RalfJung
Member Author

One naive approach would be to move the "can we continue running the current thread" check above the timeout check -- but I think the comment there explains that we cannot do that, even though I do not fully understand it.

@vakaras
Contributor

vakaras commented Apr 10, 2021

I think we could move the "continue the current thread" check above the timeout check if we changed the implementation of condition variables to immediately return ETIMEDOUT when the wait was scheduled to time out in the past. (This is what that comment means, as far as I remember.)

@RalfJung
Member Author

Wouldn't that effect also be achieved by making pthread_cond_timedwait implicitly yield?

@vakaras
Contributor

vakaras commented Apr 11, 2021

I think that this should also work.

bors added a commit that referenced this issue Apr 11, 2021
Attempt to fix #1763

As discussed on issue #1763, just try asking the scheduler to try again.

Fixes #1763
@bors bors closed this as completed in 72ca2a7 Apr 11, 2021