Spurious miri library rwlock test failure? #133421

Closed
jieyouxu opened this issue Nov 24, 2024 · 6 comments · Fixed by #133435
Labels
A-testsuite Area: The testsuite used to check the correctness of rustc C-bug Category: This is a bug. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Comments

@jieyouxu
Member

jieyouxu commented Nov 24, 2024

Failed in #133068 (comment) on the x86_64-gnu-aux job.

error: Undefined Behavior: trying to retag from <54379001> for SharedReadWrite permission at alloc19481375[0x10], but that tag does not exist in the borrow stack for this location
   --> /checkout/library/core/src/ptr/non_null.rs:375:18
    |
375 |         unsafe { &*self.as_ptr().cast_const() }
    |                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    |                  |
    |                  trying to retag from <54379001> for SharedReadWrite permission at alloc19481375[0x10], but that tag does not exist in the borrow stack for this location
    |                  this error occurs as part of retag at alloc19481375[0x0..0x30]
    |
    = help: this indicates a potential bug in the program: it performed an invalid operation, but the Stacked Borrows rules it violated are still experimental
    = help: see https://github.com/rust-lang/unsafe-code-guidelines/blob/master/wip/stacked-borrows.md for further information
help: <54379001> was created by a SharedReadWrite retag at offsets [0x0..0x29]
   --> /checkout/library/core/src/ptr/mod.rs:799:5
    |
799 |     r
    |     ^
help: <54379001> was later invalidated at offsets [0x18..0x20] by a write access
   --> std/src/sys/sync/rwlock/queue.rs:396:13
    |
396 |             node.prev = AtomicLink::new(None);
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    = note: BACKTRACE (of the first span) on thread `unnamed-995`:
    = note: inside `core::ptr::NonNull::<sys::sync::rwlock::queue::Node>::as_ref::<'_>` at /checkout/library/core/src/ptr/non_null.rs:375:18: 375:46
note: inside `sys::sync::rwlock::queue::find_tail_and_add_backlinks`
   --> std/src/sys/sync/rwlock/queue.rs:279:18
    |
279 |             next.as_ref().prev.set(Some(current));
    |                  ^^^^^^^^
note: inside `sys::sync::rwlock::queue::RwLock::unlock_queue`
   --> std/src/sys/sync/rwlock/queue.rs:646:33
    |
646 |             let tail = unsafe { find_tail_and_add_backlinks(to_node(state)) };
    |                                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside `sys::sync::rwlock::queue::RwLock::lock_contended`
   --> std/src/sys/sync/rwlock/queue.rs:437:21
    |
437 |                     self.unlock_queue(next);
    |                     ^^^^^^^^^^^^^^^^^^^^^^^
note: inside `sys::sync::rwlock::queue::RwLock::write`
   --> std/src/sys/sync/rwlock/queue.rs:356:13
    |
356 |             self.lock_contended(true)
    |             ^^^^^^^^^^^^^^^^^^^^^^^^^
note: inside `sync::rwlock::RwLock::<usize>::write`
   --> std/src/sync/rwlock.rs:362:13
    |
362 |             self.inner.write();
    |             ^^^^^^^^^^^^^^^^^^
note: inside closure
   --> std/src/sync/rwlock/tests.rs:533:46
    |
533 |                     let mut write_guard = rw.write().unwrap();
    |                                              ^^^^^^^

note: some details are omitted, run with `MIRIFLAGS=-Zmiri-backtrace=full` for a verbose backtrace

error: aborting due to 1 previous error

error: test failed, to rerun pass `-p std --lib`

Caused by:
  process didn't exit successfully: `/checkout/obj/build/x86_64-unknown-linux-gnu/stage0-tools-bin/cargo-miri runner /checkout/obj/build/x86_64-unknown-linux-gnu/stage1-std/miri/aarch64-apple-darwin/debug/deps/std-fa3c0a26ad940e73 'time::' 'sync::' 'thread::' 'env::' -Z unstable-options --format json` (exit status: 1)
note: test exited abnormally; to see the full output pass --nocapture to the harness.

cc @RalfJung do you have any idea what could be triggering this? Maybe similar to #133200?


See related failures (label CI-ABA-ptr-provenance-lockless-queue-fail, "CI spurious failure: related to #121950"): https://github.com/rust-lang/rust/pulls?q=is%3Apr+label%3ACI-ABA-ptr-provenance-lockless-queue-fail+

@jieyouxu jieyouxu added A-testsuite Area: The testsuite used to check the correctness of rustc C-bug Category: This is a bug. labels Nov 24, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Nov 24, 2024
@jieyouxu
Member Author

Judging from test failure location, probably due to #121950?

@jieyouxu jieyouxu added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Nov 24, 2024
@RalfJung
Member

This does sound like another instance of #121950, but it's odd that it would now happen more often.

@jieyouxu
Member Author

jieyouxu commented Nov 24, 2024

AFAIK these spurious miri test failures from rwlock tests have only started occurring more frequently recently; I have never seen this failure on older PRs.

@RalfJung
Member

The original #121950 occurred around once every 1000 executions, which is about once every 3 months on a bors run (assuming a bors run every 2h).
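As a rough check of that estimate, here is a minimal sketch of the arithmetic. The function name is hypothetical, and the inputs are just the figures quoted above (one failure per roughly 1000 executions, one bors run every 2h).

```rust
/// Expected number of days between failures, assuming a failure shows up
/// about once every `runs_per_failure` runs and a run starts every
/// `hours_per_run` hours. (Hypothetical helper, for illustration only.)
fn days_between_failures(runs_per_failure: f64, hours_per_run: f64) -> f64 {
    // Total hours until the expected failure, converted to days.
    runs_per_failure * hours_per_run / 24.0
}
```

Calling `days_between_failures(1000.0, 2.0)` gives about 83 days, i.e. a bit under 3 months, which matches the "once every 3 months" figure.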

We could try to do some statistics on these newly failing tests to see how often they fail (but I don't have the time for that right now).

@jieyouxu jieyouxu removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Nov 24, 2024
@jieyouxu
Member Author

It doesn't happen super often (completely dwarfed by the other msvc-related failures we have 😆), so I'm inclined to just leave this as a known issue in case someone else runs into it.

@jieyouxu jieyouxu changed the title Spurious miri test failure? Spurious miri library rwlock test failure? Nov 24, 2024
@RalfJung
Member

RalfJung commented Nov 24, 2024

For the test_downgrade_atomic test, failure rate seems to be around 2/2048. I guess we got "lucky" hitting it so quickly after it got added.

The failure above is in test_downgrade_observe. That test takes quite a while to run in Miri so it's hard to get a good failure rate. It seems to occur around 4 times out of 64, so a lot more often than the others.
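To put the two observed counts side by side, here is a small sketch that turns them into per-run failure rates and estimates the chance of seeing at least one failure in a batch of runs. The helper names are hypothetical, the runs are assumed to be independent, and with samples this small the numbers are only rough.

```rust
/// Observed failure rate as a fraction of runs (hypothetical helper).
fn failure_rate(failures: u32, runs: u32) -> f64 {
    f64::from(failures) / f64::from(runs)
}

/// Probability of at least one failure in `n` independent runs,
/// given a per-run failure probability `p`: 1 - (1 - p)^n.
fn prob_at_least_one_failure(p: f64, n: i32) -> f64 {
    1.0 - (1.0 - p).powi(n)
}
```

With the counts above, `failure_rate(2, 2048)` is just under 0.1% for test_downgrade_atomic and `failure_rate(4, 64)` is 6.25% for test_downgrade_observe, so the latter fails roughly 64 times as often. Passing the 6.25% rate to `prob_at_least_one_failure` shows how quickly a handful of CI runs is likely to hit it.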

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this issue Nov 25, 2024
…=tgross35

miri: disable test_downgrade_observe test on macOS

Due to rust-lang#121950, this test can fail on Miri. The test is also quite slow on Miri (taking more than 30s) due to the high iteration count (a total of 2000), so let's reduce that a little.

Fixes rust-lang#133421
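
For readers unfamiliar with how a test gets skipped or shortened under Miri, here is a minimal sketch of the general pattern, not the actual change in the PR. The test name, the writer count, and the iteration counts are all hypothetical; `cfg(miri)` is the configuration flag Miri sets when it runs the code.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

// Hypothetical number of writer threads for this sketch.
const WRITERS: usize = 4;

/// Hypothetical iteration counts: far fewer under Miri, where each
/// iteration is much slower than in a native run.
fn iterations_per_writer() -> usize {
    if cfg!(miri) { 50 } else { 500 }
}

/// Spawns the writers, lets each increment the shared counter, and
/// returns the final value.
fn run_writers() -> usize {
    let rw = Arc::new(RwLock::new(0usize));
    let handles: Vec<_> = (0..WRITERS)
        .map(|_| {
            let rw = Arc::clone(&rw);
            thread::spawn(move || {
                for _ in 0..iterations_per_writer() {
                    *rw.write().unwrap() += 1;
                }
            })
        })
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
    let total = *rw.read().unwrap();
    total
}

// Ignored only when running under Miri on macOS; runs everywhere else.
#[test]
#[cfg_attr(all(miri, target_os = "macos"), ignore)]
fn contended_writes_sketch() {
    assert_eq!(run_writers(), WRITERS * iterations_per_writer());
}
```

The `cfg_attr` form keeps the test compiled on every target and only marks it as ignored in the one configuration where the known issue shows up, so it still appears in the test listing there.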
jhpratt added a commit to jhpratt/rust that referenced this issue Nov 26, 2024
@bors bors closed this as completed in c4e2b0c Nov 27, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this issue Nov 27, 2024
Rollup merge of rust-lang#133435 - RalfJung:test_downgrade_observe, r=tgross35
