Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Detect when pthread_mutex_t is moved #3784

Merged
merged 1 commit into from
Sep 5, 2024

Conversation

Mandragorian
Copy link
Contributor

@Mandragorian Mandragorian commented Aug 2, 2024

What I am not sure about this PR is how to support storing the additional mutex data like its address and kind. If I understand correctly the concurrency::sync::Mutex struct is to be used by any mutex implementation. This possibly means that different implementation might want to store different data in the mutex. So any additional data should be implementation defined somehow. Solutions that come to mind:

  • Store the additional data as Box<dyn Any> and the implementations can downcast their data when they fetch them.
  • Have each shim implementation define a static mut map between MutexIDs and the additional data.

Let me know

Fixes #3749

@Mandragorian Mandragorian marked this pull request as draft August 2, 2024 20:35
@Mandragorian Mandragorian force-pushed the detect_moved_mutexes branch 2 times, most recently from 9ec55eb to 7b044ba Compare August 2, 2024 20:58
@Mandragorian Mandragorian marked this pull request as ready for review August 2, 2024 21:23
@Mandragorian
Copy link
Contributor Author

@rustbot ready

@rustbot rustbot added the S-waiting-on-review Status: Waiting for a review to complete label Aug 7, 2024
@Mandragorian
Copy link
Contributor Author

Hi!

Just wanted to quickly check if this fell through the cracks. (But no pressure if you are busy!). Thanks!

@RalfJung
Copy link
Member

RalfJung commented Aug 11, 2024 via email

Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for giving this a shot! I like the AdditionalMutexData part, but the way it gets used in unix/sync should be changed a bit -- see the comments in the review.

@@ -214,6 +246,8 @@ pub(super) trait EvalContextExtPriv<'tcx>: crate::MiriInterpCxExt<'tcx> {
} else {
let new_index = this.machine.sync.mutexes.push(Default::default());
assert_eq!(next_index, new_index);
let data = initialize_data()?;
this.machine.sync.mutexes[new_index].data = data;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of initializing with default values and then changing them, please immediately push the right values.

src/shims/unix/sync.rs Show resolved Hide resolved
pub kind: MutexKind,

/// The address of the mutex.
pub address: Pointer,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we really only care about the address, this should be a u64. Currently you are also storing provenance but we don't actually care about that.

@@ -118,13 +118,29 @@ fn mutex_get_id<'tcx>(
ecx: &mut MiriInterpCx<'tcx>,
mutex_op: &OpTy<'tcx>,
) -> InterpResult<'tcx, MutexId> {
let address = ecx.deref_pointer_as(mutex_op, ecx.libc_ty_layout("pthread_mutex_t"))?.ptr();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
let address = ecx.deref_pointer_as(mutex_op, ecx.libc_ty_layout("pthread_mutex_t"))?.ptr();
// FIXME: might be worth changing mutex_get_or_create_id to take the mplace
// rather than the `OpTy`.
let address = ecx.deref_pointer_as(mutex_op, ecx.libc_ty_layout("pthread_mutex_t"))?.ptr();

@bors
Copy link
Collaborator

bors commented Aug 17, 2024

☔ The latest upstream changes (presumably #3823) made this pull request unmergeable. Please resolve the merge conflicts.

@RalfJung
Copy link
Member

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: Waiting for the PR author to address review comments and removed S-waiting-on-review Status: Waiting for a review to complete labels Aug 18, 2024
@Mandragorian Mandragorian force-pushed the detect_moved_mutexes branch 2 times, most recently from 4e17711 to cd25fd2 Compare August 30, 2024 01:02
@RalfJung
Copy link
Member

(please write @rustbot ready when this is ready for review again)

@Mandragorian
Copy link
Contributor Author

Yeah, I wanted to post a comment first with a few things I am not sure are correct. I realized it was late after I pushed the changes so I decided to do that today. sorry if it caused confusion. Here is what I wanted to say:

To check for the static initializers I have to ignore a "header" in the pthread_mutex_t where miri stores the MutexId. The way the code works now, initialization of the additional data happens lazily, after the compare exchange operation. So at that point the mutex id is stored, and doing a bitwise comparison with the static initializers will fail. I could call the closure before checking for initialization in get_or_create_id, but then this would happen every time a mutex is locked/unlocked which could be slow. So i decided to ignore 4 + mutex_id_offset bytes from the contents of a mutex when checking if it is equal to a static initializer.

This also solves another issue. When i was reading the full contents of the mutex, miri thought there was a race condition. I did not spend too much time debugging this. But I believe it is because when trying to lock, for example, a mutex that was initialized with a static initializer, in thread A, all the bytes of pthread_mutex_t (including the id) would be read. Then thread B would try to also lock the mutex, and call get_or_create_id. The previous unsynchronized read then clashed with the atomic read in compare_exchange. I couldn't find the proper way to fix this. I think I might need to somehow use acquire_clock but I couldn't exactly understand how it works. Since I am now ignoring the mutex id part, I think this sidesteps the issue.

Finally, instead of adding a forece_create boolean argument, I opted for defining a new function that creates the mutex eagerly. I thought that adding more logic in get_or_create_id might make it too complex. But I a have no strong feelings about this. The current way also removes the need for mutex_reset_id. Otherwise, I think we need to call it to initialize the mutex id memory, otherwise miri complains that there was a read of uninit data, I am assuming at compare_exchange.

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Waiting for a review to complete and removed S-waiting-on-author Status: Waiting for the PR author to address review comments labels Aug 30, 2024
@RalfJung
Copy link
Member

To check for the static initializers I have to ignore a "header" in the pthread_mutex_t where miri stores the MutexId. The way the code works now, initialization of the additional data happens lazily, after the compare exchange operation. So at that point the mutex id is stored, and doing a bitwise comparison with the static initializers will fail. I could call the closure before checking for initialization in get_or_create_id, but then this would happen every time a mutex is locked/unlocked which could be slow. So i decided to ignore 4 + mutex_id_offset bytes from the contents of a mutex when checking if it is equal to a static initializer.

Ah yes, this has to be done quite carefully to avoid races.

I'd say for now just scratch this sanity check. Checking just some of the bytes is a bit too ugly IMO, after all we don't have to catch these kinds of bugs anyway. So let's just use the "zero ID" as indicator that this must be initialized and not do a separate check for now; we can explore how to add such a check separately in the future.

Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a separate create_id leads to some duplication in the API surface -- we'll need this for all types that have static and explicit initializers. But on the other hand, having the "explicit initialization" separate from "initialize from static initializer" is also nice, so yeah let's go with that.

src/concurrency/sync.rs Show resolved Hide resolved
src/shims/unix/sync.rs Show resolved Hide resolved
}

fn mutex_set_kind<'tcx>(
/// Checks if a mutex has been moved since its first use and returns an error if it has.
fn mutex_check_moved<'tcx>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only called once, please inline it at the call site.

src/shims/unix/sync.rs Outdated Show resolved Hide resolved
)
/// Returns the kind of a mutex, based on its contents.
/// The contents must not contain the initial "header" where the mutex id is stored.
fn kind_from_contents<'tcx>(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kind_from_static_initializer or so would be better IMO, making it clear that we're parsing target-specific constants here but never our own data.

@@ -90,79 +90,120 @@ fn mutex_id_offset<'tcx>(ecx: &MiriInterpCx<'tcx>) -> InterpResult<'tcx, u64> {
Ok(offset)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a FIXME to the sanity check logic above that we should also check the other static initializers for platforms where they exist?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still needed after you checked that there are no other other initializers?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah it needs to be checked for Linux.

ecx.eval_path(&["libc", "PTHREAD_MUTEX_INITIALIZER"]).assert_mem_place().ptr();
if contents == get_contents(ecx, default_initializer)? {
return Ok(MutexKind::new(ecx.eval_libc_i32("PTHREAD_MUTEX_DEFAULT")));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still makes the assumption that the platform-specific field that encodes the mutex kind in the static initializer is after the "header". So it's not enough to be sure we are matching platform behavior, and it's not clean enough IMO to keep it as a half-measure.

I also just realized that this function is also called after the new ID has been written, so a full comparison with pthread_mutex_t would anyway not work.

So I think the best option is to have a match &*ecx.tcx.sess.target.os inside this function and then determine for each platform what we have to read to find the initial mutex kind -- basically exactly what mutex_get_kind did before this PR, but only doing it here when we "parse" a static initializer, rather than doing it each time the mutex is used. On Linux this would read 4 bytes at offset if ecx.pointer_size().bytes() == 8 { 16 } else { 12 }. For the other supported targets we'll have to check whether they have any static initializers other than PTHREAD_MUTEX_DEFAULT. And if any platform adds such initializers in the future, Miri will just ignore them until we realize that... maybe one day we can do the comparison with PTHREAD_MUTEX_DEFAULT but it will require reworking get_or_create_id a bit more so that create_obj is called before the new ID is written.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked and none of the other targets has any other static initializers.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done. I am not sure I interpreted all of your comment correctly, so let me know.

@@ -573,8 +623,6 @@ pub trait EvalContextExt<'tcx>: crate::MiriInterpCxExt<'tcx> {
throw_ub_format!("destroyed a locked mutex");
}

// Destroying an uninit pthread_mutex is UB, so check to make sure it's not uninit.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please preserve this comment, it is still relevant.

Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rustbot author

ecx.eval_path(&["libc", "PTHREAD_MUTEX_INITIALIZER"]).assert_mem_place().ptr();
if contents == get_contents(ecx, default_initializer)? {
return Ok(MutexKind::new(ecx.eval_libc_i32("PTHREAD_MUTEX_DEFAULT")));
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked and none of the other targets has any other static initializers.

@rustbot rustbot added S-waiting-on-author Status: Waiting for the PR author to address review comments and removed S-waiting-on-review Status: Waiting for a review to complete labels Aug 31, 2024
@Mandragorian
Copy link
Contributor Author

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Waiting for a review to complete and removed S-waiting-on-author Status: Waiting for the PR author to address review comments labels Sep 2, 2024
ecx.eval_path(&["libc", "PTHREAD_MUTEX_INITIALIZER"]).assert_mem_place().ptr();
if contents == get_contents(ecx, default_initializer)? {
// Only linux has static initializers other than PTHREAD_MUTEX_DEFAULT.
if &*ecx.tcx.sess.target.os != "linux" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make this an exhaustive match, like the old mutex_get_kind. When we add a new target we need to check that this is still true.

ecx.machine.layouts.i32,
)?
.to_i32()?;
return Ok(MutexKind::new(kind));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check that kind is a valid mutex kind.

In fact, probably it would make sense to make this an enum? Then all the other mutex functions won't have to compare this with libc constants any more...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re making this an enum:

I don't disagree, but I am a bit unsure on how to do this. In my mind ideally the MutexKind would be a type known only to the pthreads implementation, but currently it lives with the generic mutex implementation. At least by making this an i32 it would allow other mutex APIs other than pthreads to possibly reuse it. However if we make it an enum it might be too coupled with pthreads terminology.

As the code is now, we can't make the AdditionalMutexData generic so that each mutex implementation can define its own type to hold whatever data they want.

On the other hand this might be abstracting way to soon since there is no other implementation that needs additional data. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As the code is now, we can't make the AdditionalMutexData generic so that each mutex implementation can define its own type to hold whatever data they want.

We could use dyn Any trickery.

But IMO it's not worth it. We can just say the enum lists all mutex kinds of all implementations. 🤷 You can mark the enum #[non_exhaustive] to drive home that point, and force all matches to end with _ => panic!("pthread mutex with invalid mutex kind") or so.

@Mandragorian
Copy link
Contributor Author

@rustbot ready

@@ -69,16 +69,13 @@ declare_id!(MutexId);
/// The mutex kind.
/// The meaning of this type is defined by concrete mutex implementations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That line of the comment seems outdated now.

@@ -0,0 +1,72 @@
//@ignore-target-windows: No pthreads on Windows
//@[unlock_register]only-target-linux: recursive initializers are non-standard.
//@revisions: lock trylock unlock_register unlock_detect init
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please declare the revisions before using them.

Also, I am confused by the names. lock really is static_initializer, right? init is initializer_function. And why is unlock_register only for Linux? Recursive mutexes exist on all platforms, they just don't have a static initializer on non-Linux.

Maybe it just needs comments explaining what these variants are testing, but currently I am quite confused. I expected only 2 tests -- one that uses the static initializer and one that uses pthread_mutex_init.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the tests:

In my mind what needs to be tested is the ability of the pthread_mutex_* functions to record and then later detect a moved pthread_mutex_t. So for each such function I call it two times. If the move is caught it means that the function both recorded the original address and detected the move. At least that's the intention.

For the unlock function however I needed a slight different approach. Static initializers are only supported on linux, and unlock can only be called on recursive mutexes if they are unlocked. So I used two different revisions, one to test that unlock correctly registers the address (with a static initializer), and one to test that it detects a move, when called after init.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks.

However, given our implementation, we know this is all the same codepath, so it doesn't seem worth testing all combinations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, removed the extra tests.

@Mandragorian
Copy link
Contributor Author

@rustbot ready

Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! LGTM, just two comment nits. Then please squash the commits.

tests/fail-dep/concurrency/libc_pthread_mutex_move.rs Outdated Show resolved Hide resolved
tests/fail-dep/concurrency/libc_pthread_mutex_move.rs Outdated Show resolved Hide resolved
@Mandragorian
Copy link
Contributor Author

Thanks! LGTM, just two comment nits. Then please squash the commits.

done thanks!

@RalfJung
Copy link
Member

RalfJung commented Sep 5, 2024

Thanks. :-)

@bors r+

@bors
Copy link
Collaborator

bors commented Sep 5, 2024

📌 Commit aba9bb2 has been approved by RalfJung

It is now in the queue for this repository.

@bors
Copy link
Collaborator

bors commented Sep 5, 2024

⌛ Testing commit aba9bb2 with merge 7b422fe...

@RalfJung
Copy link
Member

RalfJung commented Sep 5, 2024

I only just noticed that you even suggested this initially. :D

Store the additional data as Box<dyn Any> and the implementations can downcast their data when they fetch them.

If you want to do this in a follow-up PR, let me know. :)

@bors
Copy link
Collaborator

bors commented Sep 5, 2024

☀️ Test successful - checks-actions
Approved by: RalfJung
Pushing 7b422fe to master...

@bors bors merged commit 7b422fe into rust-lang:master Sep 5, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Status: Waiting for a review to complete
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pthreads synchronization primitives: detect mutex/rwlock/... being moved to a different location
4 participants