clarify how Rust atomics correspond to C++ atomics #97516

Merged: 2 commits, Jun 22, 2022

Conversation

RalfJung
Member

@cbeuw noted in rust-lang/miri#1963 that the correspondence between C++ atomics and Rust atomics is not quite as obvious as one might think, since in Rust I can use get_mut to treat previously non-atomic data as atomic. However, I think using C++20 atomic_ref, we can establish a suitable relation between the two -- or do you see problems with that @cbeuw? (I recall you said there was some issue, but it was deep inside that PR and GitHub makes it impossible to find...)

Cc @thomcc; not sure whom else to ping for atomic memory model things.
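
For readers less familiar with the pattern under discussion, here is a minimal sketch of one location being accessed non-atomically through `get_mut` and atomically through shared references, in separate phases. The function name `exclusive_then_shared` is made up for illustration, and `std::thread::scope` assumes Rust 1.63 or later.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// Hypothetical helper: alternates exclusive (non-atomic) and shared (atomic) phases.
fn exclusive_then_shared() -> u32 {
    let mut counter = AtomicU32::new(0);

    // Exclusive phase: `get_mut` hands out a plain `&mut u32`, so this write is non-atomic.
    *counter.get_mut() = 10;

    // Shared phase: the threads only see `&AtomicU32`, so every access is atomic.
    thread::scope(|s| {
        for _ in 0..4 {
            s.spawn(|| {
                counter.fetch_add(1, Ordering::Relaxed);
            });
        }
    });

    // All scoped threads have been joined, so exclusive access is available again.
    *counter.get_mut()
}
```

Calling `exclusive_then_shared()` returns 14. The phases cannot overlap in safe code, because the borrow checker rejects `get_mut` while any shared borrow of `counter` is still live.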

@rust-highfive
Collaborator

Hey! It looks like you've submitted a new PR for the library teams!

If this PR contains changes to any rust-lang/rust public library APIs then please comment with r? rust-lang/libs-api @rustbot label +T-libs-api -T-libs to request review from a libs-api team reviewer. If you're unsure where your change falls no worries, just leave it as is and the reviewer will take a look and make a decision to forward on if necessary.

Examples of T-libs-api changes:

  • Stabilizing library features
  • Introducing insta-stable changes such as new implementations of existing stable traits on existing stable types
  • Introducing new or changing existing unstable library APIs (excluding permanently unstable features / features without a tracking issue)
  • Changing public documentation in ways that create new stability guarantees
  • Changing observable runtime behavior of library APIs

@rustbot rustbot added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label May 29, 2022
@rust-highfive
Collaborator

r? @kennytm

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label May 29, 2022
@cbeuw
Contributor

cbeuw commented May 29, 2022

My comment with more context: rust-lang/miri#1963 (comment)

I think this formulation is pretty clever, and I can't think of an edge case where it wouldn't work.

This doesn't answer how things should behave with &mut, but that doesn't affect the correctness of ref -> atomic_ref correspondence, and C++ doesn't address what happens when you reuse the same location for multiple atomic_refs without overlapping lifetimes either, so we are mostly in the same boat.

@RalfJung
Member Author

This doesn't answer how things should behave with &mut

There are no atomic accesses on &mut, so this question does not need answering I think.

C++ doesn't address what happens when you reuse the same location for multiple atomic_refs without overlapping lifetimes either

I would assume this is allowed by C++?

@cbeuw
Contributor

cbeuw commented May 29, 2022

There are no atomic accesses on &mut, so this question does not need answering I think.

If calling atomic methods on the same thread doesn't count as atomic (they don't exhibit any weak behaviours anyway), then yes.

I would assume this is allowed by C++?

You can definitely express it in normal C++. You can also reuse a location used by an atomic_ref for non-atomic reads and writes. Though both reuses can only happen after all threads have ended, so I guess there's no ambiguity about what you'll read.

One thing is different in C++, though: I've been told on the mailing list that there is no way to get two &int16_ts from an int32_t, whereas std::mem::transmute::<&mut u32, &mut [u16; 2]> is valid in Rust (you can't do it with unions either, which is what Wine's SRW did, because it's technically UB to access anything but the last-written member of the union). So location reuse is restricted to same-size in C++, but mixed-size is fine in Rust. Again, you can only do this after all threads previously atomically accessing the location have ended, so it should be fine.
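
A rough sketch of the mixed-size reuse described here, assuming Rust 1.63 or later for `std::thread::scope`. The function name `reuse_as_halves` is hypothetical. The location is first accessed atomically as a `u32` from another thread and, only after that thread has ended, reinterpreted non-atomically as two `u16` halves.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// Hypothetical helper: atomic 32-bit phase, then non-atomic 16-bit phase.
fn reuse_as_halves() -> u32 {
    let mut word = AtomicU32::new(0);

    // Atomic phase: another thread stores to the location as a 32-bit atomic.
    thread::scope(|s| {
        s.spawn(|| word.store(0x1234_5678, Ordering::Relaxed));
    });

    // The scoped thread has ended, so exclusive access is available.
    let plain: &mut u32 = word.get_mut();

    // SAFETY: `u32` and `[u16; 2]` have the same size, `[u16; 2]` needs no
    // stricter alignment, and every bit pattern is valid for both types.
    let halves: &mut [u16; 2] =
        unsafe { std::mem::transmute::<&mut u32, &mut [u16; 2]>(plain) };

    // Writing the same value to both halves keeps the result endian-independent.
    *halves = [0xABCD, 0xABCD];

    *word.get_mut()
}
```

With both halves set to `0xABCD`, the function returns `0xABCD_ABCD` regardless of byte order.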

Member

@thomcc thomcc left a comment

This is a neat formulation, and does seem to correspond to how I believed things work. I'll give it a day or so for other people to weigh in, in case they disagree, but otherwise r=me.

CC @Amanieu, @m-ou-se, @talchas, @chorman0773

@thomcc thomcc added the A-concurrency Area: Concurrency label May 29, 2022
@thomcc
Member

thomcc commented May 29, 2022

I think one open question is whether or not this should go somewhere in the reference instead. As it is, it might be a bit confusing to people not familiar with C++, or who don't need to know the formal model's details.

@chorman0773
Contributor

One issue:
https://eel.is/c++draft/atomics#ref.generic.general-3.sentence-2

While any atomic_ref instances exist that reference the *ptr object, all accesses to that object shall exclusively occur through those atomic_ref instances

Atomic*::as_mut_ptr (unstable) might be inconsistent with it, but I'm not sure

Any use of the returned raw pointer requires an unsafe block and still has to uphold the same restriction: operations on it must be atomic.

Might save it, but that means that the following is definitely undefined behaviour (and the safety comment is wrong):

let x = AtomicI32::new(0);
let y = &x;
let p = x.as_mut_ptr();
// SAFETY: The following line is safe because the current thread is the only one with access to `x`, so no data race occurs
unsafe { p.write(1); }

@RalfJung
Member Author

There are no atomic accesses on &mut, so this question does not need answering I think.

If calling atomic methods on the same thread doesn't count as atomic (they don't exhibit any weak behaviours anyway), then yes.

No, it's just not possible. All atomic operations take &self. If you call them on an &mut Atomic, then a short-lived shared reference is created, and the atomic operation is called on that.
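
To make the short-lived shared reference concrete, here is a small sketch; the function name `bump` is invented for illustration. The method call on `&mut AtomicI32` and the explicit reborrow do the same thing.

```rust
use std::sync::atomic::{AtomicI32, Ordering};

// Hypothetical helper taking an exclusive reference to an atomic.
fn bump(a: &mut AtomicI32) -> i32 {
    // `fetch_add` takes `&self`, so this call implicitly reborrows `a`
    // as a temporary `&AtomicI32` that ends with the statement.
    a.fetch_add(1, Ordering::Relaxed);

    // The same thing written out explicitly.
    let shared: &AtomicI32 = &*a;
    shared.fetch_add(1, Ordering::Relaxed);

    // The shared reborrow is no longer used, so exclusive access is back
    // and a plain non-atomic update is allowed.
    *a.get_mut() += 1;

    a.load(Ordering::Relaxed)
}
```

Starting from `AtomicI32::new(0)`, `bump` returns 3: two atomic increments through shared reborrows and one plain increment through `get_mut`.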

One thing is different in C++, though: I've been told on the mailing list that there is no way to get two &int16_ts from an int32_t, whereas std::mem::transmute::<&mut u32, &mut [u16; 2]> is valid in Rust (you can't do it with unions either, which is what Wine's SRW did, because it's technically UB to access anything but the last-written member of the union). So location reuse is restricted to same-size in C++, but mixed-size is fine in Rust. Again, you can only do this after all threads previously atomically accessing the location have ended, so it should be fine.

Ah, fair. OTOH -fno-strict-aliasing is a commonly supported dialect of C and it does allow these casts. So C++ compilers in principle already have to say what happens then.

Might save it, but that means that the following is definitely undefined behaviour (and the safety comment is wrong):

Yeah, I would say that is ruled out by the current safety comment. (FWIW I think we should allow such code eventually, but that needs a more careful survey of the memory model literature.)

@lengyijun
Contributor

I think one open question is whether or not this should go somewhere in the reference instead. As it is, it might be a bit confusing to people not familiar with C++, or who don't need to know the formal model's details.

I like this pr.
I think we can emphasize that this note is only for C++ programmers.
We could also say that readers can skip this paragraph if they are not familiar with C++.

@thomcc
Member

thomcc commented May 30, 2022

Well, it's more because we say "The rust model is equivalent to the C++ model" (or however we phrase it), but then the rust model, on its face, seems like it provides capabilities that are inaccessible to the C++ model, and this elaborates on how that is not the case, explaining what it means exactly that we follow the C++20 memory model.

If it were just explaining the equivalent of the Rust construct to C++ programmers, I'd say this would be too niche.

@thomcc
Member

thomcc commented May 30, 2022

Actually, this is really documenting something about the language, and while it's essentially already true, it probably still needs sign-off from someone other than me, I think t-lang (probably FCP too).

Does this work?

r? rust-lang/lang
@rustbot label +T-lang -T-libs

@rustbot rustbot added T-lang Relevant to the language team, which will review and decide on the PR/issue. and removed T-libs Relevant to the library team, which will review and decide on the PR/issue. labels May 30, 2022
@joshtriplett joshtriplett added the I-lang-nominated Nominated for discussion during a lang team meeting. label May 30, 2022
@RalfJung
Member Author

Well, it's more because we say "The rust model is equivalent to the C++ model" (or however we phrase it)

Well, currently we say "the Ordering corresponds to the one in C++", and that's about it... whatever that even means, without also following the model in general.^^

@JakobDegen
Contributor

JakobDegen commented May 30, 2022

It is not clear to me that this is a consequence of the C++ spec, but I think we should also explicitly say that racy (i.e. not otherwise synchronized) overlapping atomic accesses that do not use the same size and "base address" are generally UB. This means users can't be doing transmute::<&AtomicU16, &[AtomicU8; 2]> and then using all three references in racy ways. Can this be clarified as a part of this PR?
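
For concreteness, this is roughly what the transmute in question looks like. The sketch deliberately stays on a single thread, so no access races with another; the racy multi-threaded use of these references is the case this comment proposes to call UB and is not shown. The names `bytes_of` and `single_threaded_mixed_size` are hypothetical, and the safety comment relies on the documented guarantee that atomic integers have the same size and bit validity as their underlying integer types.

```rust
use std::sync::atomic::{AtomicU16, AtomicU8, Ordering};

// Hypothetical helper: view a 16-bit atomic as two 8-bit atomics.
fn bytes_of(whole: &AtomicU16) -> &[AtomicU8; 2] {
    // SAFETY (assumption for this sketch): both types are 2 bytes, the target
    // type needs weaker alignment (1 <= 2), and all bit patterns are valid.
    unsafe { std::mem::transmute::<&AtomicU16, &[AtomicU8; 2]>(whole) }
}

// Single-threaded use only: the 16-bit store is sequenced before the 8-bit loads.
fn single_threaded_mixed_size() -> (u8, u8) {
    let whole = AtomicU16::new(0);
    let bytes = bytes_of(&whole);

    // Both bytes get the same value, so the result is endian-independent.
    whole.store(0xABAB, Ordering::Relaxed);

    (
        bytes[0].load(Ordering::Relaxed),
        bytes[1].load(Ordering::Relaxed),
    )
}
```

The three references mentioned in the comment correspond to `whole`, `bytes[0]`, and `bytes[1]`. Handing them to different threads without additional synchronization would produce the overlapping mixed-size racy accesses under discussion.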

If this turns out to be controversial, I can open a UCG issue to discuss (I found some examples of misusing such accesses that turn out to be problematic).

@thomcc
Member

thomcc commented May 30, 2022

It's hard for me to imagine we can really support mixed size access as not UB when the x86 manual literally says not to do it, but at the same time it might be needed for various use cases we don't want to completely forbid. (Specifically to provide access to the "sub-basement" mentioned in https://gankra.github.io/blah/tower-of-weakenings/ :p)

But perhaps it's worth discussing more separately? I think that's at least a little bit more controversial than these changes, and would hate to block it on that.

@chorman0773
Contributor

chorman0773 commented Jun 2, 2022

Actually, one other thing I'm unsure about: when you take a &mut Atomic*, or otherwise destroy the &Atomic*, what happens to the modification order of the Atomic?
I don't believe [atomics.ref.generic] specifies what happens to the modification order of the referent when you destroy the last std::atomic_ref to an object, then create a new one fresh.

Two possible options:

  • The modification order persists, but it's on a non-atomic object, which doesn't make sense in the AM world.
  • The modification order vanishes, which has some interesting side effects with release-acquire ordering, and sequential consistency (it can remove both synchronizes-with edges and coherence-ordered before relationships).

@RalfJung
Member Author

RalfJung commented Jun 2, 2022

I can do the same in C++ by creating an atomic_ref to some regular data, using it atomically, destroying the atomic_ref, creating a new atomic_ref, and using that atomically. If the C++ spec fails to say what happens in that case, that is a C++ spec bug.
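
A Rust rendering of that sequence might look like the following sketch. It uses `AtomicU32::from_ptr` (stable since Rust 1.75, as far as I know) as the closest analogue of constructing an `atomic_ref` over regular data, and assumes `u32` meets the alignment that `from_ptr` requires on the target. The function name `two_atomic_phases` is made up.

```rust
use std::sync::atomic::{AtomicU32, Ordering};
use std::thread;

// Hypothetical helper: atomic view, plain access, then a fresh atomic view.
fn two_atomic_phases() -> u32 {
    let mut data: u32 = 0;

    {
        // SAFETY: `data` is valid and assumed suitably aligned, and it is only
        // accessed through `first` until the scoped threads have been joined.
        let first: &AtomicU32 = unsafe { AtomicU32::from_ptr(&mut data) };
        thread::scope(|s| {
            s.spawn(|| first.fetch_add(1, Ordering::Relaxed));
            s.spawn(|| first.fetch_add(1, Ordering::Relaxed));
        });
    }

    // Plain, non-atomic access between the two atomic views.
    data += 10;

    // SAFETY: same reasoning as above; `first` is never used again.
    let second: &AtomicU32 = unsafe { AtomicU32::from_ptr(&mut data) };
    second.fetch_add(100, Ordering::Relaxed);

    data
}
```

The function returns 112 (two increments, plus 10, plus 100), which is the behaviour one would expect if the second atomic view observes everything written before it was created.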

@chorman0773
Contributor

I can do the same in C++ by creating an atomic_ref to some regular data, using it atomically, destroying the atomic_ref, creating a new atomic_ref, and using that atomically. If the C++ spec fails to say what happens in that case, that is a C++ spec bug.

Yes, you can. I did explicitly mention that I don't believe that it does

I don't believe [atomics.ref.generic] specifies what happens to the modification order of the referent when you destroy the last std::atomic_ref to an object, then create a new one fresh.

@RalfJung
Member Author

RalfJung commented Jun 2, 2022

Okay, then I don't think that affects this PR. We're not going to start fixing C++ spec bugs here. :)

(FWIW I think the modification order has to persist for this to make any sense.)

@chorman0773
Contributor

chorman0773 commented Jun 2, 2022

One reason I'm concerned about this is that if it's the former, then the rules in [intro.races] seem to imply the following result, which does not make intuitive sense:

alignas(std::atomic_ref<int>::required_alignment) int atomic;
std::atomic_ref ref{atomic};
ref.store(1, std::memory_order::relaxed);
ref.~atomic_ref(); // ends the lifetime of ref
atomic = 2;
::new (&ref) std::atomic_ref{atomic}; // creates a new one in-place. This is basically equivalent to dropping the first reference, and replacing it with a new one
int val = ref.fetch_add(1, std::memory_order::relaxed); // val is 1, since [atomics.order] requires that a read-modify-write operation reads the last value written in modification order before the corresponding write.

This happens because the value stored in the atomic = 2; line does not happen while atomic is an atomic object, so it won't participate in the modification order.

The corresponding Rust is:

let mut atomic = AtomicI32::new(0); // value in new isn't required, since `atomic` was default-init in C++
atomic.store(1,Ordering::Relaxed);
atomic = AtomicI32::new(2); // or `*atomic.get_mut() = 2;`
let val = atomic.fetch_add(1, Ordering::Relaxed); // reads the 1 stored 2 lines above
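
As a sanity check on what users would expect, here is the same sequence wrapped in a function (the name `overwrite_then_rmw` is hypothetical), using the `get_mut` variant of the overwrite. Compiled with rustc it returns 2; the value 1 is only what the literal reading of the C++ rules described above would seem to imply.

```rust
use std::sync::atomic::{AtomicI32, Ordering};

// Hypothetical helper reproducing the sequence from the comment above.
fn overwrite_then_rmw() -> i32 {
    let mut atomic = AtomicI32::new(0);

    // Atomic store through a shared reborrow.
    atomic.store(1, Ordering::Relaxed);

    // Non-atomic overwrite through exclusive access.
    *atomic.get_mut() = 2;

    // `fetch_add` returns the previous value, which is the overwritten one.
    atomic.fetch_add(1, Ordering::Relaxed)
}
```

Everything happens on one thread in program order, so the read-modify-write observes the non-atomic write that precedes it.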

@RalfJung
Member Author

RalfJung commented Jun 2, 2022

Whatever it is makes equally little sense in C++ and Rust though. So it doesn't affect this PR.

(IMO the entire notion of 'atomic object' makes no sense, so I am not surprised there are problems. So one day I hope we can have our own memory model that avoids such notions, but still interoperate with C++. But for now I fail to see how your comments are not entirely off-topic for this particular clarification I am proposing.)

@chorman0773
Contributor

Whatever it is makes equally little sense in C++ and Rust though. So it doesn't affect this PR.

I do agree, it seems as though atomic_ref didn't quite make it through in a way that made sure it was handled thoroughly. They tried to handwave away the mixed atomic-access problem by saying it's UB to access it non-atomically, but that fails to handle the "End the lifetime of the last atomic_ref object, do the non-atomic access, then create a new one" case.

(IMO the entire notion of 'atomic object' makes no sense, so I am not surprised there are problems. So one day I hope we can have our own memory model that avoids such notions, but still interoperate with C++. But for now I fail to see how your comments are not entirely off-topic for this particular clarification I am proposing.)

I do also agree that atomic object makes no sense (I did away with that in XIR and just referred to "atomic operations to a memory location"). However, it might be good to ensure that the clarification doesn't mandate surprising behaviour like this. Otherwise, it may be problematic across the board (both for implementations and users).

@RalfJung
Member Author

RalfJung commented Jun 2, 2022

Is there any concrete change you are proposing for this PR? If not, I suggest to stop this off-topic discussion.

I don't think we should start listing C++ atomics errata in our library docs. Feel free to open a PR against the reference if you think this needs to be carefully spelled out there.

@chorman0773
Contributor

chorman0773 commented Jun 2, 2022

Is there any concrete change you are proposing for this PR? If not, I suggest to stop this off-topic discussion.

I'm not sure there's an easy change here, without ripping up the whole MT Model and substituting something else (which is fun, to say the least). It's more of something that should be considered. C++ isn't sure how this functions, and given that it can be observed in safe code, it's a potential problem either way (it either has surprising results for both users and implementors, or is a potential footgun when combined with sequential-consistency or release-acquire synchronization).

It might be better to leave this unspecified until this is resolved.

@cbeuw
Contributor

cbeuw commented Jun 2, 2022

@chorman0773 if you want to discuss the C++ atomic_ref issues further, you could use this mailing list thread: https://lists.isocpp.org/std-discussion/2022/05/1662.php

@joshtriplett
Member

Added one suggestion to leave the door open for future improvements in this area; otherwise, happy to merge this.

@joshtriplett joshtriplett removed the I-lang-nominated Nominated for discussion during a lang team meeting. label Jun 21, 2022
Co-authored-by: Josh Triplett <josh@joshtriplett.org>
@RalfJung
Member Author

@bors r=joshtriplett

@bors
Contributor

bors commented Jun 21, 2022

📌 Commit 4768bfc has been approved by joshtriplett

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 21, 2022
@RalfJung
Member Author

@bors rollup=always

JohnTitor added a commit to JohnTitor/rust that referenced this pull request Jun 22, 2022
clarify how Rust atomics correspond to C++ atomics
bors added a commit to rust-lang-ci/rust that referenced this pull request Jun 22, 2022
Rollup of 10 pull requests

Successful merges:

 - rust-lang#95446 (update CPU usage script)
 - rust-lang#96768 (Use futex based thread parker on Fuchsia.)
 - rust-lang#97454 (Add release notes for 1.62)
 - rust-lang#97516 (clarify how Rust atomics correspond to C++ atomics)
 - rust-lang#97818 (Point at return expression for RPIT-related error)
 - rust-lang#97895 (Simplify `likely!` and `unlikely!` macro)
 - rust-lang#98005 (Add some tests for impossible bounds)
 - rust-lang#98226 (Document unstable `--extern` options)
 - rust-lang#98356 (Add missing period)
 - rust-lang#98363 (remove use of &Alloc in btree tests)

Failed merges:

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 25b8449 into rust-lang:master Jun 22, 2022
@rustbot rustbot added this to the 1.63.0 milestone Jun 22, 2022
@RalfJung RalfJung deleted the atomics branch June 22, 2022 17:43
Labels
A-concurrency Area: Concurrency S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-lang Relevant to the language team, which will review and decide on the PR/issue.