Make AcqRel universally usable as ordering mode #2503
Conversation
👍 For this RFC. I know that I have in the past used |
I don't think this is a good idea: `let _ = atomic.swap(new_value, AcqRel);` has different characteristics than `atomic.store(new_value, Rel);`. I think allowing |
@Diggsey I’m not seeing your point. Those are two entirely distinct operations, regardless of which ordering you supply to them. |
AFAIK it does not -- it literally means "this is a read-modify-write operation; the read part is

EDIT: Ah, I see what you mean. However, I feel the fact that it says |
No, @Diggsey is correct. Another thing to consider is the consistency with If you want to solve the issue with not all combinations being legal or making sense, a vastly superior solution would be to check it statically at compile time, or to trap the "noop" choices at runtime. |
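For concreteness, a minimal sketch of what the "check it statically at compile time" option could look like. The LoadOrdering/StoreOrdering types and the checked_* helpers below are assumptions made up for illustration, not anything in std:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Hypothetical: only the orderings that are valid for a plain load.
#[derive(Clone, Copy)]
enum LoadOrdering {
    Relaxed,
    Acquire,
    SeqCst,
}

// Hypothetical: only the orderings that are valid for a plain store.
#[derive(Clone, Copy)]
enum StoreOrdering {
    Relaxed,
    Release,
    SeqCst,
}

fn checked_load(a: &AtomicUsize, order: LoadOrdering) -> usize {
    // The match is total, so no invalid ordering can ever reach `load` and panic.
    a.load(match order {
        LoadOrdering::Relaxed => Ordering::Relaxed,
        LoadOrdering::Acquire => Ordering::Acquire,
        LoadOrdering::SeqCst => Ordering::SeqCst,
    })
}

fn checked_store(a: &AtomicUsize, val: usize, order: StoreOrdering) {
    a.store(val, match order {
        StoreOrdering::Relaxed => Ordering::Relaxed,
        StoreOrdering::Release => Ordering::Release,
        StoreOrdering::SeqCst => Ordering::SeqCst,
    });
}

fn main() {
    let a = AtomicUsize::new(0);
    checked_store(&a, 7, StoreOrdering::Release);
    // checked_load(&a, StoreOrdering::Release) would not even type-check.
    assert_eq!(checked_load(&a, LoadOrdering::Acquire), 7);
}
```

With an interface shaped like this, passing an ordering that the operation cannot honor simply does not type-check, instead of panicking at run time.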
I am pretty sure that is not correct. A

EDIT: In fact, you can see this in the code:

```rust
pub fn compare_and_swap(&self,
                        current: $int_type,
                        new: $int_type,
                        order: Ordering) -> $int_type {
    match self.compare_exchange(current,
                                new,
                                order,
                                strongest_failure_ordering(order)) {
        Ok(x) => x,
        Err(x) => x,
    }
}

fn strongest_failure_ordering(order: Ordering) -> Ordering {
    match order {
        Release => Relaxed,
        Relaxed => Relaxed,
        SeqCst => SeqCst,
        Acquire => Acquire,
        AcqRel => Acquire,
        __Nonexhaustive => __Nonexhaustive,
    }
}
```

So, |
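A small usage sketch of what that mapping implies (the values are arbitrary; compare_and_swap and Ordering are the std API shown above):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

fn main() {
    let a = AtomicUsize::new(0);
    // With AcqRel, strongest_failure_ordering maps the failure case to Acquire,
    // so even the load performed when the comparison fails is an acquire load.
    let previous = a.compare_and_swap(0, 1, Ordering::AcqRel);
    assert_eq!(previous, 0);
}
```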
Sorry, my bad, you're right. I'm used to the GCC builtins that have separate success and failure orderings, where this case never comes up, so my understanding has been incomplete.

EDIT: After rereading the motivation, I see that you're actually concerned with read-modify-write operations accidentally getting weaker ordering than required. To that I would argue that if you're explicitly weakening the ordering from SeqCst, you already need to be reasoning properly about how the ordering affects the semantics of your code. TBH I have never seen a case where AcqRel would be incorrect if SeqCst is correct, but that doesn't mean there aren't any, and promoting blindly stuffing AcqRel everywhere instead of SeqCst, without thinking, is playing with fire. Such a developer shouldn't mess around with atomics in the first place; there is no point enabling even more half-assery in that area. |
But if it succeeds it will give that ordering. Using TBH, I'm not a fan of the "single argument" version of compare-and-swap to begin with, as I think for this kind of code, the explicitness of the two-argument version is always preferable. I really don't see the case for making a situation that is both rare (most programs should probably not be using atomics directly) and incredibly error prone (not sure I need to argue this point) less explicit. It's not like you can just stick |
I don't agree with this proposal for another reason. In my vocabulary, "release" means that the order between the previous instructions and the current release instruction should be preserved, and "acquire" means that the order between the current acquire instruction and later instructions should be preserved. So to me, a "release load", if it exists, should mean that the instructions before it should be ordered before the release load; similarly for an "acquire store". That is probably the reason why a release load and an acquire store cause a panic at run time: there is no way to enforce those orderings, so we bail out. |
I'd prefer it if the compiler simply emitted warnings when an invalid ordering is passed to an atomic operation (i.e. one that would panic at run time). |
From the RFC:
I was trying to find out about this. The general feeling seems to be that an acquire-release with pure load/stores is not allowed in C++. But is it really the case? Looking at: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/n4713.pdf page 1202, 1.2:
And similarly for loads. So they should be allowed, but have only either the release or acquire effect. |
@pssalmeida to quote from that paper you linked:
And similarly for load. I'm not sure what "page 1202, 1.2:" is referring to; there's nothing relevant on that page. |
@Diggsey well, that seems to clarify that it is indeed forbidden. But the point 1.2 on page 1202 that I quoted above, if not contradictory (according to some definition of "store" which I have not bothered to look up), is misleading to say the least. It seems to imply that those orderings (release, acq_rel, and seq_cst) are all valid for a store and that they perform a release operation, i.e., just as seq_cst performs a release for a store, so would acq_rel.

But this was just about that concern regarding C++. Regarding Rust, we could do things differently, but I was not advocating it. Given that these things are incredibly tricky, and that using AcqRel in a pure load/store may be misleading to people reading the code, I think the more explicit the better. So, maybe it is better to stick to what we have, keep the panic in those cases, and force people to consider each ordering at each place, instead of going for AcqRel everywhere as a default without much thought. |
@jeehoonkang I do not follow. I am not proposing to allow "Release loads". I am proposing to provide a way to say "give me the release/acquire variant of this operation, please" without having to think about whether this is a load, store or RMW (read-modify-write). So I don't understand how your comment relates to the proposal.
Actually, my own thinking is the other way around -- release/acquire is what you need to do ownership transfer. If you are explicitly strengthening the ordering to
That doesn't help when one accidentally writes

So, I understand some folks here are opposed to the fact that

@pssalmeida, you argue that this is actually good because it is "explicit". I could find myself agreeing if getting things wrong would reliably be caught by the compiler, but that is not the case. Case in point: When I wrote the original version of rust-lang/rust#52349, I was somewhat worried that I had accidentally used the wrong mode at some point. The irony is, I was fairly certain (as certain as I can get without a formal proof) that using release/acquire consistently is sufficient. I feel savvy enough with release/acquire that I could do what @le-jzr calls "reasoning properly about the code" without too much effort. Everything in

People are wary of using orderings other than

So, can we get agreement that there is a problem here that is worth solving? If we can get that, I think there might be other ways to solve it than repurposing |
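To make the "not reliably caught by the compiler" point concrete, a minimal sketch using today's API; the invalid combination below compiles cleanly and is only rejected at run time (the standard library documents that load panics when given Release or AcqRel):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

fn main() {
    let a = AtomicUsize::new(0);
    // No compile-time diagnostic here: the mistake only shows up as a
    // runtime panic (roughly: "there is no such thing as a release load").
    let _ = a.load(Ordering::Release);
}
```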
@RalfJung I would 100% be in support of any way to enforce valid uses of memory orderings at compile time; I'm only opposed to the specific solution proposed here of changing the meaning of those orderings. Some ideas:
|
IMO the thing that actually needs doing here is correcting the docs for |
@parched that's not correct, |
The problem is not just whether the ordering is valid for the operation. Your first two proposed solutions fail to address this case. The last would address it, but it doesn't actually express the intent I would like to express: I want to say "make all loads acquire and all stores release". I have a reasonable intuition for what my program will and will not do if that is the case. There is no reason to make a load have release semantics. |
This just seems to mirror my complaint against the single argument version of the function existing at all - I agree it's dangerous, even with
But this is my point: there are reasons to want loads to have release semantics, or a store to have acquire semantics. For example, this pseudocode:
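A hedged Rust sketch of the kind of scenario being described (the LOCK spinlock and FLAG names are assumptions for illustration, not the original pseudocode):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

static LOCK: AtomicBool = AtomicBool::new(false); // toy spinlock: true = held
static FLAG: AtomicBool = AtomicBool::new(false);

fn thread1() {
    while LOCK.swap(true, Ordering::Acquire) {} // lock
    // ... critical section ...
    LOCK.store(false, Ordering::Release); // unlock
    // Wait for the flag. If this load is allowed to move above the unlock
    // (i.e. "into" the mutex), thread 2 can never take the lock to set the
    // flag, and both threads are stuck.
    while !FLAG.load(Ordering::Acquire) {}
}

fn thread2() {
    while LOCK.swap(true, Ordering::Acquire) {} // lock
    FLAG.store(true, Ordering::Release);
    LOCK.store(false, Ordering::Release); // unlock
}

fn main() {
    let t1 = thread::spawn(thread1);
    let t2 = thread::spawn(thread2);
    t1.join().unwrap();
    t2.join().unwrap();
}
```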
This code is broken because the load in thread 1 can be reordered inside the mutex from the perspective of other threads, and this can result in a deadlock. This is an issue I've encountered in real code. To fix it you need the "mutex unlock" to have Acquire ordering, or the "wait for flag" to have Release ordering. |
No, this is not about the single-argument form. We'd need a three-argument form that gives the mode for all of success read, success write, fail read separately. Though I think having a way to just say "release or acquire, whatever applies" is simpler and easier to both read and write. |
That doesn't make sense? The operation is atomic; there are no separate orderings for the read vs the write, there are just separate orderings for whether the operation succeeds or fails. If you specify |
@Diggsey This is going to be just about terminology: from my understanding, saying "an acquire write" or "a release read" is nonsensical (it doesn't mean anything). Write operations can only be relaxed/release/seqcst, while read operations can only be relaxed/acquire/seqcst. CAS/CMPXCHG operations on failure perform just a read (which can be relaxed/acquire/seqcst), while on success they atomically perform both a read (relaxed/acquire/seqcst) and a write (relaxed/release/seqcst). In particular, if the success case in a CMPXCHG has acqrel ordering, that means the read part has acquire ordering and the write part has release ordering. And if the success case is marked with acquire, that means the read part has acquire ordering and the write part has relaxed ordering. There are two parts to the CMPXCHG operation, but they both happen atomically at the same time.

What you mean by "an acquire write" is probably an acquire swap operation. And "a release load" would have to be a CAS loop, because we need an actual write operation in order to obtain release ordering:

```rust
let mut val = a.load(Relaxed);
loop {
    let actual = a.compare_and_swap(val, val, Release);
    if actual == val {
        break;
    } else {
        val = actual;
    }
}
```

@jeehoonkang ^ Did I get this right? |
@stjepang from my point of view, the lack of "an acquire write" or "a release read" is an architecture-specific curiosity. It's easy to define semantics for those hypothetical operations. Just look at how load is implemented on x86: in actuality, all memory orderings are equivalent here, they all translate into a single "mov" instruction (putting aside compiler transformations for the moment). The fact is that the C11 memory model implements a super-set of the functionality actually supported on any given architecture. The reason this works is that the compiler is only required to generate code with at least the specified ordering guarantees; it is free to be more restrictive than necessary. If we're going to extend load/store to accept |
Interesting. Liveness guarantees in weak memory concurrency are still a very active area of research, and maybe this is why. However, I don't feel like the solution is to add extra ordering constraints here. The solution is to declare this (moving a release write down below an acquire read of a different location) a compiler bug. This RFC does not propose to add any new operations, or change the semantics of existing operations. I think that's a much bigger change. All I am asking for is a good way to write code using release-acquire concurrency. I don't think the problem you are mentioning is related to the problem I am trying to solve.
That's not correct. A read-modify-write operation has a "read end" and a "write end". The "read end" is relevant to answer "does this operation synchronize with the earlier store that is being read here", and the "write end" is relevant to answer "will a future operation that reads from this write synchronize with this operation". That is why the "success" order can be The "write part" of an operation never has |
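For reference, a small sketch with the existing two-ordering compare_exchange API, which is how the read-end/write-end distinction is spelled out in practice (values are arbitrary):

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

fn main() {
    let a = AtomicUsize::new(1);

    // Success ordering Release: the write end is release, the read end relaxed.
    // Failure ordering Relaxed: on failure only a relaxed read happens.
    let r = a.compare_exchange(1, 2, Ordering::Release, Ordering::Relaxed);
    assert_eq!(r, Ok(1));

    // Success ordering AcqRel: the read end is acquire AND the write end is
    // release. Failure ordering Acquire: a plain acquire read.
    let r = a.compare_exchange(2, 3, Ordering::AcqRel, Ordering::Acquire);
    assert_eq!(r, Ok(2));
}
```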
I think you did.
No, it is not easy. I invite you to try. Notice that we are talking about a programming language model here, not about a CPU architecture model. On the CPU, an acquire load translates to a fence and a load (at least on ARM), but in the language, an acquire load is NOT equivalent to a fence and a load. As I discussed above, release/acquire are used to turn reads-from edges into synchronizes-with edges. Both ends of the reads-from edge have to "opt in" to doing this. When the read end of the edge opts in, we call it an acquire read. When the write end of the edge opts in, we call it a release write. It just makes no sense at all to talk about acquire writes or release reads.
This is a very bad argument. Not only (as I just mentioned above) can you not conflate language-level and CPU-level memory models, but using x86 as an example is also a bad idea because (as you mentioned) every operation on x86 is (strictly stronger than) release/acquire. (Not all orderings are equivalent though, However, the same distinction remains meaningful when compiling x86 code because LLVM can and will optimize your code based on these ordering modes before translating it to the target architecture. |
Can you explain why you think this? I'm genuinely interested whether I'm missing something here. As far as I can tell, this is working exactly as the memory orderings are defined?
You kind of defined it for me: the compiler/CPU moving an acquire write down below any other atomic operation is forbidden. Normally there would be another half to this about when the store is visible to other threads: an Acquire store simply does not have this component beyond the basic requirements of an atomic store.
I'm not conflating them - I'm arguing that this distinction is important: that just because an operation doesn't exist in the CPU-level model, doesn't prevent it from being added to a language-level model. |
It's a compiler transformation that changes a program from "working" to "deadlocking". Your program has no undefined behavior. I don't see anything that gives the compiler the license to do that. The burden of proof for a reordering to be legal is on the compiler: it must show that doing this reordering will not break programs.
That's not at all how these memory models are defined. The set of legal transformations is a consequence of how the model is defined. A model defines, given a program, which behaviors are possible. "Can this function return 0?" is the kind of question that a model answers. Once the model is defined, one can then try and prove that a certain transformation is correct, by showing that the new program (after the transformation) will not do anything that the old program (before the transformation) could not also do. For example:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

static DATA: AtomicUsize = AtomicUsize::new(0);
static FLAG: AtomicUsize = AtomicUsize::new(0);

fn main() {
    rayon::join(
        || {
            // Initialize data
            DATA.store(1, Ordering::Relaxed);
            // Signal that we are done
            FLAG.store(1, Ordering::Release);
        },
        || {
            // Wait until the other thread is done
            while FLAG.load(Ordering::Acquire) == 0 {}
            // Print data
            println!("{}", DATA.load(Ordering::Relaxed));
        },
    );
}
```

The memory model defines that the only possible output of this program is 1.

I already explained above how, when load A reads-from store B, that may or may not give rise to a synchronizes-with edge between these two events. This synchronizes-with edge then gives rise to a happens-before relationship, which has effects on other events. There just is logically no place in this for an "acquire write". "acquire" is the property of the load end of this-load-reads-from-that-store that makes them synchronize. (This is also getting off-topic, because I do not want to define a new memory model in this RFC, just change the interface to talk to an existing memory model.) |
Oh, I completely missed that it was marked as |
This seems like a neat feature, but atomic orderings have almost always been explained in Rust as "it's the exact same as C++" because it's such a complicated topic that we don't want to add any extra complication to the story. While this new ordering can be explained in terms of other orderings, I fear that it'd be a roadblock in an already complicated story, and I'm not sure that it justifies its weight. The main point against this to me is that it's not present in any other language, and we don't really want to be trailblazers in the area of atomic orderings. Is there a point in favor, however, of why we should include this when it's not included anywhere else (and maybe not on track to be included anywhere else)? |
So basically you are saying "let's stick with a sub-par API because it is what everyone else uses"? I don't have much to say to that, except that that makes it really hard to improve the situation. :( You said yourself that it is such a complicated topic, yet at the same time you are giving an argument to thwart any attempt to improve that situation. |
That's what I'm saying bluntly, yes, but I think it misses the rationale for why I'm saying it. Diverging from a model which we have stated that we match exactly means confusion for Rust users as they have to figure out something new that's not already in the model and also runs the risk of clashing with future extensions to the model. It does not seem worth the increase in convenience to run this risk for an extremely core part of Rust. |
It's not really diverging from the model, it is just a slight extension of the API we use to communicate with the model -- an extension that can be easily translated down to the C++ model (and even the C++ API), but the translation is error-prone, hence the suggestions to automate it. But I guess you are not actually talking about just the model, but the API as well. And I get the argument, I just don't agree with the weights you are assigning to the different aspects here. Fair enough. |
Ah yes that's true, I'm considering the model and the API one and the same because they have a 1:1 correspondence today (roughly, at least). I think it's definitely a good thing this proposal is purely sugar, but I don't think such sugar belongs in the standard library, personally at least. I'm curious to hear what others on @rust-lang/libs think, though, and the best way to figure that out is probably... @rfcbot fcp close |
Team member @alexcrichton has proposed to close this. The next step is review by the rest of the tagged team members: No concerns currently listed. Once a majority of reviewers approve (and none object), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up! See this document for info about what commands tagged team members can give me. |
Would it be possible at all to do this experimentally as an eRFC? |
What would be the value in an eRFC? I don't think there's any uncertainty about what the proposed change would look like in practice; there's just a fair amount of disagreement with it, on a number of different points. An eRFC is not supposed to be a way to just sneak features in with less scrutiny. I'd like to see an RFC for some kind of static analysis instead. If there's a mistake that is too easy to make (e.g. using |
I don't mean it that way. It would be a way to get it into the hands of users who could give feedback of whether it is confusing or not. |
That's exactly what the existing stability system is. |
Except that the existing system doesn't allow the degree of experimentation where we implement features just to see what they are like... with no implications for what the final decision will be... |
A coworker of mine recently ran the Relacy Race Detector on our C++ codebase and it did highlight a number of issues in a couple of wait-free algorithms (which were benign on x86, but formally undefined). It's a bit awkward to use in C++, due to the heavy degree of modification of the source code necessary, but I think that Rust could offer a much superior interface based on procedural macros. In any case, I heavily encourage all interested parties to have a look. It's all the more interesting that not only does it boast a "no false-positive" property, but it also produces detailed traces of "wrong" executions to help visualize the issue with the faulty orderings, teaching the user at the same time as they fix their mistake, much like rustc diagnostics do. |
🔔 This is now entering its final comment period, as per the review above. 🔔 |
The final comment period, with a disposition to close, as per the review above, is now complete. By the power vested in me by Rust, I hereby close this RFC. |
Rendered