Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic atomic #1474

Closed
wants to merge 3 commits into from
Closed

Generic atomic #1474

wants to merge 3 commits into from

Conversation

mahkoh
Copy link
Contributor

@mahkoh mahkoh commented Jan 22, 2016

Add compiler support for generic atomic operations.

@nrc nrc added the T-lang Relevant to the language team, which will review and decide on the RFC. label Jan 22, 2016
@dgrunwald
Copy link
Contributor

Rendered View

When a call to one of the corresponding atomic intrinsics cannot be translated to a native operation, it is translated to a call to the corresponding language item.

Presumable we'd want libstd to define those language items? If so, how would they be implemented?
Do they unconditionally panic? Do they use a StaticMutex?

@mahkoh
Copy link
Contributor Author

mahkoh commented Jan 23, 2016

They behave like the equivalent atomic operations.

If so, how would they be implemented?

There are many ways to implement this.

@dgrunwald
Copy link
Contributor

There are many ways to implement this.

But you can't have multiple crates provide the same language item; so one of those ways has to be picked in libstd. Or do you want this feature to be only usable in applications, not in libraries?

@mahkoh
Copy link
Contributor Author

mahkoh commented Jan 23, 2016

But you can't have multiple crates provide the same language item; so one of those ways has to be picked in libstd.

The RFC does not dictate how the functions are implemented. Only how they behave.

One intrinsic is added:

```rust
fn has_native_atomic_ops<T>() -> bool;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be reasonable to make this intrinsic const?

@main--
Copy link

main-- commented Jan 24, 2016

The design seems a bit ... incomplete? When I read the part about introducing new (unsafe) langitems to do the heavy lifting, I expected a nice and safe interface type with proper trait bounds (e.g. load probably requires Copy) just like the Atomic<T> from the motivation section (that C++ has as well). As this is pretty much the usecase for this (where Atomic<T> is probably little more than safe wrappers around these langitems), I think this should be included in the RFC.

As C++'s version of this is template-based, I wonder if there's a nice way to do this using impl specialization (RFC #1210) instead of langitems.

@mahkoh
Copy link
Contributor Author

mahkoh commented Jan 24, 2016

When I read the part about introducing new (unsafe) langitems to do the heavy lifting, I expected a nice and safe interface type with proper trait bounds (e.g. load probably requires Copy) just like the Atomic from the motivation section (that C++ has as well).

The language items are an implementation detail of the corresponding intrinsics. User code never calls the language items directly. It calls the intrinsics. The intrinsics don't have any such bounds since they are low-level. Just like copy_nonoverlapping doesn't have a Copy bound. Atomic<T> is not part of the RFC, it just serves as motivation.

I wonder if there's a nice way to do this using impl specialization (RFC #1210) instead of langitems.

Not unless there is a trait that encodes has_native_atomic_ops.

@mahkoh
Copy link
Contributor Author

mahkoh commented Jan 24, 2016

@main--
Copy link

main-- commented Jan 24, 2016

@mahkoh Okay, so all this RFC does is handle the case when code using any std::intrinsics::atomic* would (under current rules) fail to compile, right? That still leaves the question why Atomic<T> isn't part of this RFC. I can't think of a case where one would prefer to use the intrinsics instead.

@mahkoh
Copy link
Contributor Author

mahkoh commented Jan 24, 2016

so all this RFC does is handle the case when code using any std::intrinsics::atomic* would (under current rules) fail to compile, right?

Yes.

That still leaves the question why Atomic isn't part of this RFC.

How and if a type such as Atomic<T> is available in the libraries that ship with the compiler does not concern me. As you can see from the tags, this is a T-lang issue, not a T-libs issue. Feel free to open another RFC that handles such high-level details.

@mahkoh
Copy link
Contributor Author

mahkoh commented Jan 24, 2016

Updated

@huonw
Copy link
Member

huonw commented Jan 24, 2016

Echo concerns of others, I feel like the signature given isn't necessarily enough: it seems to me that making the given operations atomic for an arbitrary T could require more data than just the T for some implementation choices (e.g. a Mutex associated with each T), which a pointer *const T doesn't give at all. I'm possibly missing something, so could you expand on why you think it's OK?

Library concerns/"high level details" definitely aren't orthogonal/irrelevant to this RFC: language features need to be designed to be appropriate for the various library features that want to be built on top of them.

@mahkoh
Copy link
Contributor Author

mahkoh commented Jan 24, 2016

so could you expand on why you think it's OK?

The implementations of the functions are trivial and already described in the RFC.

static LOCK: Lock = ...;

fn atomic_load<T>(src: *const T) -> T {
    let _guard = LOCK.lock();
    ptr::read(src)
}

language features need to be designed to be appropriate for the various library features that want to be built on top of them.

Only implementation details of the various intrinsics that already exist are introduced. No library will never interact with them.

@huonw
Copy link
Member

huonw commented Jan 25, 2016

The implementations of the functions are trivial and already described in the RFC.

Yes, that implementation is mentioned as an example of how to implement it, but there's more to being OK than some implementation existing. I mentioned an alternative in my comment which seems like it would be significantly better along several major axes, such as not globally serializing all of these atomic operations.

If we were going to have a signature that basically forced the operation to have that implementation, it might make more sense to make the mutex itself a language item (or init/lock/unlock functions), which would give more flexibility, e.g. there could be a mutex for each type T, instead of a single global one.

Only implementation details of the various intrinsics that already exist are introduced. No library will never interact with them.

So... you're saying no library will ever include an implementation of these language items? This is clearly untrue, and so the feature needs to be flexible enough for libraries that do implement the language item (which will be a small set) to do it in the best way for them.

@mahkoh
Copy link
Contributor Author

mahkoh commented Jan 25, 2016

I mentioned an alternative in my comment which seems like it would be significantly better along several major axes, such as not globally serializing all of these atomic operations.

More flexible implementations are also possible e.g. by computing a hash of the address.

So... you're saying no library will ever include an implementation of these language items? This is clearly untrue

Congratulations, you managed to twist my words.

the feature needs to be flexible enough for libraries that do implement the language item (which will be a small set) to do it in the best way for them.

What's that? The most space efficient way? The fastest way? The one that minimizes the cost on both axes? For every given implementation you can find pathological cases, e.g., type-based locks will perform very poorly in an application that uses only one type compared to using address-based locks.

An efficient higher level implementation requires impl specialization so that Atomic<T> contains a Mutex<T> for large T and Cell<T> for small T. This is orthogonal to this RFC.

@huonw
Copy link
Member

huonw commented Jan 25, 2016

More flexible implementations are also possible e.g. by computing a hash of the address.

This isn't very convincingly "OK" to me: it seems like it would quickly leak a pile of memory if atomic values end up being relatively dynamic in memory. This is certainly true for e.g. concurrent queues, but these use values for which hardware has the appropriate instructions, so the precedent from such examples may not be relevant. I have no experience about what is done elsewhere (e.g. C++), so this could well be reasonable in practice.

Congratulations, you managed to twist my words.

I'm not really sure what your original point is supposed to be. This is a language feature which picks up functions from libraries, and so it seems pretty reasonable to consider how libraries might want to implement those functions.

What's that? The most space efficient way? The fastest way? The one that minimizes the cost on both axes? For every given implementation you can find pathological cases, e.g., type-based locks will perform very poorly in an application that uses only one type compared to using address-based locks.

Yes, that's basically my point: the current signature doesn't give much way to choose which trade-offs you want. (With current Rust, AFAICT, the signature can basically allows you to choose between a global lock and a concurrent map from addresses/typeids/something else to a lock.)

An efficient higher level implementation requires impl specialization so that Atomic contains a Mutex for large T and Cell for small T. This is orthogonal to this RFC.

I'm not really sure it is: once such an Atomic<T> is possible, what's the purpose of the feature in this RFC (i.e. what's the purpose of allowing the intrinsics to work with more types)? There is of course precedent of having short-term solutions until a more generic one works, but looking at the big picture is reasonable.

@mahkoh
Copy link
Contributor Author

mahkoh commented Jan 25, 2016

This isn't very convincingly "OK" to me: it seems like it would quickly leak a ton of memory

How does it leak memory? You simply use a static array of N booleans (N = 16 or 32 or whatever), each of which is used as a spinlock. Which boolean you choose depends on the address of the atomic.

I'm not really sure what your original point is supposed to be.

Higher level libraries (such as the ones implementing Atomic<T>) aren't even aware of this implementation detail since they only interact with the atomic intrinsics.

Yes, that's basically my point: the current signature doesn't give much way to choose which trade-offs you want.

You have not yet shown a more flexible signature.

I'm not really sure it is: once such an Atomic is possible, what's the purpose of the feature in this RFC (i.e. what's the purpose of allowing the intrinsics to work with more types)?

Then remove it. This RFC merely introduces implementation details of the atomic intrinsics.

@huonw
Copy link
Member

huonw commented Jan 25, 2016

How does it leak memory? You simply use a static array of N booleans (N = 16 or 32 or whatever), each of which is used as a spinlock. Which boolean you choose depends on the address of the atomic.

That does seem more reasonable than the impression I got from your initial comment, thanks for being more explicit.

Higher level libraries (such as the ones implementing Atomic) aren't even aware of this implementation detail since they only interact with the atomic intrinsics.

I was and am talking about the low-level ones that have to choose how implement these details, since as you've said repeatedly, other libraries generally don't care about this detail and are hence not particularly relevant.

You have not yet shown a more flexible signature.

One could add #[lang] types as well, to have hardcoded form of specialisation. The signature would then be

#[lang = "unsupported_atomic_load"]
fn atomic_load<T>(x: *const UnsupportedAtomic<T>) { ... }

UnsupportedAtomic<T> could be chosen to be T if the library implementing all this wants to go with one of the high-contention versions, or (AtomicBool, T) if a per-value spin-lock is desirable. (The compiler would ensure that Atomic<T> chooses between the correct internals.)

(However, this is clearly complicated, especially for a temporary measure.)

Then remove it. This RFC merely introduces implementation details of the atomic intrinsics.

That's not answering my question, but I infer from the dismissal that I'm not missing any other important trick this is useful for, and this RFC is purely a stop-gap until we can do Atomic<T> with impl specialisation.

However, in that case, why can't the temporary hole be filled by a trait like trait Atom { fn load ... fn store ... fn compare_and_swap ... }, with Atomic<T: Atom>? This clearly doesn't cover all use-cases and requires manual/macro implementations, but can easily be implemented in libraries, without any compiler changes. It seems like all it needs is the trait, and a pair of macros atom_with_intrinsics! & atom_with_lock!, and then making a type work with Atomic is a single macro invocation. Of course, without impl specialisation, the locking versions probably can't automatically do better than a global scheme like that forced by the design in this RFC, but that's clearly OK for you. (And then, I suspect this could be made to be backwards compatible with impl specialisation in future, i.e. once that works, the Atom trait becomes the driver for that.)

@mahkoh
Copy link
Contributor Author

mahkoh commented Jan 25, 2016

The compiler would ensure that Atomic chooses between the correct internals.

This sounds like Atomic<T> must be special-cased in the compiler.

But if you're willing to leak those implementation details then I've already hinted above at a simple solution that uses impl specialization:

trait AtomicOps<T> {
    fn load(&self) -> T;
    ....
}

impl<T> AtomicOps<T> for Atomic<T> { ... }
impl<T: HasNativeAtomicOps> AtomicOps<T> for Atomic<T> { ... }

where HasNativeAtomicOps is a marker trait implemented by the compiler.

but can easily be implemented in libraries

It cannot because it requires knowledge of how the type is represented in memory to decide whether it can use native atomic operations. Such knowledge is, in general, unavailable outside the compiler.

@huonw
Copy link
Member

huonw commented Jan 25, 2016

This sounds like Atomic must be special-cased in the compiler.

Yes, but this is, I believe, relatively simple, at least compared to impl specialisation.

But if you're willing to leak those implementation details then I've already hinted above at a simple solution that uses impl specialization:

I don't understand your point: if that worked, this whole discussion would be totally moot. We currently do not have impl specialisation, and so this doesn't work (and I don't think just making a single trait a lang item to allow specialisation will work: there's a pile of impl selection infrastructure in the compiler that needs to understand it to find the right code to run). In any case, that's exactly what we've both been talking about with impl specialisation, and exactly what I was referring to when I mentioned that the manual trait design could be expanded to automatically use impl specialisation when it works.

It cannot because it requires knowledge of how the type is represented in memory to decide whether it can use native atomic operations. Such knowledge is, in general, unavailable outside the compiler.

Yes, I was quite explicit that this requires manual intervention and hence won't be perfect, but it allows you to progress without (waiting for) hacks in the compiler. You can deduce which types have atomic operations for most types in a simple manner, and, anyway, I suspect the vast majority of atomic values will be the obvious ones: primitives and pointers. This serves as a 90% (or 95%, or more) solution, until the 100% solution of impl specialisation works.

@mahkoh
Copy link
Contributor Author

mahkoh commented Jan 25, 2016

This serves as a 90% (or 95%, or more) solution, until the 100% solution of impl specialisation works.

The 90% case already works. This RFC is for the 10% case where the compiler crashes because the type doesn't have native operations.

@huonw huonw added the T-compiler Relevant to the compiler team, which will review and decide on the RFC. label Jan 25, 2016
@arielb1
Copy link
Contributor

arielb1 commented Jan 25, 2016

Allowing Atomic<T> for arbitrary values of T (e.g. including T=Vec<u32>) is not something we are likely to implement at the compiler level. Implementing it at the library level requires specialized struct fields which are not a planned item.

If lrs wants to play with this kind of things, I see no problems with it. By the way, you don't really need specialization here - can use type_name and casts as in

#![crate_type="rlib"]
#![feature(core_intrinsics)]
use std::intrinsics;
use std::ptr;

pub unsafe fn load_u32(u: *mut u32) -> u32 {
    atomic_load(u)
}

pub unsafe fn load_u64(u: *mut u64) -> u64 {
    atomic_load(u)
}
pub unsafe fn atomic_load<T>(t: *mut T) -> T {
    if intrinsics::type_name::<u32>() as *const _
            == intrinsics::type_name::<T>() as *const _{
        intrinsics::atomic_load(t)
    } else {
        global_lock();
        let result = ptr::read(t);
        global_unlock();
        result
    }
}
extern {
    fn global_lock();
    fn global_unlock();
}

@mahkoh
Copy link
Contributor Author

mahkoh commented Jan 25, 2016

pub unsafe fn atomic_load<T>(t: *mut T) -> T {
    if intrinsics::type_name::<u32>() as *const _
            == intrinsics::type_name::<T>() as *const _{
        ptr::read(t)

That's just undefined behavior.

if intrinsics::type_name::<u32>() as *const _
            == intrinsics::type_name::<T>() as *const _{

That doesn't help since the very point is that it should work for most types <= ptr size.

Allowing Atomic for arbitrary values of T (e.g. including T=Vec) is not something we are likely to implement at the compiler level.

Another case where rustc is worse than gcc/clang.

@arielb1
Copy link
Contributor

arielb1 commented Jan 25, 2016

That's just undefined behavior.

That was just me being lazy to illustrate the point. Fixed to use atomic_load. Actually trying it, I see that rustc happily emits bad IR. I consider that a bug - we should rather emit an unreachable.

Another case where rustc is worse than gcc/clang.

I agree that fully general untyped templates give C++ a big advantage in some places. We think the disadvantages of it are greater.

@mahkoh
Copy link
Contributor Author

mahkoh commented Jan 25, 2016

I agree that fully general untyped templates give C++ a big advantage in some places. We think the disadvantages of it are greater.

Nobody is talking about C++. Unless you think that C also has untyped templates, your comment doesn't make much sense.

@mahkoh
Copy link
Contributor Author

mahkoh commented Jan 25, 2016

Superseded by #1477

@mahkoh mahkoh closed this Jan 25, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-compiler Relevant to the compiler team, which will review and decide on the RFC. T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants