Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Stabilize volatile copy and set functions #2728

Open
wants to merge 1 commit into
base: master
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
232 changes: 232 additions & 0 deletions text/0000-volatile-copy-and-set.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,232 @@
- Feature Name: volatile-copy-and-set
- Start Date: 2019-04-17
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000)
- Rust Issue: [rust-lang/rust#00000](https://github.com/rust-lang/rust/issues/00000)

# Summary
[summary]: #summary

Stabilize the `volatile_copy_memory`, `volatile_copy_nonoverlapping_memory`
and `volatile_set_memory` intrinsics as `ptr::copy_volatile`,
`ptr::copy_nonoverlapping_volatile` and `ptr::write_bytes_volatile`,
respectively.

# Motivation
[motivation]: #motivation

`ptr::read_volatile` and `ptr::write_volatile` were stabilized in RFC
[1467](https://github.com/rust-lang/rfcs/pull/1467). The stated motivation
at the time was that this allowed "volatile access to memory-mapped I/O
in stable code", something that was only previously possible using unstable
intrinsics or "by abusing a bug in the `load` and `store` functions on
atomic types which gives them volatile semantics
([rust-lang/rust#30962](https://github.com/rust-lang/rust/pull/30962))."

At the time, the decision was made not to also provide stable
interfaces for the `volatile_copy_memory` or `volatile_set_memory`
intrinsics, as they were "not used often" nor provided in C.
However, when writing low-level code, it is sometimes also useful
to be able to execute volatile copy and set operations.
Copy link
Member

@RalfJung RalfJung Jul 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should state, early in the motivation, that volatile copy and set operations are already possible in Rust, but have to be open-coded. The way this is written now, it sounds as if currently there was no way to do volatile copy or set in Rust.

This RFC, if I am understanding correctly, is entirely a "convenience"-RFC: it's about adding APIs to libcore that could already be implemented in a user library. So this should be clear from the start. Then the motivation should argue why this is an API that should move to libcore.

For such APIs it helps tremendously if there is an (ideally already tested and used) implementation in a user crate that it can point to. Even better if you can point at instances where people tried to implement this but got it wrong.


For example, when booting x86_64 "application processor" (AP) logical
processors, code copies a sequence of instructions that for the AP to
execute into a page in low physical memory, and then sends a startup
inter-processor interrupt (SIPI) to the AP's local interrupt
controller: the target interrupt vector number given in the SIPI is
multiplied by the page size to determine the physical memory address
where the AP should start executing. So a SIPI sent to vector 7 of
an AP causes that processor to begin executing instructions at
physical memory address 0x7000.
Centril marked this conversation as resolved.
Show resolved Hide resolved

That is:

```
extern "C" {
fn copy_proto_page_to_phys_mem(src: usize, phys: u64);
fn send_init_ipi(cpu: u32);
fn send_sipi(cpu: u32, vector: u8);

static INIT_CODE: *const u8;
static INIT_CODE_LEN: usize;
}

// A contrived type for illustration; not actually useful.
pub struct SIPIPage {
// Note that `bytes` is not visible outside of `SIPIPage`.
bytes: [u8; 4096],
}

impl SIPIPage {
// Note that the _only_ operation on the `bytes` field
// of `SIPIPage` is in `new`. The compiler could, in
// theory, elide the `copy`.
Copy link
Contributor

@gnzlbg gnzlbg Sep 24, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it cannot: unsafe code below calls copy_proto_page_to_phys_mem which reads from this field via a pointer to SIPIPage.

EDIT: @comex expands on this below: https://github.com/rust-lang/rfcs/pull/2728/files#r307566402

pub fn new() -> SIPIPage {
let mut bytes = [0; 4096];
unsafe {
core::ptr::copy(INIT_CODE, bytes.as_mut_ptr(), INIT_CODE_LEN);
}
SIPIPage { bytes }
}
}

fn main() {
let proto_sipi_page = SIPIPage::new();
let some_core = 2;
unsafe {
copy_proto_page_to_phys_mem(&proto_sipi_page as *const _ as usize, 0x7000);
send_init_ipi(some_core);
send_sipi(some_core, 7);
}
}
```

Obviously this is an unlikely way of initializing the SIPI page and
a real kernel would not do it this way.

Hoever, this code snippet is specifically constructed such that the
sequence of sending IPIs makes no reference to `proto_sipi_page` and
since the `bytes` field is not visible outside of `new`, this
illustrates a situation in which the compiler _could_ theoretically
elect to elide the copy.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like the problem isn't with the memcpy, but with sending the IPI. The IPI needs to be a release operation (from the hardware's view and the compiler's). I'm not familiar with x86 and you haven't given the implementation of send_sipi so I don't know if it is already. If it's not, you might just need some sort of barrier between the mempcy and sending the IPI.

Copy link

@comex comex Jul 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, the motivating example is flawed.

In this case, copy_proto_page_to_phys_mem is apparently an extern function that's called with a pointer to proto_sipi_page, so the compiler has no choice but to ensure the contents of proto_sipi_page were actually written before the call. The fact that the pointer is first casted to usize doesn't change that – since it's first casted to a raw pointer, Stacked Borrows would still allow copy_proto_page_to_phys_mem to access it. So the example as written is 100% well-defined even without a volatile copy.

On the other hand, if this were done in a manner incompatible with Stacked Borrows, such as (I think?) by transmuting the reference to usize, then making the copy volatile would not help.

For one thing, the example code does not copy directly from INIT_CODE to proto_sipi_page. It copies to bytes, a local variable in the new function; that variable is then moved into a SIPIPage, which is a memcpy, and then moved to the caller, which is another memcpy. If you switched to a volatile copy, you'd be ensuring the data was actually written to some stack allocation representing bytes... but not that bytes was actually copied to proto_sipi_page!

This distinction not just theoretical but, unfortunately, real. If you compile the example in debug mode, the assembly emitted for SIPIPage::new contains a call to core::intrinsics::copy followed by two compiler-inserted calls to memcpy. Ideally LLVM's optimizer would be able to elide both of those memcpys, but if you compile in release mode, it actually only elides one of them. There is still an intermediate copy between bytes and proto_sipi_page.

You could fix the example in practice by directly copying from INIT_CODE to proto_sipi_page. But that wouldn't fix the UB in theory (that would result if you violated Stacked Borrows; as I said, the example as written is fine). If the optimizer for some reason thought that proto_sipi_page was unused, it would be free to, e.g., reuse the stack slot for other data, even if you had previously made volatile writes into it.

More generally, I'm not sure there's any way to redeem the example to actually require volatile memcpy. If the Rust code mapped the physical memory and copied the data directly to that mapping, then you'd finally be at a point where release operations matter (as mentioned by @parched). But those are easily obtained. If the implementation send_sipi is an extern function, calling it is a compiler release operation). If it's based on asm!, you would have to mark the assembly sequence with the memory clobber (to indicate that, in the words of the GCC manual, it "performs memory reads or writes to items other than those listed in the input and output operands"), which is also a compiler release operation. The one catch is that you'd have to make sure you didn't hold a mutable borrow of the data across the call, since then it would be UB for the function or asm block to read from it. However, using volatile to write the data in the first place wouldn't fix that UB. At best it would make it slightly harder for the compiler to exploit.

I'm happy to see volatile memcpy stabilized, but it seems more useful for something like IPC.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is why I dislike contrived code examples in places like this. :-)

What do you mean about utility for IPC, precisely? As in shared-memory message passing or something like that?

Copy link

@comex comex Jul 26, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Caveat: In this recent thread, there was no real consensus on whether or not using volatile for shared memory in Rust is recommended - or even legal. But it is what C++ people recommend. If volatile is suited for shared memory, then volatile_copy would be an ideal way to safely copy large buffers into or out of shared memory.


If this sequenced used `core::ptr::copy_volatile` then the compiler
would know that the copy had some externally visible side-effect
and could not be elided.

When writing a multi-processor operating system kernel for x86_64 in
Rust, the programmer would copy the instruction text to some address
and write to the local programmable interrupt controller to send a
SIPI to start AP cores, but from the compiler's perspective, it might
appear that the memory holding the AP startup code is never referred
to again. The compiler could potentially choose to elide the copy
entirely, and the AP might start executing junk instructions from
uninitialized memory. In the worst case, this may silently corrupt
kernel state.

Using a volatile copy can inform the compiler that there is an
externally observable side-effect forcing it to preserve the copy.
Similarly, volatile "write_bytes" allows a program to preserve a
write that has some side-effect (for example, initializing register
state in a device, or clearing a frame buffer).

# Guide-level explanation
[guide-level-explanation]: #guide-level-explanation

Given these operations, one would write, for example, the following:

```
#[no_mangle]
pub unsafe extern "C" fn maybe_called_via_ffi(ptr: *mut u8; len: usize) {
println!("this function has a side-effect, and it is not just the println!");
core::ptr::write_bytes_volatile(ptr, SOME_DATA, SOME_DATA_LEN);
}
```

and assert that the `write_bytes_volatile` call is not be elided.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

`ptr::copy_volatile`, `ptr::copy_nonoverlapping_volatile` and
`ptr::write_bytes_volatile` will work the same way as `ptr::copy`,
`ptr_copy_nonoverlapping` and `ptr::write_bytes` respectively, but
with volatile semantics. As stated in RFC 1467, "the semantics of
a volatile access are already pretty well defined by the C standard.

We further propose enhancing the documentation for these functions
to the same level of the existing volatile functions.

Documentation presently refers to LLVM implementation details
to explain the memory model, etc, here:
http://llvm.org/docs/LangRef.html#volatile-memory-accesses.
We propose modifying existing documentation, and writing new
docuemntation, referring to the memory model in the C standard
instead.

# Drawbacks
[drawbacks]: #drawbacks

Volatile semantics are not well defined by the C standard, but
that is out of the scope of this proposal.

# Rationale and alternatives
[rationale-and-alternatives]: #rationale-and-alternatives

The intrinsics operations already exist and have the semantics
required by operating system implementors and others.

There are several alternatives, each with their own drawbacks:

1. Continue using the unstable `core_intrinsics` feature and use the
existing unstable intrinsics. However, this ties the programmer
to unstable Rust, which is undesirable in some environments.
2. Use the existing copy and set interfaces without volatile qualifiers
and hope that the compiler does not elide the relevant calls. While
likely workable in practice for most likely scenarios, this could
lead to surprising behavior if the compiler ever incorporates
sufficiently advanced analyses that allow it to determine that those
elisions are possible from its perspective. Hope is not a strategy.
3. Use the foreign function interface to call separately written code
in another language that provides the required semantics. This
is inelegant and complicates the build process.
4. Hand-code copy and set loops in terms of the existing `write_volatile`
function. This is inelegant, leads to duplicated code, and opens
up the possibility of bugs. For example, compare:

```
for (i, elem) in some_slice.iter().enumerate() {
unsafe {
core::ptr::write_volatile(&mut dest[i], *elem);
}
}
```
to,
```
unsafe {
core::ptr::copy_volatile(some_slice.as_ptr(), dest.as_mut_ptr(), some_slice.len());
}
```

Finally, it is important that this proposal not tie the Rust language
to the semantics of any given implementation, such as those defined by
LLVM. Futher Rust does not yet have a well-defined memory model we can
refer to for defining volatile behavior, and C does not define volatile
`memset`, `memcpy` or `memmove` functions. However, since the existing
`core::ptr::write_volatile` and `core::ptr::read_volatile` functions
are implemented in terms of well-defined semantics, it makes sense to
use similar semantics here. We therefore specify that
`copy_volatile`, `copy_nonoverlapping_volatile` and
`write_bytes_volatile` adopt semantics similar to those of `read_volatile`
and `write_volatile`: resulting loads and stores cannot be elided, and
the relative order to loads and stores cannot be reordered with respect
to one another, though other operations can be reordered with respect
to volatile operations.
Copy link
Member

@RalfJung RalfJung Jul 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not entirely sure what this long-winding paragraph is supposed to tell me. :) The open questions around volatile are not new, but can't we specify the new operations entirely in terms of the old? That would avoid even having to talk about volatile semantics in this RFC.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not entirely sure what you mean. Do you mean that we should define the semantics of these operations to match those of write_volatile and read_volatile? Or that we should mention the LLVM semantics? When I put the proto-RFC on internals, I was asked specifically to avoid tying it to LLVM.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean that we should define the semantics of these operations to match those of write_volatile and read_volatile?

That. Volatile accesses are not new to Rust, so this RFC shouldn't have to deal with the complexity of specifying them.


# Prior art
[prior-art]: #prior-art

Other languages support volatile style accesses, notably C and C++.
Interestingly, volatile semantics in those languages are associated with
individual objects, and `volatile` is a type qualifier, not an operaton
attribute. In those systems, any number of operations on a
volatile-qualified datum result in volatile memory semantics; since
any identifier used by the standard library is defined to be reserved
for special treatment by the compiler, this means that the standard
`memcpy`, `memmove` and `memset` operations can all be expected to exhibit
volatile semantics if applied to volatle-qualified objects.

# Unresolved questions
[unresolved]: #unresolved-questions

None.

# Future possibilities
[future-possibilities]: #future-possibilities

A some point, a well-defined memory model for Rust may be stabilized that
would widen the design space and permit revisiting these primitives. For
example, "volatile" currently means that a write cannot be elided, but it
also imposes strict ordering semantics with respect to other volatile
accesses. One can envision a sufficiently rich memory model that one
might be some way to specify an "unelidable" write, but without ordering
constraints.