Roadmap #98
Comments
Hello, is there a plan to release a 1.0 version once this roadmap is complete?
There are no concrete plans to release a 1.0 version. Currently we're working on version 0.7 (our latest alpha release is 0.7.0-alpha.3), and I expect that a lot of the work on this roadmap will go into that release (although not necessarily all of it).
Does the 0.7 version of
Yes, and so does 0.6.x. The only speedbump is the
This describes me well. I considered multiple times using zerocopy and extending zerocopy. Pulling it out of Fuchsia and making it accessible here on GitHub was a big step towards that. But then the formation of the "safe transmute" working group in the rust-lang organization made it seem like it would be just better to wait until the most important functionality is in libstd and guaranteed by the language/lib spec more explicitly. @joshlf made a PR for ring that shows how zerocopy benefits ring by reducing its unsafe code that's duplicative of zerocopy. And zerocopy has a higher bar for documenting and validating its correctness than my more informal approach in my own code. OTOH zerocopy is so large that I pretty much just have to take it on faith that y'all care so much about getting all the details right that I don't need to bother looking through it, because I can't (don't have time). Ultimately, I think people just would rather have as much of the functionality w.r.t. transmuting and casting in the standard library as possible. I think if this is seen as a prototype that helps that happen, more people will be eager to adopt it.
Just wanted to give you a heads up that we've added documentation that addresses a lot of this; if you get a chance to take a look, let me know if you have any feedback. ring is exactly the kind of consumer we're hoping to target with our policies - we hope to get to a point where crates like ring feel comfortable taking a dependency on zerocopy - so your feedback here has been really helpful, and it helped shape what we wrote in these PRs. See #405, #485, and #484. These are available on
We're not using this issue as a tracking issue anymore.
Overview
This issue describes zerocopy's high-level roadmap both in terms of goals and in terms of concrete steps to achieve those goals.
A slogan often associated with Rust is "Fast, Reliable, Productive. Pick Three." Zerocopy's mission is to make that slogan true by making it so that 100% safe Rust code is just as fast and ergonomic as `unsafe` Rust code.
In order to live up to that mission, we need to do the following things:
unsafe
Motivation
A user story
Imagine you are a systems programmer. Any sort of systems software will do, but we need a specific example, so let's say you're writing a networking stack. You care about your software's performance, you care about your software's correctness, and you care about your team's productivity. In order to achieve maximum performance, you want your code to do as few things as possible, and that means avoiding any situation where your data must be converted between representations in the course of processing it. For example, if you are parsing a network packet, you want to operate on the packet in-place: so-called "zero-copy" parsing (hey, that's the name of the crate!).
Your first impulse might be to use `unsafe` code. Perhaps you write a parsing routine like:
One of your goals is performance, and this code is fast! But you also care about your code's correctness, and you know that `unsafe` is notoriously difficult to get right (in fact, this implementation is unsound in two ways - can you spot them?). So you decide to be more careful. You spend the day poring over the Rustonomicon and the language reference. You find and fix some bugs in your code, and you even write a pseudo-proof of correctness in a "SAFETY" comment so that others can check your work.
One of your goals is correctness, and this code is much more likely to be correct than the previous version! But you also care about your productivity, and you just spent an entire day writing a few lines of code. And what happens when you need to change the code? How much work will it take to convince yourself that a change is still correct? What if other, less experienced developers want to work on this section of code? Will they feel comfortable following your logic and feel confident in their ability to make changes without introducing bugs? So you decide to commit to never using `unsafe`. You modify your code to get rid of it and make whatever changes you need to get it to compile:
One of your goals is productivity, and this code is easy to verify, so it was fast to write and will be fast to change in the future! But you also care about performance, and you're doing a lot more bounds checking and copying than you were before. Maybe the optimizer will improve things for you, but there's no way to be sure without benchmarking it, and even if the optimizer is smart enough this time, you might get unlucky with a future change that makes the code just confusing enough to stump the optimizer, leading to unexpected performance cliffs.
You think back on all of these attempts. You wanted fast code, so you used `unsafe`, but that made you worried about correctness. You also wanted correct code, so you spent a long time reasoning about your code's correctness and you wrote down that reasoning so others could check your work, but that took an entire day and resulted in code that would be slow to change in the future. You wanted to be productive, so you got rid of all of the `unsafe`, but that made your code slow again. It seems like you just can't win!
Moral
The moral of this story is that, when it comes to operations that touch memory directly, the Rust language and standard library are not on their own sufficient to achieve "Fast, Reliable, Productive. Pick Three." While the basic ingredients are all there, putting them together unavoidably requires sacrifices along one of the dimensions of speed, reliability, and productivity. Zerocopy aims to fill this gap. In the Design section, we outline the current state of zerocopy, identify the gaps between zerocopy's current state and its aspirational future, and outline the steps required to reach that future.
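To make the trade-off in the user story concrete, here is a minimal sketch using only the standard library. It is not the code from the story: the `UdpHeader` layout and the function names `parse_in_place` and `parse_by_copy` are assumptions made for illustration. The first function stands in for the carefully reviewed `unsafe` version with its "SAFETY" argument, the second for the `unsafe`-free version that pays for bounds checks and a copy.

```rust
use std::mem::size_of;

/// Hypothetical 8-byte UDP header. Every field is a byte array, so the
/// struct has alignment 1, no padding, and no invalid bit patterns.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(C)]
pub struct UdpHeader {
    pub src_port: [u8; 2],
    pub dst_port: [u8; 2],
    pub length: [u8; 2],
    pub checksum: [u8; 2],
}

/// Zero-copy and fast, but correctness rests on a hand-written argument.
pub fn parse_in_place(bytes: &[u8]) -> Option<&UdpHeader> {
    if bytes.len() < size_of::<UdpHeader>() {
        return None;
    }
    // SAFETY:
    // - `bytes` holds at least `size_of::<UdpHeader>()` initialized bytes.
    // - `UdpHeader` is `repr(C)` with only `[u8; 2]` fields, so its
    //   alignment is 1 and any bit pattern is a valid value.
    // - The returned reference borrows from `bytes`, so it cannot outlive it.
    Some(unsafe { &*(bytes.as_ptr() as *const UdpHeader) })
}

/// No `unsafe`, but the header is bounds-checked and copied field by field.
pub fn parse_by_copy(bytes: &[u8]) -> Option<UdpHeader> {
    let b = bytes.get(..size_of::<UdpHeader>())?;
    Some(UdpHeader {
        src_port: [b[0], b[1]],
        dst_port: [b[2], b[3]],
        length: [b[4], b[5]],
        checksum: [b[6], b[7]],
    })
}
```

Both functions agree on every input; what differs is who carries the burden: the reviewer of the `unsafe` block in the first case, the optimizer in the second.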
Design
As mentioned above, zerocopy's mission is to make good on the slogan "Fast, Reliable, Productive. Pick Three." by making it so that 100% safe Rust code is just as fast and ergonomic as `unsafe` Rust code. Using zerocopy, you could write the parsing code from the previous section like this:
This is already a huge step above what you can do with just the standard library, and illustrates what it's like to have an API that takes care of all of this for you.
Thanks to ergonomics and safety like this, the building blocks that zerocopy provides are already being used in a diverse array of domains. Networking is zerocopy's origin and its bread and butter, but it is also used in embedded security firmware, in software emulation, in hypervisors, in filesystems, in high-frequency trading, and much more. However, it still has a ways to go before it can replace most of the unsafe code in the Rust ecosystem.
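As a rough illustration of how a library can take care of the `unsafe` on the caller's behalf, the sketch below uses only the standard library. `AnyBytesValid`, `ref_from`, and `UdpHeader` are hypothetical names chosen for this example, not zerocopy's API. The idea is that the size and alignment checks and the single `unsafe` block live in one reviewed place, while call sites stay entirely safe.

```rust
use std::mem::{align_of, size_of};

/// Hypothetical stand-in for a marker trait such as zerocopy's `FromBytes`.
///
/// # Safety
/// Implementors promise that every sequence of `size_of::<Self>()`
/// initialized bytes is a valid `Self`, and that `Self` contains no
/// interior mutability.
pub unsafe trait AnyBytesValid: Sized {
    /// Safe, zero-copy view of `bytes` as `Self`. The checks live here,
    /// once, instead of at every call site.
    fn ref_from(bytes: &[u8]) -> Option<&Self> {
        let aligned = (bytes.as_ptr() as usize) % align_of::<Self>() == 0;
        if bytes.len() != size_of::<Self>() || !aligned {
            return None;
        }
        // SAFETY: length and alignment were just checked, and the trait's
        // contract guarantees that these bytes form a valid `Self`.
        Some(unsafe { &*(bytes.as_ptr() as *const Self) })
    }
}

/// Hypothetical header type made only of byte arrays.
#[repr(C)]
pub struct UdpHeader {
    pub src_port: [u8; 2],
    pub dst_port: [u8; 2],
    pub length: [u8; 2],
    pub checksum: [u8; 2],
}

// SAFETY: `UdpHeader` is `repr(C)`, contains only `[u8; 2]` fields, and
// therefore has no padding, no invalid bit patterns, and no `UnsafeCell`.
unsafe impl AnyBytesValid for UdpHeader {}

/// Call sites contain no `unsafe` at all.
pub fn parse(packet: &[u8]) -> Option<&UdpHeader> {
    UdpHeader::ref_from(packet.get(..size_of::<UdpHeader>())?)
}
```

In zerocopy itself, the counterpart of the hand-written `unsafe impl` is produced by a derive that checks the type's layout, so users do not write that line either.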
Gaps
User model
In order to identify gaps, it's helpful to say a bit about who we hope to reach with zerocopy.
Not looking to use unsafe code
A lot of use of unsafe code is by programmers who conceive of themselves primarily as trying to solve some practical problem. If they think about it at all, they think about unsafe code as a tool, not as an object of contemplation. They may have a vague sense of what the phrase "memory safe" means, and they may even know that pointers need to be aligned. They likely don't know that, in order to be able to convert a type to a byte slice, the type must not contain any uninitialized bytes, and they almost certainly have never heard of pointer provenance.
Often, these users don't know a priori that unsafe code is a tool they should consider. Instead, in trying to solve a particular problem, they may come across a crate or a Google search result which points them towards unsafe, or at least points them towards a crate which makes use of unsafe.
In order to reach users in this camp, we must:
`AsBytes` trait should speak primarily about viewing a type as bytes; details about uninitialized bytes should be saved for the "Safety" section of the doc comment.
Security-conscious
On the other end of the spectrum, many of our users come from domains which generally have a high bar for correctness - kernels, hypervisors, cryptography, security hardware, etc. These users are extremely wary of taking external dependencies, and only take dependencies when they absolutely need to or when they have a high degree of trust in an external software artifact.
In order to reach users in this camp, we must:
Care about the open-source ecosystem
Many potential users are the authors of crates which are published on crates.io. These users have concerns which are specific to publishing software in an open-source ecosystem. For example:
In order to reach users in this camp, we must have good open-source hygiene. We must:
Memory model instability and zerocopy's future-soundness guarantee
Rust doesn't have a well-defined memory model. As a result, it's possible that code which is sound under today's compiler may become unsound at some point in the future. If zerocopy wants to be a trustworthy replacement for `unsafe` code, and ask its users not to worry about soundness, it needs to promise not only soundness, but soundness under any future compiler behavior and under any future memory model.
This work is tracked in #61.
Feature-completeness
Building-block API
Currently, we have a lot of support for combinations of operations. For example, if you want to convert a `&mut [u8]` to a `&mut [T]`, and you want to check at runtime that your byte slice has the right size and alignment, you would do `Ref::new_slice(bytes)?.into_mut_slice()`. If you wanted to do the same, but first zero the bytes of the `&mut [u8]`, you'd use the `new_slice_zeroed` constructor. Even though most of the logic is the same, there's an entirely different constructor.
This has a few downsides:
- `&[u8; size_of::<T>()]` to `&T` where `T: FromBytes + Unaligned` can in principle be an infallible operation. However, since all of our APIs take the more general `&[u8]` type, we have no choice but to perform a bounds check, and thus to return an `Option<&T>` instead of just `&T`. This forces the user to `.unwrap()` or similar, and provides fewer guarantees about codegen.
- `very_long_name_that_describes_exactly_what_they_want`, and there are a ton to choose from.
To address these issues, we want to move towards a world in which there are small "building blocks" which can be combined to perform larger operations. Convenience methods for common combinations will probably still be supported, but we may remove some of the less-frequently used bits of the API so long as users can still express the same behavior using the new building blocks. So far, we intend to build:
- `ByteArray<T>` - a polyfill for `[u8; size_of::<T>()]` until the latter type is stable in a generic context
- `Align<T, A>` - a `T` whose alignment is rounded up to that of `A`
- `ByteArray`, `Unalign`, and `Align` types to elide length and alignment checks. A few examples:
  - `fn unaligned_ref_from_bytes(bytes: &ByteArray<T>) -> &Unalign<T> where T: FromBytes + Sized`
  - `fn mut_from_bytes(bytes: &mut ByteArray<T>) -> Option<&mut T> where T: FromBytes + AsBytes + Sized`
  - `fn as_byte_array(&self) -> &ByteArray<Self> where Self: AsBytes + Sized`
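The effect of encoding length and alignment in the types can be sketched with the standard library alone. Because `[u8; size_of::<T>()]` is not usable in generic code on stable Rust, the sketch fixes `T = u32`; `UnalignU32` and both function names are hypothetical simplifications of the building blocks listed above, not zerocopy's actual types.

```rust
/// Hypothetical, simplified `Unalign<u32>`: same bytes as a `u32`,
/// but with alignment 1.
#[derive(Clone, Copy)]
#[repr(C, packed)]
pub struct UnalignU32(u32);

impl UnalignU32 {
    /// Copies the value out; no reference to the unaligned field is created.
    pub fn get(&self) -> u32 {
        self.0
    }
}

/// Infallible: the array type proves the length, and `UnalignU32` has no
/// alignment requirement, so there is nothing left to check at runtime.
pub fn unaligned_ref_from_bytes(bytes: &[u8; 4]) -> &UnalignU32 {
    // SAFETY: `UnalignU32` is `repr(C, packed)` around a `u32`, so it has
    // size 4 and alignment 1, and every 4-byte pattern is a valid `u32`.
    unsafe { &*(bytes as *const [u8; 4] as *const UnalignU32) }
}

/// Slice-based counterpart: the length is only known at runtime, so the
/// signature is forced to return an `Option`.
pub fn unaligned_ref_from_slice(bytes: &[u8]) -> Option<&UnalignU32> {
    if bytes.len() != 4 {
        return None;
    }
    // SAFETY: as above, and the length was just checked.
    Some(unsafe { &*(bytes.as_ptr() as *const UnalignU32) })
}
```

The array-based function has no failure path for the caller to `.unwrap()`, which is the ergonomic and codegen benefit the list above is after.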
Another added benefit of these building blocks is that it will make it easier to reason about the soundness of our implementations. Since many of our functions/methods encode complex behavior (exactly what we're talking about in this section), safety arguments are similarly complex. If we were instead able to decompose these into smaller (still unsafe) operations, we could make it easier to reason about the safety of the resulting implementations.
For example, currently, the implementation of `Ref::into_ref` looks like this:
Current impl
I'm sure that this is sound, but I've always been a bit nervous about how complex the argument is. By contrast, we can simplify this using the building blocks we intend to introduce. In 2c67380 (this commit hasn't been merged, and may be deleted at some point), we change the above code to:
New impl
I find this implementation much easier to reason about. The safety invariants on `ByteArray::from_slice_unchecked` and `FromBytes::ref_from_bytes_unchecked` are straightforward, and it is much more obvious from reading those functions that the lifetimes are propagated correctly. (Note that this commit also adds a requirement to `ByteSlice` about what an `Into<&'a [u8]>` impl is required to return.)
Simplify `ByteSlice`'s definition and make it un-sealed
Currently, `ByteSlice` has both a `Deref<Target=[u8]>` bound and an `as_ptr(&self) -> *const u8` method. The latter is probably redundant given the former, and adds another method that we have to document safety invariants for. `ByteSlice`'s safety invariants are somewhat subtle, so getting rid of `as_ptr` would be very nice.
It would also make it easier for others to implement `ByteSlice` for their own types. We've had users request this, but it's currently impossible because `ByteSlice` is sealed. While we are confident that our existing impls of `ByteSlice` and `ByteSliceMut` are sound for our use cases, we would need to formalize the safety requirements for any types to implement these traits before we make them un-sealed. This is probably a good idea anyway because it may surface ways that we can simplify the API.
Split `ByteSlice` so that `split_at` is in a different trait (#1)
Currently, `ByteSlice` has a `split_at(self, mid: usize) -> (Self, Self)` method analogous to the slice method of the same name. Our performance design requires this method to be very cheap, which precludes implementing `ByteSlice` for types like `Vec`, for which `split_at` would require allocation.
Instead, #1 tracks splitting `ByteSlice` into two traits so a type such as `Vec` can implement the base `ByteSlice` trait without needing to implement `split_at`. Most of the zerocopy API can operate on this simpler trait, while a few functions and methods would still require the ability to call `split_at`.
Elide length or alignment checks when they can be verified statically
Tracked in #280.
Support types which are not `FromBytes`, but which can be converted from a sequence of zeroes
Tracked in #30.
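A minimal sketch of the distinction, using only the standard library: some types reject most byte patterns yet still accept the all-zeroes pattern. `ZeroValid`, `new_zeroed`, and `Mode` are hypothetical names for this example and are not the design tracked in #30.

```rust
/// Hypothetical stand-in for a trait like `FromZeroes`.
///
/// # Safety
/// Implementors promise that the all-zeroes byte pattern is a valid `Self`.
pub unsafe trait ZeroValid: Sized {
    fn new_zeroed() -> Self {
        // SAFETY: guaranteed by the trait's contract.
        unsafe { std::mem::zeroed() }
    }
}

// `bool` only accepts the bytes 0x00 and 0x01, so it cannot be built from
// arbitrary bytes, but the zero byte is the valid value `false`.
// SAFETY: 0x00 is a valid `bool`.
unsafe impl ZeroValid for bool {}

/// Hypothetical enum with a variant whose discriminant is zero.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum Mode {
    Off = 0,
    On = 1,
    Auto = 2,
}

// SAFETY: `Mode` is `repr(u8)` and declares a variant with discriminant 0,
// so the zero byte is the valid value `Mode::Off`.
unsafe impl ZeroValid for Mode {}
```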
Support fallible conversions
Tracked in #5; in progress.
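To illustrate what a fallible conversion involves, the sketch below validates a byte before reinterpreting it in place. `Protocol` and `try_protocol_ref` are hypothetical and are not the API being developed in #5; the point is only that the validity check must happen before the cast, and that the result stays zero-copy.

```rust
/// Hypothetical one-byte enum: only three of the 256 byte values are valid.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u8)]
pub enum Protocol {
    Icmp = 1,
    Tcp = 6,
    Udp = 17,
}

/// Views a byte as a `Protocol` in place, or returns `None` if the byte is
/// not one of the declared discriminants.
pub fn try_protocol_ref(byte: &u8) -> Option<&Protocol> {
    match *byte {
        1 | 6 | 17 => {
            // SAFETY: `Protocol` is `repr(u8)`, so it has size 1 and
            // alignment 1, and the byte was just checked to be a declared
            // discriminant. The result borrows from `byte`.
            Some(unsafe { &*(byte as *const u8 as *const Protocol) })
        }
        _ => None,
    }
}
```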
Support conversions in `const fn`
Tracked in #115.
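One reason this needs dedicated work is that trait methods generally cannot be called from `const fn` on stable Rust, so trait-based conversions are not available when computing constants. The sketch below shows the goal using an inherent standard-library method that is already `const`; `u32_from_be` and `MAGIC` are hypothetical names, and nothing here reflects the approach taken in #115.

```rust
/// Converts four big-endian bytes to a `u32` at compile time. This works
/// because `u32::from_be_bytes` is an inherent `const fn`, not a trait method.
pub const fn u32_from_be(bytes: [u8; 4]) -> u32 {
    u32::from_be_bytes(bytes)
}

/// Hypothetical magic number, evaluated entirely by the compiler.
pub const MAGIC: u32 = u32_from_be(*b"ZCPY");
```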
Support converting `&[[u8; size_of::<T>()]]` to `&[T]`
What it says on the tin.
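For a concrete type with alignment 1, the conversion can be sketched with the standard library as follows. `U16Be` and `slice_from_chunks` are hypothetical names; a general version would additionally need `T` to be valid for any byte pattern and would have to account for `T`'s alignment.

```rust
/// Hypothetical big-endian `u16` stored as two bytes, with alignment 1.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(transparent)]
pub struct U16Be([u8; 2]);

impl U16Be {
    pub fn get(self) -> u16 {
        u16::from_be_bytes(self.0)
    }
}

/// Reinterprets a slice of 2-byte chunks as a slice of `U16Be` without
/// copying. The element count is unchanged, so no length check is needed.
pub fn slice_from_chunks(chunks: &[[u8; 2]]) -> &[U16Be] {
    // SAFETY: `U16Be` is `repr(transparent)` over `[u8; 2]`, so both element
    // types have the same size, alignment, and validity. The pointer and
    // length come from a valid slice, and the result borrows from `chunks`.
    unsafe { std::slice::from_raw_parts(chunks.as_ptr() as *const U16Be, chunks.len()) }
}
```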
Rename `LayoutVerified` to `Ref` (#68)
What it says on the tin. `LayoutVerified` is descriptive if you understand type theory and the concept of a "witness" (although we probably should have put "witness" in the name...), but it's a meaningless term for most users. We should rename it to `Ref` or similar - after all, it's just a reference with a few niceties.
Miscellaneous features
- `TryFromBytes` - conditional conversion analogous to `FromBytes` #5
- `AsBytes` on a `#[repr(transparent)]` type #9
- `KnownLayout` trait and custom DSTs #29
- `FromZeroes` on enums #30
- `FromBytes for MaybeUninit<T>`? #117
- `AsBytes` on unsized types #121
- `AsBytes` for `MaybeUninit<T>` where `T` is a ZST? #123
- `AsBytes` on generic `repr(packed)` structs #127
- `FromBytes`/`IntoBytes` methods which read from/write to an `io::Read`/`io::Write` #158
- `transmute_ref!` and `transmute_mut!` macros #159
- `FromBytes` and `AsBytes` for raw pointers #170
- `#[repr(transparent)]` wrapper type #196
- `Unalign` to support arbitrary alignment #205
- `Unalign` in-place #206
- `Debug` (and maybe other traits that take `&(mut) self`) for `Unalign<T>` where `T: !Unaligned` #207
- `Unalign` for unsized types? #209
- `Unalign<Cell<T>>` and `Cell<Unalign<T>>` #211
- `Align` type #249
- `UnsafeCell`" property into separate `Immutable` trait; allow `FromZeros`, `FromBytes`, and `AsBytes` on types with `UnsafeCell`s #251
API polish
- `zerocopy::byteorder::NE` type alias for `zerocopy::byteorder::NativeEndian` #100
- `Unaligned` without custom derive #110
- `#[must_use]` annotation to some types, functions, and macros #188
- `write_to_prefix` (and similar ones) take ownership of the buffer #195
- `Deref`, change some methods to associated functions? #210
- `as_bytes_mut` to `as_mut_bytes` #253
Documentation is complete, thorough, and up-to-date (#32)
- `cargo readme` output matches `README.md` #18
- `cargo doc` in CI #33
- `LayoutVerified` to `Ref` #68
- `AsBytes` derive docs #132
- `FromZeroes` #146
- `Unalign` ABI promises are wrong #164
- `Unalign::update` docs should suggest `DerefMut` for unaligned types #262
High confidence in correctness and soundness
- `FromBytes::new_box_slice_zeroed` #64
- `write_to_prefix` generates panic path #200
- `Unalign::update` cause issues for types that are aware of their memory location? Is this a soundness hole? #266
Tested and stable on all platforms
- `cargo miri test` on wasm and riscv target once they're supported #22
- `test_as_bytes_methods` fails on powerpc #23
Usable in Cargo and crates.io ecosystem
- `cargo package` or `cargo publish --dry-run` in CI? #105
- `cargo package` contents in CI #192
- `zerocopy` 0.6.2 is not semver compatible with `0.6.1`: use zerocopy::U32; | ^^^^^^^^^^^^^ no `U32` in the root #228
Compile-time performance
Known bugs are fixed
- `test_new_error` fails on i686 #21
- `test_as_bytes_methods` fails on powerpc #23
- `where_clauses_object_safety` future compatibility lint warning #150
Code quality
- `[u8]::fill` method #38
Developer experience
- `test_validate_cast_and_convert_metadata` #397