Update book
djkoloski committed Sep 11, 2024
1 parent 3c22867 commit 6c2e1e7
Showing 20 changed files with 292 additions and 180 deletions.
23 changes: 5 additions & 18 deletions book_src/SUMMARY.md
Original file line number Diff line number Diff line change
@@ -11,30 +11,17 @@
- [Archive](./architecture/archive.md)
- [Serialize](./architecture/serialize.md)
- [Deserialize](./architecture/deserialize.md)
- [Alignment](./architecture/alignment.md)
- [Format](./format.md)
- [Wrapper types](./wrapper-types.md)
- [Alignment](./format/alignment.md)
- [Derive macro features](./derive-macro-features.md)
- [Wrapper types](./wrapper-types.md)
- [Remote derive](./derive-macro-features/remote-derive.md)
- [Shared pointers](./shared-pointers.md)
- [Unsized types](./unsized-types.md)
- [Trait objects](./trait-objects.md)
- [Validation](./validation.md)
- [Allocation tracking](./allocation-tracking.md)
- [Feature comparison](./feature-comparison.md)
- [FAQ](./faq.md)

# Advanced rkyv

- [Validation]()
- [Serializer composition]()
- [Alternative writers]()
- [Extending serialization]()
- [Hybrid deserialization]()
- [Effective wrapper types]()
- [Archived vs unarchived types]()
- [Impl duplication]()
- [Streaming serialization]()
- [Schema evolution]()
- [Nightly features]()
- [Copy optimization]()
- [Derive macro features]()

[Contributors](./contributors.md)
13 changes: 13 additions & 0 deletions book_src/allocation-tracking.md
@@ -0,0 +1,13 @@
# Allocation tracking

rkyv's provided `AllocationTracker` struct wraps an `Allocator` and tracks when memory is allocated
and freed during serialization. It can also calculate synthetic metrics, like the minimum amount of
pre-allocated memory required to serialize a value. And, it can report the maximum alignment of all
serialized types.

You can create a custom serializer with allocation tracking by calling `Serializer::new(..)` and
providing the pieces of your serializer. Normally, the provided allocator would be an `ArenaHandle`,
but instead you should provide it an `AllocationTracker::new(arena_handle)`.

After serializing your value, the serializer can be decomposed with `into_raw_parts`. You can then
retrieve the `AllocationStats` from the allocator by calling `into_stats`.
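
The bookkeeping described above can be sketched with the standard library alone. This is only an illustration of the idea: `TrackingStats`, `on_alloc`, `on_free`, and `replay_example` are hypothetical names, not rkyv's `AllocationTracker` or `AllocationStats` API, and the sketch records the maximum alignment of allocation requests as a simplification.

```rust
use std::alloc::Layout;

/// Hypothetical stand-in for the statistics an allocation tracker gathers.
#[derive(Debug, Default)]
pub struct TrackingStats {
    /// Bytes currently outstanding.
    current: usize,
    /// Highest number of bytes outstanding at any one time.
    pub max_bytes_allocated: usize,
    /// Total number of allocations observed.
    pub allocations: usize,
    /// Largest alignment requested by any allocation.
    pub max_alignment: usize,
}

impl TrackingStats {
    /// Records an allocation with the given layout.
    pub fn on_alloc(&mut self, layout: Layout) {
        self.current += layout.size();
        self.allocations += 1;
        self.max_bytes_allocated = self.max_bytes_allocated.max(self.current);
        self.max_alignment = self.max_alignment.max(layout.align());
    }

    /// Records that a previous allocation with this layout was freed.
    pub fn on_free(&mut self, layout: Layout) {
        self.current -= layout.size();
    }
}

/// Replays a small allocate/free sequence and returns the resulting stats.
pub fn replay_example() -> TrackingStats {
    let a = Layout::from_size_align(64, 8).unwrap();
    let b = Layout::from_size_align(32, 16).unwrap();
    let mut stats = TrackingStats::default();
    stats.on_alloc(a); // 64 bytes outstanding
    stats.on_alloc(b); // 96 bytes outstanding, the peak
    stats.on_free(b);
    stats.on_free(a);
    stats
}
```

The peak figure is the kind of synthetic metric that tells you how much memory would need to be pre-allocated for the same workload.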
14 changes: 4 additions & 10 deletions book_src/architecture.md
@@ -1,15 +1,9 @@
# Architecture

The core of rkyv is built around
[relative pointers](https://docs.rs/rkyv/0.7.1/rkyv/rel_ptr/struct.RelPtr.html) and three core
traits:
[`Archive`](https://docs.rs/rkyv/0.7.1/rkyv/trait.Archive.html),
[`Serialize`](https://docs.rs/rkyv/0.7.1/rkyv/trait.Serialize.html), and
[`Deserialize`](https://docs.rs/rkyv/0.7.1/rkyv/trait.Deserialize.html). Each of these traits has a
corresponding variant that supports unsized types:
[`ArchiveUnsized`](https://docs.rs/rkyv/0.7.1/rkyv/trait.ArchiveUnsized.html),
[`SerializeUnsized`](https://docs.rs/rkyv/0.7.1/rkyv/trait.SerializeUnsized.html), and
[`DeserializeUnsized`](https://docs.rs/rkyv/0.7.1/rkyv/trait.DeserializeUnsized.html).
The core of rkyv is built around relative pointers and three core
traits: `Archive`, `Serialize`, and `Deserialize`. Each of these traits has a
corresponding variant that supports unsized types: `ArchiveUnsized`,
`SerializeUnsized`, and `DeserializeUnsized`.

> A good way to think about it is that sized types are the *foundation* that unsized types are built
> on. That's not a fluke either, rkyv is built precisely so that you can build more complex
32 changes: 18 additions & 14 deletions book_src/architecture/deserialize.md
@@ -1,31 +1,35 @@
# Deserialize

Similarly to `Serialize`, [`Deserialize`](https://docs.rs/rkyv/0.7.1/rkyv/trait.Deserialize.html)
parameterizes over and takes a deserializer, and converts a type from its archived form back to its
original one. Unlike serialization, deserialization occurs in a single step and doesn't have an
equivalent of a resolver.
Similarly to `Serialize`, `Deserialize` parameterizes over a deserializer, and converts a type from
its archived form back to its original one. Unlike serialization, deserialization occurs in a single
step and doesn't have an equivalent of a resolver.

> `Deserialize` also parameterizes over the type that is being deserialized into. This allows the
> same archived type to deserialize into multiple different unarchived types depending on what's
> being asked for. This helps enable lots of very powerful abstractions, but might require you to
> annotate types when deserializing.
> use a turbofish or annotate types when deserializing.
This provides a more or less a traditional deserialization with the added benefit of being sped up
somewhat by having very compatible representations. It also incurs both the memory and performance
This provides a more or less traditional deserialization with the added benefit of being sped up
by having very compiler-friendly representations. It also incurs both the memory and performance
penalties of traditional deserialization, so make sure that it's what you need before you use it.
Deserialization is not required to access archived data as long as you can do so through the
archived versions.

> Even the highest-performance serialization frameworks will hit a deserialization speed limit
> because of the amount of memory allocation that needs to be performed.
A good use for `Deserialize` is deserializing portions of archives. You can easily traverse the
archived data to locate some subobject, then deserialize just that piece instead of the archive as a
whole. This granular approach provides the benefits of both zero-copy deserialization as well as
traditional deserialization.
A good use for `Deserialize` is deserializing small portions of archives. You can easily traverse
the archived data to locate some subobject, then deserialize just that piece instead of the archive
as a whole. This granular approach provides the benefits of both zero-copy deserialization as well
as traditional deserialization.

## Deserializer
## Pooling

Deserializers, like serializers, provide capabilities to objects during deserialization. Most types
don't bound their deserializers, but some like `Rc` require special deserializers in order to
deserialize memory properly.
don't need to bound their deserializers, but some like `Rc` require special traits in order to
deserialize properly.

The `Pooling` trait controls how pointers which were serialized as shared are deserialized. Much like
`Sharing`, `Pooling` holds some mutable state on the deserializer to allow shared pointers to the
same data to coordinate with each other. Using the `Pool` implementation pools these deserialized
shared pointers together, whereas `Unpool` clones them for each instance of the shared pointer.
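
As a rough, std-only sketch of the difference between the two strategies (assuming shared values are identified by their position in the archive): `PoolSketch`, `deserialize_pooled`, and `deserialize_unpooled` are hypothetical names for illustration and do not reflect rkyv's actual `Pool` or `Unpool` implementations.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical pool keyed by the archive position of the shared value.
#[derive(Default)]
pub struct PoolSketch {
    seen: HashMap<usize, Rc<String>>,
}

impl PoolSketch {
    /// Returns the pooled `Rc` for `position`, building it on first use.
    pub fn get_or_build(
        &mut self,
        position: usize,
        build: impl FnOnce() -> String,
    ) -> Rc<String> {
        Rc::clone(self.seen.entry(position).or_insert_with(|| Rc::new(build())))
    }
}

/// Pooled strategy: pointers to the same position share one allocation.
pub fn deserialize_pooled(positions: &[usize]) -> Vec<Rc<String>> {
    let mut pool = PoolSketch::default();
    positions
        .iter()
        .map(|&pos| pool.get_or_build(pos, || format!("value@{}", pos)))
        .collect()
}

/// Unpooled strategy: every pointer gets its own fresh copy of the data.
pub fn deserialize_unpooled(positions: &[usize]) -> Vec<Rc<String>> {
    positions
        .iter()
        .map(|&pos| Rc::new(format!("value@{}", pos)))
        .collect()
}
```

With the pooled strategy, two pointers that referred to the same archived data compare equal under `Rc::ptr_eq` after deserialization; with the unpooled strategy they hold equal but separate values.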
3 changes: 1 addition & 2 deletions book_src/architecture/relative-pointers.md
@@ -39,5 +39,4 @@ By using relative pointers, we can load data at any position in memory and still
inside of it. Relative pointers don't require write access to memory either, so we can memory map
entire files and instantly have access to their data in a structured manner.

rkyv's implementation of relative pointers is the
[`RelPtr`](https://docs.rs/rkyv/0.7.1/rkyv/rel_ptr/struct.RelPtr.html) type.
rkyv's implementation of relative pointers is the `RelPtr` type.
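
To make the position-independence concrete, here is a minimal std-only sketch. It assumes a 32-bit little-endian offset measured from the pointer's own position; `write_rel_ptr`, `resolve_rel_ptr`, and `build_archive` are hypothetical helpers and not the `RelPtr` API.

```rust
use std::convert::TryFrom;

/// Stores a signed offset at `ptr_pos`, measured from `ptr_pos` itself.
pub fn write_rel_ptr(buf: &mut [u8], ptr_pos: usize, target_pos: usize) {
    let offset = target_pos as i64 - ptr_pos as i64;
    let offset = i32::try_from(offset).expect("offset does not fit in 32 bits");
    buf[ptr_pos..ptr_pos + 4].copy_from_slice(&offset.to_le_bytes());
}

/// Resolves the relative pointer at `ptr_pos` to the position it points at.
pub fn resolve_rel_ptr(buf: &[u8], ptr_pos: usize) -> usize {
    let mut raw = [0u8; 4];
    raw.copy_from_slice(&buf[ptr_pos..ptr_pos + 4]);
    let offset = i32::from_le_bytes(raw);
    (ptr_pos as i64 + i64::from(offset)) as usize
}

/// Builds a tiny archive: a payload byte at position 0, followed by a
/// relative pointer to it at position 4.
pub fn build_archive() -> Vec<u8> {
    let mut buf = vec![0u8; 8];
    buf[0] = 42;
    write_rel_ptr(&mut buf, 4, 0);
    buf
}
```

Because only the distance between pointer and target is stored, the same bytes can be copied to a different base position and the pointer still resolves to the payload without any rewriting.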
74 changes: 53 additions & 21 deletions book_src/architecture/serialize.md
@@ -1,35 +1,67 @@
# Serialize

Types implement [`Serialize`](https://docs.rs/rkyv/0.7.1/rkyv/trait.Serialize.html) separately from
`Archive`. `Serialize` creates a resolver for some object, then `Archive` turns the value and that
resolver into an archived type. Having a separate `Serialize` trait is necessary because although a
type may have only one archived representation, you may have options of what requirements to meet in
order to create one.
Types implement `Serialize` separately from `Archive`. `Serialize` creates a resolver for some
object, then `Archive` turns the value and that resolver into an archived type. Having a separate
`Serialize` trait is necessary because although a type may have only one archived representation,
it may support many different types of _serializers_ which fulfill its requirements.

> The `Serialize` trait is parameterized over the *serializer*. The serializer is just a mutable
> object that helps the type serialize itself. The most basic types like `u32` or `char` don't
> *bound* their serializer type because they can serialize themselves with any kind of serializer.
> More complex types like `Box` and `String` require a serializer that implements
> [`Serializer`](https://docs.rs/rkyv/0.7.1/rkyv/ser/trait.Serializer.html), and even more complex
> types like `Rc` and `Vec` require a serializer that additionally implement
> [`SharedSerializeRegistry`](https://docs.rs/rkyv/0.7.1/rkyv/ser/trait.SharedSerializeRegistry.html)
> or [`ScratchSpace`](https://docs.rs/rkyv/0.7.1/rkyv/ser/trait.ScratchSpace.html).
> More complex types like `Box` and `String` require a serializer that implements `Writer`, and even
> more complex types like `Rc` and `Vec` require a serializer that additionally implements `Sharing`
> or `Allocator`.
Unlike `Serialize`, `Archive` doesn't parameterize over the serializer used to make it. It shouldn't
matter what serializer a resolver was made with, only that it's made correctly.

## Serializer

rkyv provides serializers that provide all the functionality needed to serialize standard library
types, as well as serializers that combine other serializers into a single object with all of the
components' capabilities.
rkyv provides default serializers which can serialize all standard library types, as well as
components which can be combined into custom-built serializers. By combining rkyv's provided
components, serializers can be customized for high-performance, no-std, and custom allocation.

The [provided serializers](https://docs.rs/rkyv/0.7.1/rkyv/ser/serializers/index.html) offer a wide
range of strategies and capabilities, but most use cases will be best suited by
[`AllocSerializer`](https://docs.rs/rkyv/0.7.1/rkyv/ser/serializers/type.AllocSerializer.html).
When using the high-level API, a `HighSerializer` provides a good balance of flexibility and
performance by default. When using the low-level API, a `LowSerializer` does the same without any
allocations. You can make custom serializers using the `Serializer` combinator, or by writing your
own from scratch.

> Many types require *scratch space* to serialize. This is some extra allocated space that they can
> use temporarily and return when they're done. For example, `Vec` might request scratch space to
> store the resolvers for its elements until it can serialize all of them. Requesting scratch space
> from the serializer allows scratch space to be reused many times, which reduces the number of slow
> memory allocations performed while serializing.
rkyv comes with a few primary serializer traits built-in:

### Positional

This core serializer trait provides positional information during serialization. Because types need
to know the relative distance between objects, the `Positional` trait provides the current position
of the "write head" of the serializer. Resolvers will often store the _position_ of some serialized
data so that a relative pointer can be calculated to it during `resolve`.
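
A minimal std-only sketch of this idea follows. It treats the end of a `Vec<u8>` as the write head; `StrResolver`, `serialize_str`, and `resolve_offset` are hypothetical names for illustration, not rkyv's types.

```rust
/// Hypothetical resolver: remembers where a value's bytes were written.
pub struct StrResolver {
    pub pos: usize,
}

/// Writes the string's bytes at the current write head and returns a
/// resolver holding the position where they start.
pub fn serialize_str(out: &mut Vec<u8>, value: &str) -> StrResolver {
    let pos = out.len(); // the "write head" is the end of the buffer
    out.extend_from_slice(value.as_bytes());
    StrResolver { pos }
}

/// Computes the signed offset that a relative pointer located at `ptr_pos`
/// would store in order to reach the data recorded in `resolver`.
pub fn resolve_offset(ptr_pos: usize, resolver: &StrResolver) -> isize {
    resolver.pos as isize - ptr_pos as isize
}
```

The resolver is created while the data is written, and the offset is only computed later, once the position of the pointer itself is known.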

### Writer

`Writer` accepts byte slices and writes them to some output. It is similar to the standard library's
`Write` trait, but rkyv's `Writer` trait works in no-std contexts. In rkyv, writers are always
_write-forward_ - they never backtrack and rewrite data later. This makes it possible for writers to
eagerly sink bytes to disk or the network without having to first buffer the entire message.

Several kinds of `Writer`s are supported by default:
- `Vec<u8>`
- `AlignedVec`, which is a highly-aligned vector of bytes. This is the writer rkyv uses by default
in most cases.
- `Buffer`, which supports no-std use cases (for example, writing into fixed-size stack memory).
- Types which implement `std::io::Write` can be adapted into a `Writer` by wrapping them in the
`IoWriter` type.
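
The write-forward property can be sketched with a small trait. This is an assumption-laden illustration: `ForwardWriter`, `position`, `write_bytes`, `align_to`, and `write_tagged_u32` are hypothetical names and do not match the signatures of rkyv's `Writer` trait.

```rust
/// Hypothetical write-forward sink: bytes are only ever appended.
pub trait ForwardWriter {
    /// Current position of the write head.
    fn position(&self) -> usize;

    /// Appends `bytes` at the write head.
    fn write_bytes(&mut self, bytes: &[u8]);

    /// Pads with zeros until the write head is a multiple of `align`,
    /// then returns the aligned position.
    fn align_to(&mut self, align: usize) -> usize {
        let padding = (align - self.position() % align) % align;
        self.write_bytes(&vec![0u8; padding]);
        self.position()
    }
}

impl ForwardWriter for Vec<u8> {
    fn position(&self) -> usize {
        self.len()
    }

    fn write_bytes(&mut self, bytes: &[u8]) {
        self.extend_from_slice(bytes);
    }
}

/// Writes a one-byte tag followed by a 4-byte-aligned little-endian `u32`,
/// returning the position where the `u32` starts.
pub fn write_tagged_u32<W: ForwardWriter>(writer: &mut W, tag: u8, value: u32) -> usize {
    writer.write_bytes(&[tag]);
    let pos = writer.align_to(4);
    writer.write_bytes(&value.to_le_bytes());
    pos
}
```

Nothing in this interface lets a caller seek backwards, which is what would allow an implementation to flush bytes to a file or socket as soon as they are written.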

### Allocator

Many types require temporarily-allocated space during serialization. This space is used temporarily,
and then returned to the serializer before serialization finishes. For example, `Vec` might request
a dynamically-sized allocation to store the resolvers for its elements until it finishes serializing
all of them. Allocating memory from the serializer allows the same bytes to be efficiently reused
many times, which reduces the number of slow memory allocations performed during serialization.
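
The reuse pattern can be sketched as follows, using only the standard library. `ScratchSketch` and `serialize_lists` are hypothetical names, and the "resolvers" here are just positions; this is not how rkyv's `Allocator` trait is defined.

```rust
/// Hypothetical scratch arena: hands out temporary buffers and takes them
/// back so that their capacity can be reused.
#[derive(Default)]
pub struct ScratchSketch {
    free: Vec<Vec<usize>>,
    /// Number of times a fresh buffer had to be created.
    pub fresh_allocations: usize,
}

impl ScratchSketch {
    /// Borrows a temporary buffer, reusing a returned one when possible.
    pub fn take(&mut self) -> Vec<usize> {
        match self.free.pop() {
            Some(buf) => buf,
            None => {
                self.fresh_allocations += 1;
                Vec::new()
            }
        }
    }

    /// Returns a buffer to the arena, keeping its capacity for later.
    pub fn give_back(&mut self, mut buf: Vec<usize>) {
        buf.clear();
        self.free.push(buf);
    }
}

/// Serializes each list's elements, holding their "resolvers" (positions)
/// in scratch space until the list is finished.
pub fn serialize_lists(lists: &[Vec<u8>], scratch: &mut ScratchSketch) -> Vec<u8> {
    let mut out = Vec::new();
    for list in lists {
        let mut resolvers = scratch.take();
        for &item in list {
            resolvers.push(out.len());
            out.push(item);
        }
        // A real serializer would now emit pointers using `resolvers`;
        // this sketch only records how many elements were written.
        out.push(resolvers.len() as u8);
        scratch.give_back(resolvers);
    }
    out
}
```

However many lists are serialized one after another, only the first one causes a fresh scratch buffer to be created.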

### Sharing

rkyv serializes shared pointers like `Rc` and `Arc` and can control whether they are de-duplicated.
The `Sharing` trait provides some mutable state on the serializer which keeps track of which shared
pointers have been serialized so far, and can instruct repeated shared pointers to point to a
previously-serialized instance. This also allows rkyv to preserve shared pointers during zero-copy
access and deserialization.
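
A std-only sketch of the de-duplication idea, assuming shared data is identified by its address: `SharingSketch` and `serialize_shared` are hypothetical names and not rkyv's `Sharing` trait.

```rust
use std::collections::HashMap;
use std::rc::Rc;

/// Hypothetical sharing registry: maps the address of shared data to the
/// position where it was first serialized.
#[derive(Default)]
pub struct SharingSketch {
    positions: HashMap<usize, usize>,
}

/// Serializes each shared string once and returns, for every pointer, the
/// position of its data. Repeated pointers reuse the earlier position.
pub fn serialize_shared(
    values: &[Rc<String>],
    out: &mut Vec<u8>,
    sharing: &mut SharingSketch,
) -> Vec<usize> {
    values
        .iter()
        .map(|value| {
            let address = Rc::as_ptr(value) as usize;
            *sharing.positions.entry(address).or_insert_with(|| {
                let pos = out.len();
                out.extend_from_slice(value.as_bytes());
                pos
            })
        })
        .collect()
}
```

Serializing the same `Rc` twice writes its bytes once, and both pointers end up referring to the same position in the output.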
1 change: 1 addition & 0 deletions book_src/contributors.md
@@ -3,5 +3,6 @@
Thanks to all the contributors who have helped document rkyv:

- David Koloski ([djkoloski](https://github.com/djkoloski))
- Badewanne3 ([MaxOhn](https://github.com/MaxOhn))

If you feel you're missing from this list, feel free to add yourself in a PR.
41 changes: 41 additions & 0 deletions book_src/derive-macro-features.md
@@ -0,0 +1,41 @@
# Derive macro features

rkyv's derive macro supports a number of attributes and configurable options. All of rkyv's macro
attributes are documented on the `Archive` proc-macro. Some of the most important ones to know are:

## `omit_bounds`

rkyv's derive macro performs a "perfect derive" by default. This means that when it generates trait
impls, it adds where clauses requiring each field type to also implement that trait. This can cause
trouble in two primary situations:

1. Recursive type definitions (using e.g. `Box`) cause an overflow and never finish evaluating.
2. Private types may be exposed by these derive bounds.

Both of these situations can be fixed by adding `#[rkyv(omit_bounds)]` on the field. This prevents
rkyv from adding the "perfect derive" bounds for that field.

When you do omit the bounds for a particular field, it can lead to insufficient bounds being added
to the generated impl. To add custom bounds back, you can use:

- `#[rkyv(archive_bounds(..))]` to add predicates to all generated impls
- `#[rkyv(serialize_bounds(..))]` to add predicates to just the `Serialize` impl
- `#[rkyv(deserialize_bounds(..))]` to add predicates to just the `Deserialize` impl

See `rkyv/examples/json_like_schema.rs` for a fully-commented example of using `omit_bounds`.

## `with = ..`

This customizes the serialization of a field by applying a
[wrapper type](derive-macro-features/wrapper-types.md).

## `remote = ..`

This performs a [remote derive](derive-macro-features/remote-derive.md) for supporting external
types.

## `attr(..)` and `derive(..)`

`#[rkyv(attr(..))]` is a general-purpose attribute which allows you to pass attributes down to the
generated archived type. This can be especially useful in combination with `#[rkyv(derive(..))]`,
which may be used on types and is sugar for `#[rkyv(attr(derive(..)))]`.
46 changes: 46 additions & 0 deletions book_src/derive-macro-features/remote-derive.md
@@ -0,0 +1,46 @@
# Remote derive

Like serde, rkyv supports _remote derive_. This allows you to easily generate wrapper types to
serialize types from other crates which don't provide rkyv support. Remote derive uses a local
definition of the type to serialize, and generates a wrapper type you can use to serialize that
type.

Remote derive supports getters, wrapper types, and deserialization back to the original type by
providing a `From` impl. This example is from `rkyv/examples/remote_types.rs`:

```rust
// Let's create a local type that will serve as `with`-wrapper for `Foo`.
// Fields must have the same name and type but it's not required to define all
// fields.
#[derive(Archive, Serialize, Deserialize)]
#[rkyv(remote = remote::Foo)] // <-
#[rkyv(archived = ArchivedFoo)]
// ^ not necessary but we might as well replace the default name
// `ArchivedFooDef` with `ArchivedFoo`.
struct FooDef {
    // The field's type implements `Archive` and we don't want to apply any
    // conversion for the archived type so we don't need to specify
    // `#[rkyv(with = ..)]`.
    ch: char,
    // The field is private in the remote type so we need to specify a getter
    // to access it. Also, its type doesn't implement `Archive` so we need
    // to specify a `with`-wrapper too.
    #[rkyv(getter = remote::Foo::bar, with = BarDef)]
    bar: remote::Bar<i32>,
    // The remote `bytes` field is public but we can still customize our local
    // field when using a getter.
    #[rkyv(getter = get_first_byte)]
    first_byte: u8,
}

fn get_first_byte(foo: &remote::Foo) -> u8 {
    foo.bytes[0]
}

// Deriving `Deserialize` with `remote = ..` requires a `From` implementation.
impl From<FooDef> for remote::Foo {
    fn from(value: FooDef) -> Self {
        remote::Foo::new(value.ch, [value.first_byte, 2, 3, 4], 567, value.bar)
    }
}
```
23 changes: 23 additions & 0 deletions book_src/derive-macro-features/wrapper-types.md
@@ -0,0 +1,23 @@
# Wrapper types

Wrapper types customize the way that fields of types are archived. In some cases, wrapper types
merely change the default behavior to a preferred alternative. In other cases, wrapper types allow
serializing types which do not have support for rkyv by default.

Annotating a field with `#[rkyv(with = ..)]` will *wrap* that field with the given types when the
struct is serialized or deserialized. There's no performance penalty to wrapping types, but doing
more or less work during serialization and deserialization can affect performance. This excerpt is
from the documentation for `ArchiveWith`:

```rust
#[derive(Archive, Deserialize, Serialize)]
struct Example {
    #[rkyv(with = Incremented)]
    a: i32,
    // Another i32 field, but not incremented this time
    b: i32,
}
```

The `Incremented` wrapper is wrapping `a`, and the definition causes that field to be incremented
in its archived form.