Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document the enum changes in RFC 2195 #879

Merged
merged 10 commits into from
Sep 4, 2020
170 changes: 149 additions & 21 deletions src/type-layout.md
Original file line number Diff line number Diff line change
Expand Up @@ -174,7 +174,8 @@ layout such as reinterpreting values as a different type.
Because of this dual purpose, it is possible to create types that are not useful
for interfacing with the C programming language.

This representation can be applied to structs, unions, and enums.
This representation can be applied to structs, unions, and enums. The exception
is [zero-variant enums] for which the `C` representation is an error.

#### \#[repr(C)] Structs

Expand Down Expand Up @@ -273,9 +274,9 @@ assert_eq!(std::mem::size_of::<SizeRoundedUp>(), 8); // Size of 6 from b,
assert_eq!(std::mem::align_of::<SizeRoundedUp>(), 4); // From a
```

#### \#[repr(C)] Enums
#### \#[repr(C)] Field-less Enums

For [C-like enumerations], the `C` representation has the size and alignment of
For [field-less enums], the `C` representation has the size and alignment of
the default `enum` size and alignment for the target platform's C ABI.

> Note: The enum representation in C is implementation defined, so this is
Expand All @@ -285,40 +286,165 @@ the default `enum` size and alignment for the target platform's C ABI.
<div class="warning">

Warning: There are crucial differences between an `enum` in the C language and
Rust's C-like enumerations with this representation. An `enum` in C is
Rust's [field-less enums] with this representation. An `enum` in C is
mostly a `typedef` plus some named constants; in other words, an object of an
`enum` type can hold any integer value. For example, this is often used for
bitflags in `C`. In contrast, Rust’s C-like enumerations can only legally hold
the discriminant values, everything else is undefined behaviour. Therefore,
using a C-like enumeration in FFI to model a C `enum` is often wrong.
bitflags in `C`. In contrast, Rust’s [field-less enums] can only legally hold
the discrimnant values, everything else is [undefined behavior]. Therefore,
using a field-less enum in FFI to model a C `enum` is often wrong.

</div>

It is an error for [zero-variant enumerations] to have the `C` representation.
#### \#[repr(C)] Enums With Fields

For all other enumerations, the layout is unspecified.
The representation of a `repr(C)` enum with fields is a `repr(C)` struct with
two fields, also called a "tagged union" in C:

Likewise, combining the `C` representation with a primitive representation, the
layout is unspecified.
- a `repr(C)` version of the enum with all fields removed ("the tag")
- a `repr(C)` union of `repr(C)` structs for the fields of each variant that had
them ("the payload")

> Note: Due to the representation of `repr(C)` structs and unions, if a variant
> has a single field there is no difference between putting that field directly
> in the union or wrapping it in a struct; any system which wishes to manipulate
> such an `enum`'s representation may therefore use whichever form is more
> convenient or consistent for them
poliorcetics marked this conversation as resolved.
Show resolved Hide resolved

```rust
// This Enum has the same layout as ...
poliorcetics marked this conversation as resolved.
Show resolved Hide resolved
#[repr(C)]
enum MyEnum {
A(u32),
B(f32, u64),
C { x: u32, y: u8 },
D,
}

// ... this struct.
#[repr(C)]
struct MyEnumRepr {
tag: MyEnumDiscriminant,
payload: MyEnumFields,
}

// This is the discriminant enum.
#[repr(C)]
enum MyEnumDiscriminant { A, B, C, D }

// This is the variant union.
#[repr(C)]
union MyEnumFields {
A: MyAFields,
B: MyBFields,
C: MyCFields,
D: MyDFields,
}

#[repr(C)]
#[derive(Copy, Clone)]
struct MyAFields(u32);

#[repr(C)]
#[derive(Copy, Clone)]
struct MyBFields(f32, u64);

#[repr(C)]
#[derive(Copy, Clone)]
struct MyCFields { x: u32, y: u8 }

// This struct could be omitted (it is a zero-sized type), and it must be in
// C/C++ headers.
#[repr(C)]
#[derive(Copy, Clone)]
struct MyDFields;
```

> Note: `union`s with non-`Copy` fields are unstable, see [55149].

### Primitive representations

The *primitive representations* are the representations with the same names as
the primitive integer types. That is: `u8`, `u16`, `u32`, `u64`, `u128`,
`usize`, `i8`, `i16`, `i32`, `i64`, `i128`, and `isize`.

Primitive representations can only be applied to enumerations.
Primitive representations can only be applied to enumerations and have
different behavior whether the enum has fields or no fields. It is an error
for [zero-variant enumerations] to have a primitive representation. Combining
two primitive representations together is an error.

> Note: Primitive representations are primarily intended for Rust code that
> wants to interoperate with the idioms of preexisting C/C++ codebases. If you
> have control over both the Rust and C code, such as using C as FFI glue
> between Rust and some third language, then you should use a
> [primitive representation](#primitive-representation-of-enums-with-fields)
> instead.
poliorcetics marked this conversation as resolved.
Show resolved Hide resolved

#### Primitive Representation of Field-less Enums

For [field-less enums], primitive representations set the size and alignment to
be the same as the primitive type of the same name. For example, a field-less
enum with a `u8` representation can only have discriminants between 0 and 255
inclusive.

#### Primitive Representation of Enums With Fields

For [C-like enumerations], they set the size and alignment to be the same as the
primitive type of the same name. For example, a C-like enumeration with a `u8`
representation can only have discriminants between 0 and 255 inclusive.
The representation of a primitive representation enum is a `repr(C)` union of
`repr(C)` structs for each variant with a field. The first field of each struct
in the union is the primitive representation version of the enum with all fields
removed ("the tag") and the remaining fields are the fields of that variant.

> Note: This representation is unchanged if the tag is given its own member in
> the union, should that make manipulation more clear for you (although to
> follow the C++ standard the tag member should be wrapped in a `struct`).

```rust
// This enum has the same layout as ...
#[repr(u8)]
enum MyEnum {
A(u32),
B(f32, u64),
C { x: u32, y: u8 },
D,
}

// ... this union.
#[repr(C)]
union MyEnumRepr {
A: MyVariantA,
B: MyVariantB,
C: MyVariantC,
D: MyVariantD,
}

It is an error for [zero-variant enumerations] to have a primitive
representation.
// This is the discriminant enum.
#[repr(u8)]
#[derive(Copy, Clone)]
enum MyEnumDiscriminant { A, B, C, D }

#[repr(C)]
#[derive(Clone, Copy)]
struct MyVariantA(MyEnumDiscriminant, u32);

#[repr(C)]
#[derive(Clone, Copy)]
struct MyVariantB(MyEnumDiscriminant, f32, u64);

#[repr(C)]
#[derive(Clone, Copy)]
struct MyVariantC { tag: MyEnumDiscriminant, x: u32, y: u8 }

#[repr(C)]
#[derive(Clone, Copy)]
struct MyVariantD(MyEnumDiscriminant);
```

For all other enumerations, the layout is unspecified.
> Note: `union`s with non-`Copy` fields are unstable, see [55149].

Likewise, combining two primitive representations together is unspecified.
Combining the `repr(C)` and a primitive representation is only defined for enums
with fields. The primitive representation modifies the `repr(C)` by changing the
poliorcetics marked this conversation as resolved.
Show resolved Hide resolved
representation of the discriminant enum to have the representation of the chosen
primitive representation. So, if you chose the `u8` representation, then the
discriminant enum would have a size and alignment of 1 byte.

### The alignment modifiers

Expand Down Expand Up @@ -379,12 +505,14 @@ used with any other representation.
[`align_of`]: ../std/mem/fn.align_of.html
[`size_of`]: ../std/mem/fn.size_of.html
[`Sized`]: ../std/marker/trait.Sized.html
[`Copy`]: ../std/marker/trait.Copy.html
[dynamically sized types]: dynamically-sized-types.md
[C-like enumerations]: items/enumerations.md#custom-discriminant-values-for-fieldless-enumerations
[field-less enums]: items/enumerations.md#custom-discriminant-values-for-fieldless-enumerations
[enumerations]: items/enumerations.md
[zero-variant enumerations]: items/enumerations.md#zero-variant-enums
[zero-variant enums]: items/enumerations.md#zero-variant-enums
[undefined behavior]: behavior-considered-undefined.md
[27060]: https://github.com/rust-lang/rust/issues/27060
[55149]: https://github.com/rust-lang/rust/issues/55149
[`PhantomData<T>`]: special-types-and-traits.md#phantomdatat
[Default]: #the-default-representation
[`C`]: #the-c-representation
Expand Down