Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Soundness hole in #[derive(IntoBytes)] on types with #[repr(align)] #1748

Open
Tracked by #671
joshlf opened this issue Sep 24, 2024 · 2 comments · May be fixed by #1752
Open
Tracked by #671

Soundness hole in #[derive(IntoBytes)] on types with #[repr(align)] #1748

joshlf opened this issue Sep 24, 2024 · 2 comments · May be fixed by #1752
Labels
blocking-next-release This issue should be resolved before we release on crates.io bug Something isn't working

Comments

@joshlf
Copy link
Member

joshlf commented Sep 24, 2024

Consider the following type:

#[derive(IntoBytes)]
#[repr(C, align(8))]
struct Foo<T> {
    t: T,
}

#[derive(IntoBytes)] emits an IntoBytes impl for Foo with a T: Unaligned bound. The reasoning is based on the repr(C) layout algorithm, but this reasoning is unsound in the presence of #[repr(align(8))], which #[derive(IntoBytes)] spuriously ignores.

In particular, Foo<u8> satisfies u8: Unaligned, but has size 8 (7 bytes of padding) in order to satisfy its alignment requirement.

We need to either ban #[repr(align(...))] in #[derive(IntoBytes)] or at least ban it when generics are present.

@joshlf joshlf added the blocking-next-release This issue should be resolved before we release on crates.io label Sep 24, 2024
@joshlf joshlf mentioned this issue Sep 24, 2024
85 tasks
@jswrenn jswrenn added the bug Something isn't working label Sep 24, 2024
@joshlf
Copy link
Member Author

joshlf commented Sep 24, 2024

I took a stab at fixing this:

fn derive_into_bytes_struct(ast: &DeriveInput, strct: &DataStruct) -> proc_macro2::TokenStream {
    let reprs = try_or_print!(STRUCT_UNION_INTO_BYTES_CFG.validate_reprs(ast));
    let is_transparent = reprs.contains(&StructRepr::Transparent);
    let is_packed = reprs.contains(&StructRepr::Packed) || reprs.contains(&StructRepr::PackedN(1));
    let is_aligned = reprs.iter().any(|r| matches!(r, StructRepr::Align(_)));
    let num_fields = strct.fields().len();

    let (padding_check, require_unaligned_fields) = if is_transparent || is_packed {
        // No padding check needed.
        // - repr(transparent): The layout and ABI of the whole struct is the
        //   same as its only non-ZST field (meaning there's no padding outside
        //   of that field) and we require that field to be `IntoBytes` (meaning
        //   there's no padding in that field).
        // - repr(packed): Any inter-field padding bytes are removed, meaning
        //   that any padding bytes would need to come from the fields, all of
        //   which we require to be `IntoBytes` (meaning they don't have any
        //   padding).
        (None, false)
    } else if reprs.contains(&StructRepr::C) && !is_aligned && num_fields <= 1 {
        // No padding check needed. A repr(C) struct with zero or one field has
        // no padding so long as it doesn't have an explicit repr(align)
        // attribute.
        (None, false)
    } else if ast.generics.params.is_empty() {
        // Since there are no generics, we can emit a padding check. This is
        // more permissive than the next case, which requires that all field
        // types implement `Unaligned`.
        (Some(PaddingCheck::Struct), false)
    } else if !is_aligned {
        // Based on the allowed reprs, we know that this type must be repr(C) by
        // the time we get here, but the soundness of this impl relies on it, so
        // let's double-check.
        assert!(reprs.contains(&StructRepr::C));
        // We can't use a padding check since there are generic type arguments.
        // Instead, we require all field types to implement `Unaligned`. This
        // ensures that the `repr(C)` layout algorithm will not insert any
        // padding so long as the type doesn't have an explicit repr(align)
        // attribute.
        //
        // TODO(#10): Support type parameters for non-transparent, non-packed
        // structs without requiring `Unaligned`.
        (None, true)
    } else {
        assert!(is_aligned, "this branch is unreachable without a `#[repr(align)]` attribute");

        return Error::new(
            Span::call_site(),
            "`#[repr(align)]` might introduce padding; consider removing it",
        )
        .to_compile_error();
    };

Relative to fbb0f8b, here's the diff:

 fn derive_into_bytes_struct(ast: &DeriveInput, strct: &DataStruct) -> proc_macro2::TokenStream {
     let reprs = try_or_print!(STRUCT_UNION_INTO_BYTES_CFG.validate_reprs(ast));
     let is_transparent = reprs.contains(&StructRepr::Transparent);
-    let is_packed = reprs.contains(&StructRepr::Packed);
+    let is_packed = reprs.contains(&StructRepr::Packed) || reprs.contains(&StructRepr::PackedN(1));
+    let is_aligned = reprs.iter().any(|r| matches!(r, StructRepr::Align(_)));
     let num_fields = strct.fields().len();
 
     let (padding_check, require_unaligned_fields) = if is_transparent || is_packed {
@@ -806,16 +807,17 @@ fn derive_into_bytes_struct(ast: &DeriveInput, strct: &DataStruct) -> proc_macro
         //   which we require to be `IntoBytes` (meaning they don't have any
         //   padding).
         (None, false)
-    } else if reprs.contains(&StructRepr::C) && num_fields <= 1 {
+    } else if reprs.contains(&StructRepr::C) && !is_aligned && num_fields <= 1 {
         // No padding check needed. A repr(C) struct with zero or one field has
-        // no padding.
+        // no padding so long as it doesn't have an explicit repr(align)
+        // attribute.
         (None, false)
     } else if ast.generics.params.is_empty() {
         // Since there are no generics, we can emit a padding check. This is
         // more permissive than the next case, which requires that all field
         // types implement `Unaligned`.
         (Some(PaddingCheck::Struct), false)
-    } else {
+    } else if !is_aligned {
         // Based on the allowed reprs, we know that this type must be repr(C) by
         // the time we get here, but the soundness of this impl relies on it, so
         // let's double-check.
@@ -823,11 +825,20 @@ fn derive_into_bytes_struct(ast: &DeriveInput, strct: &DataStruct) -> proc_macro
         // We can't use a padding check since there are generic type arguments.
         // Instead, we require all field types to implement `Unaligned`. This
         // ensures that the `repr(C)` layout algorithm will not insert any
-        // padding.
+        // padding so long as the type doesn't have an explicit repr(align)
+        // attribute.
         //
         // TODO(#10): Support type parameters for non-transparent, non-packed
         // structs without requiring `Unaligned`.
         (None, true)
+    } else {
+        assert!(is_aligned, "this branch is unreachable without a `#[repr(align)]` attribute");
+
+        return Error::new(
+            Span::call_site(),
+            "`#[repr(align)]` might introduce padding; consider removing it",
+        )
+        .to_compile_error();
     };

Unfortunately, that doesn't work as written: validate_reprs strips out any align reprs before returning on the theory that "no callers care about them":

/// conform to the requirements of `self`, and returns them. Regardless of
/// whether `align` attributes are considered during validation, they are
/// stripped out of the returned value since no callers care about them.
pub(crate) fn validate_reprs(&self, input: &DeriveInput) -> Result<Vec<R>, Vec<Error>> {

We should probably take this opportunity to refactor our repr handling logic since many of our derives now do care about alignment more than they used to. We could also use the opportunity to get rid of this brittle special-cased logic for #[derive(Unaligned), which currently lives inside validate_reprs:

if self.derive_unaligned {
if let Some((meta, _)) =
metas_reprs.iter().find(|&repr: &&(_, R)| repr.1.is_align_gt_one())
{
return Err(vec![Error::new_spanned(
meta,
"cannot derive Unaligned with repr(align(N > 1))",
)]);
}
}

joshlf added a commit that referenced this issue Sep 25, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748
@joshlf joshlf linked a pull request Sep 25, 2024 that will close this issue
@joshlf
Copy link
Member Author

joshlf commented Sep 25, 2024

I've begun to draft an overhaul of repr parsing and in-memory representation in #1752. My plan is to follow up with a fix to this issue once that lands.

joshlf added a commit that referenced this issue Sep 25, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748
joshlf added a commit that referenced this issue Sep 25, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748
joshlf added a commit that referenced this issue Sep 25, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748
joshlf added a commit that referenced this issue Sep 25, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748
joshlf added a commit that referenced this issue Sep 25, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748
joshlf added a commit that referenced this issue Sep 25, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748
joshlf added a commit that referenced this issue Sep 25, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748.

While we're here, we make a number of other improvements.

1) Errors are now converted to `TokenStream`s as late as possible rather
   than as early as possible, which was the previous behavior. This
   allows us to bail early when deriving an implied trait fails (e.g.,
   deriving `TryFromBytes` when the user wrote `#[derive(FromZeros)]`).
2) Avoid re-computing some repr information in `TryFromBytes` enum
   support.
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748.

While we're here, we make a number of other improvements.

1) Errors are now converted to `TokenStream`s as late as possible rather
   than as early as possible, which was the previous behavior. This
   allows us to bail early when deriving an implied trait fails (e.g.,
   deriving `TryFromBytes` when the user wrote `#[derive(FromZeros)]`).
2) Avoid re-computing some repr information in `TryFromBytes` enum
   support.
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748.

While we're here, we make a number of other improvements.

1) Errors are now converted to `TokenStream`s as late as possible rather
   than as early as possible, which was the previous behavior. This
   allows us to bail early when deriving an implied trait fails (e.g.,
   deriving `TryFromBytes` when the user wrote `#[derive(FromZeros)]`).
2) Avoid re-computing some repr information in `TryFromBytes` enum
   support.
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748.

While we're here, we make a number of other improvements.

1) Errors are now converted to `TokenStream`s as late as possible rather
   than as early as possible, which was the previous behavior. This
   allows us to bail early when deriving an implied trait fails (e.g.,
   deriving `TryFromBytes` when the user wrote `#[derive(FromZeros)]`).
2) Avoid re-computing some repr information in `TryFromBytes` enum
   support.
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748.

While we're here, we make a number of other improvements.

1) Errors are now converted to `TokenStream`s as late as possible rather
   than as early as possible, which was the previous behavior. This
   allows us to bail early when deriving an implied trait fails (e.g.,
   deriving `TryFromBytes` when the user wrote `#[derive(FromZeros)]`).
2) Avoid re-computing some repr information in `TryFromBytes` enum
   support.
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748.

While we're here, we make a number of other improvements.

1) Errors are now converted to `TokenStream`s as late as possible rather
   than as early as possible, which was the previous behavior. This
   allows us to bail early when deriving an implied trait fails (e.g.,
   deriving `TryFromBytes` when the user wrote `#[derive(FromZeros)]`).
2) Avoid re-computing some repr information in `TryFromBytes` enum
   support.
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748.

While we're here, we make a number of other improvements.

1) Errors are now converted to `TokenStream`s as late as possible rather
   than as early as possible, which was the previous behavior. This
   allows us to bail early when deriving an implied trait fails (e.g.,
   deriving `TryFromBytes` when the user wrote `#[derive(FromZeros)]`).
2) Avoid re-computing some repr information in `TryFromBytes` enum
   support.
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748.

While we're here, we make a number of other improvements.

1) Errors are now converted to `TokenStream`s as late as possible rather
   than as early as possible, which was the previous behavior. This
   allows us to bail early when deriving an implied trait fails (e.g.,
   deriving `TryFromBytes` when the user wrote `#[derive(FromZeros)]`).
2) Avoid re-computing some repr information in `TryFromBytes` enum
   support.
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748.

While we're here, we make a number of other improvements.

1) Errors are now converted to `TokenStream`s as late as possible rather
   than as early as possible, which was the previous behavior. This
   allows us to bail early when deriving an implied trait fails (e.g.,
   deriving `TryFromBytes` when the user wrote `#[derive(FromZeros)]`).
2) Avoid re-computing some repr information in `TryFromBytes` enum
   support.
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748.

While we're here, we make a number of other improvements.

1) Errors are now converted to `TokenStream`s as late as possible rather
   than as early as possible, which was the previous behavior. This
   allows us to bail early when deriving an implied trait fails (e.g.,
   deriving `TryFromBytes` when the user wrote `#[derive(FromZeros)]`).
2) Avoid re-computing some repr information in `TryFromBytes` enum
   support.
joshlf added a commit that referenced this issue Sep 26, 2024
Represent the result of parsing all `#[repr(...)]` attributes on a type
as a high-level type which is only capable of representing valid
combinations of `#[repr(...)]` attributes and processes them into a
concise representation that's easier for high-level code to work with.

This prepares us to more easily fix #1748.

While we're here, we make a number of other improvements.

1) Errors are now converted to `TokenStream`s as late as possible rather
   than as early as possible, which was the previous behavior. This
   allows us to bail early when deriving an implied trait fails (e.g.,
   deriving `TryFromBytes` when the user wrote `#[derive(FromZeros)]`).
2) Avoid re-computing some repr information in `TryFromBytes` enum
   support.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
blocking-next-release This issue should be resolved before we release on crates.io bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants