
Decode with mem limit #602

Closed
wants to merge 6 commits into from

Conversation

Member

@ggwpez ggwpez commented May 28, 2024

Adds a try_alloc function to the Input trait to track allocations. This makes it necessary to implement the tracking in the decode implementation of every type, but it is not type checked. Currently it is only implemented for vectors.

Some tests are in place to showcase the usage.

This MR adds functionality to restrict memory usage while decoding a type.
Some tests show how to use it with nested types or tuples; it is mostly a PoC to evaluate the approach.
We could also do it like the DecodeLimit and track the memory consumption in the Input stream, but then it cannot be enforced by the type system.

PS: I will change it to track the limit in the Input, as that also allows tracking the other limits (like recursion depth), at the expense of not being type checked.

Changes
Adds a generic DecodeWithContext that allows a decode_with_context call to pass down some context when decoding.

This new trait is then used together with MemLimit such that Vec implements DecodeWithContext<MemLimit>.
There is a blanket impl in place that converts DecodeWithContext<MemLimit> to DecodeWithMemLimit, and thereby Vec is now DecodeWithMemLimit.

Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
src/codec.rs Outdated
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
@serban300 serban300 self-requested a review May 29, 2024 06:33
Signed-off-by: Oliver Tale-Yazdi <oliver.tale-yazdi@parity.io>
Contributor

@serban300 serban300 left a comment

Thank you for the suggestion and for opening the PR! The high-level idea looks good! I just left a couple of comments.

Also, could you please let me know how you would prefer to move forward from here? Would you have time to continue this PR? If not, I would be happy to take it, since we need it for the Bridges + XCM.

<Compact<u32>>::decode(input)
.and_then(move |Compact(len)| decode_vec_with_len(input, len as usize))
}
}

// Mark vec as MemLimited since we track the allocated memory:
impl<T: Decode> crate::memory_limit::DecodeMemLimit for Vec<T> { }
Contributor

I think we need to do impl<T: crate::memory_limit::DecodeMemLimit> crate::memory_limit::DecodeMemLimit for Vec<T> { } . Otherwise I can call decode_with_mem_limit for example on a Vec<Box<_>> which shouldn't support this yet.


impl MemLimit {
/// Try to allocate a contiguous chunk of memory.
pub fn try_alloc(&mut self, size: usize) -> Result<(), Error> {
Contributor

I would rather just provide the type and do std::mem::size_of() here which should account for the alignment automatically.

@@ -85,6 +85,11 @@ pub trait Input {
/// This is called when decoding reference-based type is finished.
fn ascend_ref(&mut self) {}

/// Try to allocate a contiguous chunk of memory of `size` bytes.
fn try_alloc(&mut self, size: usize) -> Result<(), Error> {
Contributor

Nit: It looks like we won't really allocate memory here, but only keep track of it. So I would rename this to something like note_alloc(), or even something more generic like on_before_decode().

///
/// Should be used as a marker for types that do the memory tracking in their `decode` implementation with [`Input::try_alloc`].
pub trait DecodeMemLimit: Decode + Sized {
fn decode_with_mem_limit<I: Input>(limit: MemLimit, input: &mut I) -> Result<Self, Error> {
Contributor

This approach seems a bit limiting since we can only do decode_with_mem_limit or decode_with_depth_limit. Looks like we can't combine them. It would be nice to be able to do something more generic. For example something like:

decode_with_check(input: &mut I, check: FnMut(DecodeContext) -> Result<(), Error>)

where DecodeContext would be something like

DecodeContext {
    depth: u32,
    used_memory: usize,
    ...
}

This way we could check both for depth and memory limit at the same time. Maybe for other parameters as well. Also would be nice to be able to keep track of the number of objects of a certain type that we encounter recursively while decoding since this is also something that we needed in Polkadot.

Member Author

@ggwpez ggwpez May 31, 2024

Looks like we can't combine them.

We can, which is why I chose this approach. These functions are just wrappers, and nothing stops you from doing:

let mut input = MemTrackingInput { inner: DepthTrackingInput { inner: .. }};
T::decode(&mut input);

But it won't have any meaning without requiring T: DecodeMemLimit.

This way we could check both for depth and memory limit at the same time. Maybe for other parameters as well. Also would be nice to be able to keep track of the number of objects of a certain type that we encounter recursively while decoding since this is also something that we needed in Polkadot.

Yeah, I had some decode_with_context<C>(..) implemented before, but it makes this less of an opt-in.
For example, one type may only want to track recursion and another only size. If we enforce this regardless, it would be a big breaking change.

The breaking change would occur in the case where we make the Parameter type of a FRAME extrinsic require this.

/// the most fitting name.
pub enum MemAlignment {
/// Round up to the next power of two.
NextPowerOfTwo,
Member Author

Note: Currently when using this with a vector, we align each element individually.
But a Vec allocates all elements in one chunk through the WASM bump allocator. So what we have to do is round(sum(e...)) over all elements instead of sum(round(e)...).

Otherwise we would round each element size up to the next power of two, but the vector may still allocate more, since the allocator aligns the sum of the sizes of all elements.

@serban300
Copy link
Contributor

I've been thinking about this some more and I'm not sure if this solution is feasible. My main concern is related to enums. For example, for the following enum:

enum TestEnum {
    A {field: String},
    B {field: u32},
    C {field: [u8; 10000]},
}
  • for TestEnum::A we need to account for mem::size_of<TestEnum> + the size of the decoded String.
  • for TestEnum::B and TestEnum::C we need to account just for mem::size_of<TestEnum>

I don't know if we can check the type of each field in an enum variant and call try_alloc() just for some certain types. And even if we could, this process would be manual and seems very error-prone.

Another concern is that for structures like BTreeMap, BTreeSet, etc., we'll only record the memory of the elements, but we won't account for the memory of the internal nodes of the map/set. Although this is not significant, it's still not ideal.

Another problem would be if we have skipped fields that consume significant memory. For example a [u8; 10000].

Another issue is that here we do impl<T: Decode> DecodeMemLimit for Vec<T> {}. This will track memory only for the structures for which we implement DecodeMemLimit. For custom structures memory tracking will work just to the extent to which they actually call decode() for the primitive types inside them. I think the right approach would be impl<T: DecodeMemLimit> DecodeMemLimit for Vec<T> {}, but then the problem is that there will be a lot of custom structures for which DecodeMemLimit is not implemented and it would be a burden to implement it for all.

Also, since I was looking at these, I think there are some issues with decode_with_depth_limit as well.

  1. it also does impl<T: Decode> DecodeLimit for T { . This might miss structures.
  2. for example, shouldn't we do input.descend_ref() and input.ascend_ref() when decoding an enum?

I'm not sure. Maybe I'm missing something. @ggwpez , @bkchr what do you think ?

@serban300
Contributor

serban300 commented Jul 5, 2024


We'll limit this to tracking only heap size, but we want to make it type safe.

Tracked as part of: #609

Will close this draft for the moment and will open a separate PR later

@serban300 serban300 closed this Jul 5, 2024
@serban300 serban300 mentioned this pull request Jul 24, 2024