-
Notifications
You must be signed in to change notification settings - Fork 745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Reduce size of futures in HTTP API to prevent stack overflows #5104
Reduce size of futures in HTTP API to prevent stack overflows #5104
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Michael :)
@@ -866,14 +866,14 @@ where | |||
panic!("Should always be a full payload response"); | |||
}; | |||
|
|||
let signed_block = block_response.block.sign( | |||
let signed_block = Arc::new(block_response.block.sign( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this test only code? If so do we also have stack overflows during tests?
(Question applies to all the uses in this file)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah this is just for tests, I ended up having to make quite a few changes to make the types consistent. There's a bit of inefficient Arc-ing and de-Arc-ing, but it's just in tests so it shouldn't be too bad
Jimmy got some stack overflows in tests, but only in debug mode
@@ -380,6 +379,10 @@ pub async fn reconstruct_block<T: BeaconChainTypes>( | |||
None | |||
}; | |||
|
|||
// Perf: cloning the block here to unblind it is a little sub-optimal. This is considered an | |||
// acceptable tradeoff to avoid passing blocks around on the stack (unarced), which blows up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// acceptable tradeoff to avoid passing blocks around on the stack (unarced), which blows up | |
// acceptable tradeoff to avoid passing blocks around on the stack (unArc'ed), which blows up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kinda like the lowercase spelling
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Simple changes that seem to make a lot of difference. It was an impressive research effort to get to the bottom of it.
I'll merge this because there doesn't seem to be anything substantial outstanding and @michaelsproul said it's ready for "review/merge". |
Issue Addressed
Closes #5080.
Proposed Changes
Box::pin
on a few futures inpublish_block
. This reduces the size of related futures from 100KB to ~33KB.Arc
s to reduce the amount of data on the stack inpublish_block
and related functions. This reduces the size of the publish block futures from ~33KB to ~16KB.One of the reasons for the stack size blowup in futures is described in this blog post: https://swatinem.de/blog/future-size/.
Even with that knowledge, I still found it quite hard to determine interventions that would be effective, and did a lot of trial and error changing things and re-running the type-size analysis.
The commands I used for testing were:
Followed by post-processing:
Additional Info
There's still more we could do here. Some types like the
PreProcessingSnapshot
end up on the stack during block publishing, and are quite large due to containing unarced blocks and states.There's also an outstanding mystery as to how 100KB futures can cause an 8MB stack to overflow, when only they should only be nested ~1 at a time. Either I've misunderstood what happens when futures poll sub-futures recursively, or there's something in
warp
that creates the amplification.NOTE: I have yet to measure whether the boxing and arcing has an impact on performance. It would be good to monitor metrics for block publication time after deploying this branch.
A backtrace implicating
warp
can be found here: https://gist.github.com/michaelsproul/d0a4b93bb9d21a5be1103acf4c0a3753.Thanks to @dapplion for pairing on this and helping with some of the early progress.