Execution driver uses NodeSyncState for certificate execution #3323
Conversation
Force-pushed from 7c6b4a1 to ec51269
Unsure what the result of this is: can we guarantee that all previous certs will be enqueued, or do we need extra logic for this? I am particularly concerned about the precursor transactions to shared objects, which, if not present, may block the execution of a shared-object cert, slowing everyone down.
There is a serious amount of stuff in this PR which is quite complex -- due to the nature of what we do. Do try to provide a block interface to drive / request execution rather than a tx-by-tx interface -- it will make our job down the line easier. I am still unclear whether we fetch deps for shared-object transactions -- but I think not.
.handle_execution_request(pending_transactions.iter().map(|(_, digest)| *digest))
// zip results back together with seq
.zip(stream::iter(pending_transactions.iter()))
// filter out errors
Should we not do something about the errors? This is the place where, if some dependent previous transaction has not been processed, objects will not be available, etc. We should probably print / log something about these? And also ensure we process them using full sync?
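For illustration, a minimal sketch of what handling (rather than silently filtering) the errors could look like -- the digest type, the result alias, and the `full_sync_sender` channel are all stand-ins for this sketch, not the PR's actual API:

```rust
use futures::{Stream, StreamExt};
use tokio::sync::mpsc;

type TransactionDigest = [u8; 32];
type SuiResult = Result<(), String>; // stand-in for the real SuiResult

async fn drain_results(
    mut results: impl Stream<Item = (SuiResult, TransactionDigest)> + Unpin,
    full_sync_sender: mpsc::Sender<TransactionDigest>,
) {
    while let Some((res, digest)) = results.next().await {
        if let Err(e) = res {
            // Log the failure instead of dropping it on the floor...
            tracing::warn!(?digest, error = %e, "pending certificate failed to execute");
            // ...and hand the digest to a (hypothetical) full-sync path for retry.
            let _ = full_sync_sender.send(digest).await;
        }
    }
}
```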
// this pattern for limiting concurrency is from
// https://github.com/tokio-rs/tokio/discussions/2648
let limit = Arc::new(Semaphore::new(MAX_NODE_SYNC_CONCURRENCY));
let mut stream = Box::pin(stream);

- while let Some(DigestsMessage { digests, peer, tx }) = stream.next().await {
+ while let Some(DigestsMessage { sync_arg, peer, tx }) = stream.next().await {
Heads up: efficient parallel execution is best implemented if you pass in blocks of transactions rather than passing transactions in one by one.
What's your argument here for efficiency? I can imagine some theoretical benefits (e.g. cache coherency, less synchronization overhead in the channels) that might be obtained from better batching, but I wouldn't expect that to have a noticeable effect here.
Also - keep in mind that even if we pass in TXes in blocks, we are going to want to farm the execution out to multiple tasks for execution parallelism anyway. (This is a TODO right now - I'm waiting until I have a working devnet so I can measure how much of a speedup it is.)
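For concreteness, a rough sketch of that plan -- the semaphore pattern from the linked tokio discussion combined with farming execution out to spawned tasks. `execute_one`, the digest type, and the value of the concurrency constant are illustrative, not the PR's actual code:

```rust
use std::sync::Arc;
use tokio::sync::Semaphore;

const MAX_NODE_SYNC_CONCURRENCY: usize = 100; // illustrative value

type TransactionDigest = [u8; 32];

async fn execute_one(digest: TransactionDigest) {
    // stand-in for cert download + execution
    let _ = digest;
}

async fn execute_all(digests: Vec<TransactionDigest>) {
    let limit = Arc::new(Semaphore::new(MAX_NODE_SYNC_CONCURRENCY));
    let mut handles = Vec::new();
    for digest in digests {
        // `acquire_owned` moves the permit into the spawned task; it is
        // released when the task finishes, capping in-flight executions.
        let permit = limit.clone().acquire_owned().await.unwrap();
        handles.push(tokio::spawn(async move {
            let _permit = permit;
            execute_one(digest).await;
        }));
    }
    for h in handles {
        let _ = h.await;
    }
}
```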
})?;
}
trace!(?parent, ?digest, "waiting for parent");
// Since we no longer hold the semaphore permit, we can be sure that our parent
// will be able to acquire a permit and make progress.
Not sure this is good enough to prevent a deadlock: what if something else takes this permit and also blocks? Is that possible?
If you read from the top of process_digest, you'll see that the only blocking task we do while holding the permit is downloading the cert and effects, after which we drop the permit. Downloading will always complete (successfully or otherwise), so permits cannot be held indefinitely.
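To make that permit discipline concrete, a simplified sketch (all types here are stand-ins): the permit is held only across the bounded download, and dropped before any wait on a parent, so a child waiting for its parent can never starve the parent of a permit.

```rust
use std::sync::Arc;
use tokio::sync::{Notify, Semaphore};

struct CertAndEffects; // placeholder

async fn download_cert_and_effects() -> CertAndEffects {
    CertAndEffects // bounded: always completes, successfully or otherwise
}

async fn process_digest(limit: Arc<Semaphore>, parent_done: Arc<Notify>) {
    // The only blocking work done while holding the permit is the download.
    let permit = limit.acquire().await.unwrap();
    let _cert = download_cert_and_effects().await;
    drop(permit);

    // The permit is released before we block on the parent, so the parent's
    // own process_digest call can always acquire a permit and make progress.
    parent_done.notified().await;
    // ... execute the certificate ...
}
```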
.forget_effects(&effects.effects.digest());
}
CertAndEffects::Validator(cert) => {
    self.state.handle_certificate(cert).await?;
This will not work, of course, if we do not have the dependencies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct - an error will be reported to the caller.
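A toy sketch of that contract, with illustrative error variants (not the real SuiError): the failure surfaces to the caller, which can decide to sync parents and retry.

```rust
#[derive(Debug)]
enum SuiError {
    ObjectNotFound, // illustrative: a parent tx has not yet been executed
    Other(String),
}

fn handle_certificate() -> Result<(), SuiError> {
    // stand-in: fails when input objects are unavailable
    Err(SuiError::ObjectNotFound)
}

fn caller() {
    match handle_certificate() {
        Ok(()) => { /* effects are now available locally */ }
        Err(SuiError::ObjectNotFound) => {
            // caller decides whether to sync the missing parents and retry
        }
        Err(e) => {
            eprintln!("unexpected error: {e:?}");
        }
    }
}
```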
) -> impl Stream<Item = SuiResult> {
    let futures: FuturesOrdered<_> = checkpoint_contents
        .iter()
        .map(|digests| {
Honestly, I would make the base case the one where we send and process a whole block of transactions, and make the special case a block of one.
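Something like the following shape, where the names are hypothetical and only the structure matters: the block call is the primary entry point, and a single digest is just a block of one.

```rust
type TransactionDigest = [u8; 32];
type SuiResult = Result<(), String>;

struct NodeSyncState;

impl NodeSyncState {
    // base case: one scheduling decision for the whole block
    async fn execute_block(&self, digests: &[TransactionDigest]) -> Vec<SuiResult> {
        digests.iter().map(|_| Ok(())).collect()
    }

    // special case: a single digest is expressed as a block of one
    async fn execute_one(&self, digest: TransactionDigest) -> SuiResult {
        self.execute_block(&[digest]).await.remove(0)
    }
}
```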
digests: impl Iterator<Item = TransactionDigest>,
) -> impl Stream<Item = SuiResult> {
    let futures: FuturesOrdered<_> = digests
        .map(|digest| {
Same as above: better to handle blocks of transactions.
where
    A: AuthorityAPI + Send + Sync + 'static + Clone,
{
    async fn handle_digest(&self, follower: &Follower<A>, digests: ExecutionDigests) -> SuiResult {
Here again, I would send blocks of transactions from the follower on follower block boundaries.
Yes, my apologies - this code got fairly complex when dealing with the "wait for finality / trustworthy effects" problem, and it hasn't gotten any simpler since then. If I had time I would probably love to rewrite it and make it simpler, but we may be stuck with it for now.
That's a good point - I was thinking more about checkpoints (since that's what motivated this work), and in that case I'm pretty sure it's impossible to have an orphaned tx in a fragment. I will add support for parent fetching/execution.
Force-pushed from ec51269 to b6ac1c4
I thought some more about fetching parents. I think this issue was somewhat underexplored in the old code. Let's restrict the discussion to shared-object TXes that have been scheduled on a validator (checkpoints and nodesync are simple because we have final effects in hand in those cases):
Ok, so, all that said, it seems that to do this correctly, after encountering owned-object lock errors, we should execute the cert on other validators until we get an f+1 quorum that all give us identical effects in response. Then we can ask those same validators for the missing certs (recursively). This could potentially cause O(n^2) network traffic, so it seems we should be cautious about doing this. Also, this is a decent chunk of additional code.

Alternatively, the entire problem can be avoided by putting the responsibility on the client to push certs to as many validators as it can (which is what QuorumDriver already does) - if it fails to do so, and if gossip also fails to propagate the certs in question, then yes, some additional latency is possible. But I feel like going with the simpler, client-driven model is "the Sui way" and avoids a lot of potential pitfalls.

Currently the only user of the execution driver that doesn't also have true effects to start with is the shared-object tx case. So my questions are:
Discussed parent syncing with @gdanezis - decision is to add parent syncing so that the execution driver is guaranteed to make progress and acts as a backstop to the fallible processes of gossip and quorum driver.
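A rough sketch of what that backstop could look like -- everything here (types, `fetch_effects`, `is_executed`) is a hypothetical stand-in, and real code would need network retries and error handling:

```rust
use std::collections::HashSet;

type TransactionDigest = [u8; 32];

struct Effects {
    dependencies: Vec<TransactionDigest>, // parent transaction digests
}

fn fetch_effects(_d: TransactionDigest) -> Effects {
    Effects { dependencies: vec![] } // stand-in for a network fetch
}

fn is_executed(_d: TransactionDigest) -> bool {
    false // stand-in for a local store lookup
}

fn execute(_d: TransactionDigest) { /* stand-in for cert execution */ }

fn sync_with_parents(digest: TransactionDigest, seen: &mut HashSet<TransactionDigest>) {
    if !seen.insert(digest) || is_executed(digest) {
        return;
    }
    // Learn this cert's parents from its effects, execute parents first,
    // then execute the cert itself, guaranteeing forward progress.
    for parent in fetch_effects(digest).dependencies {
        sync_with_parents(parent, seen);
    }
    execute(digest);
}
```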
Re processing transactions in blocks - will do some benchmarking or profiling to determine whether that is necessary. Checkpoint sync (when it is working) may also serve as a good test case since it will eliminate much of the bookkeeping overhead.
This takes care of the performance TODOs in execution driver (for now, at least).
It also makes it acceptable for the checkpoint process to pend digests for which it doesn't have certs (they will be fetched).
Note that this does not automatically fetch parent certs, and assumes that they will be enqueued as well (or are already executed). If we're not comfortable with this assumption we can certainly add parent fetching to NodeSyncState.
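Roughly, the intended contract looks like this (names below are assumptions, not the actual code): a missing cert is fetched rather than treated as an error, while parents are assumed to arrive via their own enqueue or to have already executed.

```rust
type TransactionDigest = [u8; 32];

struct Cert;

fn local_cert(_d: TransactionDigest) -> Option<Cert> {
    None // stand-in for a local store lookup
}

fn fetch_cert_from_peers(_d: TransactionDigest) -> Cert {
    Cert // stand-in for a network fetch
}

fn execute_cert(_c: Cert) { /* stand-in; may fail if parents are missing */ }

fn process_pending(digest: TransactionDigest) {
    // Missing certs are fetched rather than treated as errors.
    let cert = local_cert(digest).unwrap_or_else(|| fetch_cert_from_peers(digest));
    // Parents are assumed to be enqueued separately or already executed.
    execute_cert(cert);
}
```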