Proposal: Consider immediate execution #7898

liamsi · 2020-06-17T13:57:05Z

This is a draft (will update shortly):

Summary

Consider enabling immediate execution, either by default or as an option that lets Tendermint users choose or override the current behaviour.

Problem Definition

Currently, Tendermint executes transactions one block height "off": in the current execution model, blocks are executed against the app only after they are committed.
Full block verification (incl. state) always needs access to the transactions of the previous block:

state(1) = InitialState
state(h+1) <- Execute(state(h), ABCIApp, block(h))

While that seems to be fine for most use cases, there are a few drawbacks:

First of all, it's confusing why transactions do not simply get executed in the same block (this is mostly a documentation concern). There is no clear documentation of why this decision was made in Tendermint. There are a few issues where this was discussed, but these discussions are difficult to find and they don't explain the decision well enough.
Dealing with off-by-one offsets is a classic source of bugs. That is a concern for app developers, or rather SDK module developers. E.g., a dev who was instrumental in designing and implementing the PoS module in the SDK confirmed that "it actually being quite annoying to deal with the +1 offset". This is mostly about developer usability. It also (unnecessarily?) complicates the app-centric point of view (ref: App centric interpretation of concepts #2483).
In the context of IBC, for some zones it might be annoying to essentially "wait" an extra block for the state to actually be updated. I'm not sure this is a real issue, but I can imagine that for some projects waiting a few extra seconds is at least suboptimal.
For certain fee models, deferred execution is a burden, or makes them hacky/impossible to implement: Further investigate execution celestiaorg/celestia-core#3 (comment) and Further investigate execution celestiaorg/celestia-core#3 (comment)
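
To make the "+1 offset" from the problem definition concrete, here is a minimal Go sketch. It is not Tendermint code: `Block`, `execute`, and `appHashes` are hypothetical names, and the state is reduced to a single hash. It only shows which state a header at height h can commit to under each model.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// Block is a hypothetical, stripped-down block: a height and raw txs.
type Block struct {
	Height int
	Txs    []string
}

// execute stands in for running a block against the app: the new state
// hash is derived from the previous state hash and the block's txs.
func execute(state [32]byte, b Block) [32]byte {
	h := sha256.New()
	h.Write(state[:])
	for _, tx := range b.Txs {
		h.Write([]byte(tx))
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

// appHashes returns, per block, the app hash its header could carry under
// delayed execution (state before the block runs) and under immediate
// execution (state after the block runs).
func appHashes(initial [32]byte, blocks []Block) (delayed, immediate [][32]byte) {
	state := initial
	for _, b := range blocks {
		delayed = append(delayed, state) // result of block h-1
		state = execute(state, b)
		immediate = append(immediate, state) // result of block h
	}
	return delayed, immediate
}

func main() {
	blocks := []Block{{1, []string{"a"}}, {2, []string{"b", "c"}}, {3, nil}}
	delayed, immediate := appHashes([32]byte{}, blocks)
	for i, b := range blocks {
		fmt.Printf("h=%d delayed=%x immediate=%x\n", b.Height, delayed[i][:4], immediate[i][:4])
	}
}
```

In the printed table, the delayed hash at height h+1 equals the immediate hash at height h, which is the one-block lag module developers have to account for.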

Proposal

First, the reasoning behind the current execution model needs to be documented. My understanding is that it is an optimization to reduce latency: validators can reach consensus on tx ordering quickly and then do the state transitions leisurely before timeout_commit kicks in. This should be done independently of the proposal to execute earlier.

TODO

longish discussion over at LazyLedger: Further investigate execution celestiaorg/celestia-core#3
Run CheckTx before voting on blocks #2384
app centric interpretation of concepts: App centric interpretation of concepts #2483
invalid/spam tx in blocks (sdk): Prevent Spam Txs cosmos/cosmos-sdk#4695
also a bit related (block pre-processing phase would happen before the block gets proposed in the immediate exec I guess): https://github.com/tendermint/tendermint/issues/2639
kinda related discussions on CheckTx: Run CheckTx before voting on blocks #2384 (comment)

For Admin Use

Not duplicate issue
Appropriate labels applied
Appropriate contributors tagged
Contributor assigned/self-assigned

liamsi · 2020-10-20T16:04:29Z

Note that this would likely make solving this issue easier, too: #3322

tessr · 2020-10-29T13:50:13Z

Evaluating this, and deciding whether we want to switch execution models, is now on the roadmap for the 0.35 milestone. (If we do decide to do this, the implementation may happen as part of the 1.0 milestone, but we'd like to make a decision on it as part of our 0.35 work.)

tac0turtle · 2020-11-03T14:44:12Z

Talking with @liamsi today, we came up with a potential design that could make this optional for applications.

With the proposed preprocess change, a proposer could execute the txs and provide an app hash. In the next phase (precommit) there is an ABCI call to the application (checkBlock or preCheck) that provides it with the header and data fields of the block. Validators could execute the txs to get the app hash that was added by the proposer. By keeping the Commit phase ABCI calls, the two new ABCI calls become optional. If the application would like to continue with delayed execution, it can make these ABCI calls no-ops.

Diagram: (image not preserved in this copy)

There are implementation details to be considered if this were to be accepted.
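
A rough Go sketch of how the two optional calls could look from the application side. Everything here is an assumption made for illustration: the `BlockChecker` interface, its method names and signatures, and the toy one-byte "state" are not part of ABCI. It only shows the proposer computing an app hash, validators re-executing to compare, and a delayed-execution app opting out with no-ops.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
)

// BlockChecker is a hypothetical interface for the two optional calls.
type BlockChecker interface {
	// PreprocessBlock runs on the proposer and returns the app hash for the
	// proposed header, or nil to keep delayed execution.
	PreprocessBlock(txs [][]byte) []byte
	// CheckBlock runs on validators before they vote.
	CheckBlock(proposedAppHash []byte, txs [][]byte) error
}

var ErrAppHashMismatch = errors.New("app hash mismatch")

// immediateApp executes txs up front. Its state is a toy running byte sum.
type immediateApp struct{ state byte }

var _ BlockChecker = (*immediateApp)(nil)

// run computes the would-be app hash without mutating the app.
func (a *immediateApp) run(txs [][]byte) []byte {
	s := a.state
	for _, tx := range txs {
		for _, c := range tx {
			s += c
		}
	}
	return []byte{s}
}

func (a *immediateApp) PreprocessBlock(txs [][]byte) []byte { return a.run(txs) }

func (a *immediateApp) CheckBlock(proposed []byte, txs [][]byte) error {
	if !bytes.Equal(proposed, a.run(txs)) {
		return ErrAppHashMismatch
	}
	return nil
}

// delayedApp opts out: both calls are no-ops.
type delayedApp struct{}

func (delayedApp) PreprocessBlock([][]byte) []byte   { return nil }
func (delayedApp) CheckBlock([]byte, [][]byte) error { return nil }

func main() {
	txs := [][]byte{[]byte("send"), []byte("stake")}
	proposer, validator := &immediateApp{}, &immediateApp{}
	hash := proposer.PreprocessBlock(txs)
	fmt.Println("honest proposal:", validator.CheckBlock(hash, txs))
	fmt.Println("wrong app hash:", validator.CheckBlock([]byte{hash[0] + 1}, txs))
	var legacy BlockChecker = delayedApp{}
	fmt.Println("delayed app:", legacy.PreprocessBlock(txs), legacy.CheckBlock(nil, txs))
}
```

The sketch leaves out what the node does with a `CheckBlock` error (for example, voting nil), since that is exactly the part still under discussion here.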

ebuchman · 2020-11-03T22:29:36Z

I'm not sure optionality is going to be an option here, without a massive refactor, given how pervasive assumptions about the execution model are, down to the data structures. So I'd think it would probably be all or nothing on this change.

It's hard to imagine that the performance concerns that initially justified delayed execution really hold a candle to the drastic UX issues this has caused, or to the speedups to come later from aggregate sigs. Not to mention possible economic problems. So figuring out how to move forward with immediate execution seems like the right idea.

That said, this problem has a pretty huge surface area, so it will be a lot to work through.

In the diagram above, it looks like Prevote and Precommit are in the wrong order? Also, re the complexity of optionality, I suspect if we do this there will be just one block execution, and there's no more excuse for blocks with invalid txs (i.e. txs which don't at least "pay gas"). This means that after we discover an executed block is invalid, we have to roll it back (note apps already have to support something similar to this, but not at all identical, for CheckTx and out-of-gas style rollbacks).

Presumably the options for when the execution happens are:

1. After receiving the proposal, before prevoting. This is simplest but seems to add the most latency.
2. After receiving the proposal, before precommitting. This way execution can happen concurrently while we're prevoting and waiting for prevotes from everyone. But this means we might end up with +2/3 prevotes ("polka") for a block that is actually invalid, which may have liveness implications we need to be careful about.

Implementation strategy might support starting with (1) which is easier and then transitioning to (2) at some point once things are further worked out. But everyone would have to bear the latency hit of (1) in the meantime and it could be non-trivial. Note the difference in implementation complexity between these two options should be tiny compared to the difference between delayed and immediate execution.
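
The two timing options can be pictured with a small Go sketch. This is not consensus code: `executeBlock`, `optionOne`, `optionTwo`, and the rule that an empty tx makes a block invalid are all made up for illustration. Option (1) executes and then votes; option (2) starts execution in a goroutine, prevotes immediately, and only waits for the result before precommitting.

```go
package main

import (
	"fmt"
	"time"
)

type result struct {
	appHash string
	valid   bool
}

// executeBlock is a placeholder for running the block against the app.
// In this toy model a block is invalid if it contains an empty tx.
func executeBlock(txs []string) result {
	time.Sleep(5 * time.Millisecond) // pretend execution takes a while
	valid := true
	for _, tx := range txs {
		if tx == "" {
			valid = false
		}
	}
	return result{appHash: fmt.Sprintf("hash-of-%d-txs", len(txs)), valid: valid}
}

func vote(kind string, valid bool) string {
	if valid {
		return kind + " block"
	}
	return kind + " nil"
}

// optionOne executes before prevoting, so both votes reflect validity.
func optionOne(txs []string) []string {
	res := executeBlock(txs)
	return []string{"execute", vote("prevote", res.valid), vote("precommit", res.valid)}
}

// optionTwo overlaps execution with prevoting. The prevote is cast before
// validity is known; "execute" is recorded at the point the result is read.
func optionTwo(txs []string) []string {
	done := make(chan result, 1)
	go func() { done <- executeBlock(txs) }()
	steps := []string{"prevote block"}
	res := <-done
	return append(steps, "execute", vote("precommit", res.valid))
}

func main() {
	bad := []string{"ok", ""}
	fmt.Println("option 1:", optionOne(bad))
	fmt.Println("option 2:", optionTwo(bad))
}
```

With an invalid block, option (2) still prevotes for it and only precommits nil, which is the "polka for an invalid block" situation mentioned above.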

We have another degree of freedom around whether proposers execute before proposing or at the same time as everyone else, i.e. they could propose a block without an app hash, and then everyone fills in the same app hash before precommitting. This doesn't matter much for (1), but for (2) it complicates the relationship between prevotes and precommits (since prevotes wouldn't have the app hash and precommits would), so in the case of (2) proposers should probably execute before proposing.

There's also a question about the off-by-1 of commits, which I believe is independent of execution. Right now canonical commits are in the next block. Maybe this isn't as big a deal since a valid commit is always available once a block is committed, it just may not be canonical.

Of course there are lots of possible downstream performance improvements coming from pipelining and more interplay between mempools, block propagation, and speculative execution. But this can all get pretty complex quick :) . I also wonder sometimes whether a 3rd phase of voting à la HotStuff might simplify some of this :P

tac0turtle · 2020-11-04T09:52:01Z

> It's hard to imagine that the performance concerns that initially justified delayed execution really hold a candle to the drastic UX issues this has caused, or to the speedups to come later from aggregate sigs. Not to mention possible economic problems. So figuring out how to move forward with immediate execution seems like the right idea.

👍 👍 👍 👍 👍

> In the diagram above, it looks like Prevote and Precommit are in the wrong order?

I accidentally swapped them in the diagram.

> But everyone would have to bear the latency hit of (1) in the meantime and it could be non-trivial.

This came up in our conversation and we came to a similar conclusion. With various changes to Tendermint, this latency could be reduced. On top of changes in Tendermint, an application could change how it executes transactions to increase speed, which would help as well.

cmwaters · 2020-11-05T10:06:45Z

> there is an ABCI call to the application (checkBlock or preCheck) that provides it with the header and data fields of the block.

Just to confirm my understanding here: as well as checking that the app hash is the same, if any of the txs were invalid with respect to the app, then the app would also return an error and the node would prevote/precommit nil? If this is the case, what happens if 2/3+ vote for this invalid block? Does the node panic?

> So I'd think it would probably be all or nothing on this change.

I don't know the details too well but I would tend to agree with this. I feel like we would be too stretched trying to offer both immediate and delayed execution, and it also might come across as a bit confusing for developers.

A few other things I would like to comment on. If we do execution before consensus is reached, and we reach a timeout or don't get votes because the block is invalid in any way, then applications will need to be able to revert all the txs that they just applied. I don't think we have asked apps to do that before, right? Usually if a tx is invalid it is just dropped during DeliverTx. Also, in the case of multiple rounds (which I know very seldom occurs) we will have to do this operation multiple times.

alexanderbez · 2020-11-05T14:11:03Z

> If we do execution before consensus is reached and we reach timeout or don't get votes because it is invalid in any way than applications will need to be able to revert all the txs that they just applied.

Application state is "cached" so to say, it's not persisted or finalized until ABCI#Commit, which I presume we would not call before there is consensus?

Note, invalid txs can and do make it into blocks atm.

liamsi · 2020-11-05T22:14:00Z

> Just to confirm my understanding here, as well as checking that the app hash is the same, if any of the txs were invalid with respect to the app then the app would also return an error, and then the node would prevote/precommit nil? If this is the case what happens if 2/3+ vote for this invalid block? does the node panic?

I guess they would vote nil (as the proposed block wasn't valid in that sense).

> If this is the case what happens if 2/3+ vote for this invalid block?

Similarly, that isn't related to the execution model and would require the same measures as currently if 2/3+ votes were collected for an otherwise invalid block. Ideally, a block proposer proposing an invalid block would be slashed imo. Fraud proofs as evidence to be included in the next block could help to enforce slashing a proposer that included invalid state transitions.

As Bez said, the app could execute the Tx and cache the result and on commit, it gets actually applied.

The ABCI method that would be called (e.g. on propose by the proposer, and before prevoting by the other nodes) should filter out all invalid txs and return the valid txs, the app state (app hash), and the pending ABCI updates. On commit, the app state is actually updated and the ABCI updates are applied to update the Tendermint state.
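
The "execute, cache, apply on commit" idea from the last few comments could look roughly like this in Go. It is a toy sketch under stated assumptions: `App`, `ExecuteBlock`, `Rollback`, and the "key=value" tx format are hypothetical, and the app hash and pending ABCI updates mentioned above are left out to keep it short.

```go
package main

import (
	"fmt"
	"strings"
)

// App is a toy key/value application. Txs have the form "key=value".
type App struct {
	committed map[string]string // state persisted by Commit
	pending   map[string]string // writes staged by the block being executed
}

func NewApp() *App { return &App{committed: map[string]string{}} }

// ExecuteBlock is a hypothetical call made before voting. It drops malformed
// txs, stages the writes of the valid ones, and returns the txs it kept.
func (a *App) ExecuteBlock(txs []string) (valid []string) {
	a.pending = map[string]string{} // discard anything left from an earlier round
	for _, tx := range txs {
		key, value, ok := strings.Cut(tx, "=")
		if !ok || key == "" {
			continue // invalid tx: filtered out, nothing staged
		}
		a.pending[key] = value
		valid = append(valid, tx)
	}
	return valid
}

// Commit applies the staged writes once consensus on the block is reached.
func (a *App) Commit() {
	for k, v := range a.pending {
		a.committed[k] = v
	}
	a.pending = nil
}

// Rollback drops the staged writes, e.g. after a failed round.
func (a *App) Rollback() { a.pending = nil }

func main() {
	app := NewApp()

	// Round 0: the block is executed but the round fails, so nothing persists.
	fmt.Println("kept:", app.ExecuteBlock([]string{"alice=10", "garbage"}))
	app.Rollback()
	fmt.Println("after rollback:", app.committed)

	// Round 1: a new proposal is executed and then committed.
	fmt.Println("kept:", app.ExecuteBlock([]string{"alice=7", "bob=3"}))
	app.Commit()
	fmt.Println("after commit:", app.committed)
}
```

Because writes only reach `committed` in `Commit`, reverting a block that never gets consensus is just a matter of dropping the staged map, including across multiple rounds.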

tac0turtle transferred this issue from tendermint/tendermint Nov 3, 2020

tac0turtle added the C:abci (Component: Application Blockchain Interface), C:consensus (Component: Consensus), and S:proposal (Status: Proposal) labels Nov 3, 2020

liamsi mentioned this issue Nov 3, 2020

Further investigate execution celestiaorg/celestia-core#3

Closed

liamsi mentioned this issue Dec 15, 2020

abci: add preprocess block celestiaorg/celestia-core#110

Merged

liamsi mentioned this issue Jan 4, 2021

Keep track of and return intermediate state roots to be included in the block celestiaorg/cosmos-sdk#8

Closed

adlerjohn mentioned this issue Jan 6, 2021

Replace IAVL+ with SMT celestiaorg/cosmos-sdk#6

Open

liamsi mentioned this issue Feb 8, 2021

Implement the PreprocessTxs ABCI method using WirePayForMessage celestiaorg/celestia-app#22

Closed

liamsi mentioned this issue Feb 22, 2021

Implement PreprocessTxs celestiaorg/celestia-app#21

Merged

14 tasks

evan-forbes mentioned this issue Jul 16, 2021

Proposal: Immediate execution to accomodate ISRs celestiaorg/celestia-core#463

Closed

cmwaters transferred this issue from tendermint/spec Feb 21, 2022

cmwaters mentioned this issue Apr 4, 2022

Implement _same-block execution_ mode #8233

Closed

4 tasks

kaldubin mentioned this issue Apr 14, 2022

Is CheckTx in recheck mode run in parallel with Proposal ? (Tendermint 0.34.11) #8350

Closed

evan-forbes mentioned this issue May 21, 2022

Charge gas per message share celestiaorg/celestia-app#431

Closed

github-actions bot added the stale label (for use by stalebot) Aug 21, 2023

github-actions bot closed this as not planned Aug 26, 2023
