Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The unbounded gap between head receipt and the best domain block #1673

Closed
NingLin-P opened this issue Jul 18, 2023 · 13 comments · Fixed by #2933
Closed

The unbounded gap between head receipt and the best domain block #1673

NingLin-P opened this issue Jul 18, 2023 · 13 comments · Fixed by #2933
Labels
execution Subspace execution

Comments

@NingLin-P
Copy link
Member

Original from #1650 (comment)

In the domain v2 design, there is an unbounded gap between the head receipt and the best domain block, which means unbounded confirmation of the domain block.

The gap is introduced because ER is submitted with a bundle and there is only one ER within a bundle. In the worst case, the ER confirms the current head receipt thus head receipt is not extended but the bundle will result in a new domain block and increase the best domain block by one thus adding one block to the gap. In the best case, the ER extends the head receipt by one but the bundle will result in a new domain block so increasing the best domain block by one thus the gap remains.

So once the gap happens it will ever only grow but not shorten, and there is no limit to bound this gap. One good new is that if this issue is fixed then the unbound hash storage problem is also fixed (an open problem in domain v1).

@nazar-pc
Copy link
Member

nazar-pc commented Nov 2, 2023

Why is it possible to put old ER into the bundle? I thought when bundle is built it must point to the latest ER seen so far or else it is not valid.

@vedhavyas
Copy link
Contributor

vedhavyas commented Nov 2, 2023

Once we introduce the constraint where all the bundles within one consensus block extends the head_receipt, then the gap between head_receipt and best_domain_block will be just 1.

Here is my thinking on this
For example,

  • Current state on consensus chain at block #n

    • head_receipt = 0 and best_domain_block = 1 since any bundles in consensus block #n would increase best_domain_block from 0 to 1. And all the bundles within this consensus block will contain ER #0 else they are rejected
  • At consensus block #n+1,

    • Any bundles that do not include ER #1 will be rejected and all the bundles within this consensus block with ER#1 will be accepted.
    • Then the state would be head_receipt = 1 and best_domain_block = 2
  • At consensus block #n+2,

    • Any bundles that do not include ER #2 will be rejected and all the bunldes within this consensus block with ER #2 will be accepted.
    • Then the state would be head_receipt = 2 and best_domain_block = 3

This should not cause any unbounded gap between head_receipt and best_domain_block

Am I missing something here @NingLin-P ?

@NingLin-P
Copy link
Member Author

Why is it possible to put old ER into the bundle? I thought when bundle is built it must point to the latest ER seen so far or else it is not valid.

In main, we also accept the current head receipt which is the latest ER seen by the consensus chain but may be not the latest ER that is derived from the latest domain block. And this is exactly what #1673 does, to only accept the latest ER that is derived from the latest domain block.

Once we introduce the constraint where all the bundles within one consensus block extends the head_receipt, then the gap between head_receipt and best_domain_block will be just 1.
Am I missing something here @NingLin-P ?

Yeah, this is also what we expected the new constraint would bring, but it is just for the normal cases not when fraud proof is involved. Because fraud proof will prune the bad receipt and revert the head_receipt, so every fraud proof will bring a gap of head_receipt_number - bad_receipt_number.

When there is no fraud proof submitted, the gap remains the same, whenever new fraud proof is submitted, the gap grows by head_receipt_number - bad_receipt_number, and the gap never reduces, so it is unbounded.

@vedhavyas
Copy link
Contributor

so every fraud proof will bring a gap of head_receipt_number - bad_receipt_number.

Why, with our assumption that there is one honest operator in the network. If there is bad ER, the operator will be forced to submit a fraud proof before submitting the bundle and once that bad er and its descendents are pruned, the honest operator will continue to produce new good ER from that point.

@NingLin-P
Copy link
Member Author

Even in the best case, where there is only one bad ER, namely the head receipt is the bad ER, and the honest operator submits the fraud proof immediately in the next block, the gap will be increased by 1.

Besides, our assumption is the honest operator submit a fraud proof before the bad ER is confirmed, so there can be more than one bad ER build on top of the first bad ER. Depending on the number of bad ER, every pruned bad ER will increase the gap by 1.

@vedhavyas
Copy link
Contributor

Besides, our assumption is the honest operator submit a fraud proof before the bad ER is confirmed, so there can be more than one bad ER build on top of the first bad ER. Depending on the number of bad ER, every pruned bad ER will increase the gap by 1.

Hmmm, this would not be case if we also reset the best_domain_block, which is what causing the gap.
My reasoning behind this is, since bundles are built on top of the bad ER and/or their decendant ERs which will be pruned. These bundles are inherently bad bundles even though they may have some/all valid transactions.

An honest node should not build new blocks once they see a bad ER and once pruned, they can continue from good best_domain_number and head_receipt_number. Ofcourse users need to resubmit their transactions and this is an implicit assumption where until the domain block is confirmed, their transactions are not confirmed

@NingLin-P
Copy link
Member Author

since bundles are built on top of the bad ER and/or their decendant ERs which will be pruned.

Emm... bundle isn't built on top of ER, bundle just a collection of domain extrinsic, it has no relation with ER or other bundles, bundle and ER just submitted together.

These bundles are inherently bad bundles even though they may have some/all valid transactions.

Then why are they inherently bad bundles at all, we already have check to detect invalid bundle and exclude them from execution, I would say this is the only definition of bad bundle and any bundle that can pass the check is a good bundle.

An honest node should not build new blocks once they see a bad ER and once pruned, they can continue from good best_domain_number and head_receipt_number.

The best domain block number increases whenever there is a bundle accepted in the consensus chain, this is determined in the consensus chain and the operator just follows the consensus chain. Without further considering feasibility, IMO this change just brings more work and complexity than allowing submitting multiple ER within on bundle.

@vedhavyas
Copy link
Contributor

Emm... bundle isn't built on top of ER, bundle just a collection of domain extrinsic, it has no relation with ER or other bundles, bundle and ER just submitted together.

I disagree. All the bundles in a given consensus block form next domain block and ER in the bundle are inherently tied to the Execution of the bundles, and domain block, in the previous consensus block. They are tied together.

Then why are they inherently bad bundles at all, we already have check to detect invalid bundle and exclude them from execution, I would say this is the only definition of bad bundle and any bundle that can pass the check is a good bundle.

Disagree here as well since the extrinsincs inside these bundles and their validity is strictly tied to the execution of the domain that is tied to the ER they submnitted. If a specific ER is invalid, then anything that ties to this ER should be invalid as well.

The best domain block number increases whenever there is a bundle accepted in the consensus chain, this is determined in the consensus chain and the operator just follows the consensus chain. Without further considering feasibility, IMO this change just brings more work and complexity than allowing submitting multiple ER within on bundle.

IMHO, the problem we are seeing is due to the fact that we consider Bundles and ER are strinctly connected and causing this issues. This is also causing a lot of edge cases that we are finding when we actually run them.
Ofcourse this change brings more changes but I believe this also cleans up a lot of complexity since bundles extends domain blocks and ER cofirms that domain block's execution offchain.

Any thoughts on this @nazar-pc ?

@NingLin-P
Copy link
Member Author

Disagree here as well since the extrinsincs inside these bundles and their validity is strictly tied to the execution of the domain that is tied to the ER they submnitted. If a specific ER is invalid, then anything that ties to this ER should be invalid as well.

The invalid bundle check is performed against the execution result of the parent domain block, how can you say a bundle is invalid even if all of its extrinsic can pass all the checks?

IMHO, the problem we are seeing is due to the fact that we consider Bundles and ER are strinctly connected and causing this issues.

To me, the problem is simply because the fraud proof can prune multiple bad ER at once while the honest operator can only submit one ER within one bundle and the bundle itself introduces a new ER to be submitted in the future, thus the problem is the inequality.

This is also causing a lot of edge cases that we are finding when we actually run them.

Could you be more explicit about what edge cases and how your proposed change can fix them? As I see we are discussing the gap issue when fraud proof is involved and we certainly have not seen a fraud proof in the network before.

Ofcourse this change brings more changes but I believe this also cleans up a lot of complexity since bundles extends domain blocks and ER cofirms that domain block's execution offchain.

As a reminder, your proposed change will revert the decision we have made on serval topics:

  • Bundles are not considered invalid due to it is submitted with a bad ER
  • Let the consensus chain decide when the best domain number is increased instead of letting the operator decide in offchain

They are all made out of good reason, and exist through the development of domain v2 (along with the notion of "bundle and ER just submit together and not tied together"), to revert them is quite a fundamental change to me (again without further considering feasibility), thus would prefer to see what exact benefit it can bring and whether it is worth first.

@nazar-pc
Copy link
Member

nazar-pc commented Nov 6, 2023

Then why are they inherently bad bundles at all

Any operator that submits invalid bundle or extends an invalid ER will be slashed. They are supposed to all run execution themselves, only extend valid ERs and submit fraud proofs as soon as they see invalid execution.

how can you say a bundle is invalid even if all of its extrinsic can pass all the checks?

If it extends invalid ER it is automatically invalid as well and corresponding operator that produced it must be slashed too.

IMHO, the problem we are seeing is due to the fact that we consider Bundles and ER are strinctly connected and causing this issues.

This was done as a simplification of the architecture because splitting them was causing other difficulties.


If either of you have a specific proposal on how to solve this (not clear if you have completely though it through) create a clear suggestion on research forum and get research team involved. This is not something that we should take lightly and Jeremiah will also need to be informed on this.

@NingLin-P
Copy link
Member Author

Any operator that submits invalid bundle or extends an invalid ER will be slashed. They are supposed to all run execution themselves, only extend valid ERs and submit fraud proofs as soon as they see invalid execution.
If it extends invalid ER it is automatically invalid as well and corresponding operator that produced it must be slashed too.

Yeah, this is the exact behavior of how it works right now in main, and the multiple ER change doesn't change/conflict with this behavior.

@vedhavyas
Copy link
Contributor

I dont think this issue is relevant anymore with the recent changes that ensure bundles carry ER of the HeadDomain block number.
If you still see an issue here @NingLin-P , please post your concenrs here. Thanks!

@NingLin-P
Copy link
Member Author

with the recent changes that ensure bundles carry ER of the HeadDomain block number

IIRC, we have ensured the bundles carry ER of head_receipt_number + 1 but not the HeadDomain block number.

The issue still exist IMO, consider the head_receipt_number is 99 and the HeadDomain block number is 100, as fraud proof that target the ER of block 50 and reverts the head_receipt_number to 49, the HeadDomain block number is still 100 hence the gap still exists.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
execution Subspace execution
Projects
Status: Done
4 participants