
Finality Verifier Pallet #628

Closed · HCastano opened this issue Jan 7, 2021 · 11 comments
Labels: A-feat (New feature or request), P-Runtime

HCastano (Contributor) commented Jan 7, 2021

The current Substrate header sync pallet is susceptible to several attacks. These attacks involve being able to write to storage indefinitely (as in the case of #454), or requiring potentially unbounded iteration in order to make progress (as in the case of #367).

To work around these issues, the following is being proposed. Instead of accepting headers and finality proofs directly into the existing pallet, we should go through a "middleware" pallet. The role of this new pallet will be to verify finality and ancestry proofs for incoming headers and, if successful, write the headers to the base Substrate pallet.

This essentially flips the flow of the current pallet, in which we first import headers and only later verify whether they're actually part of the canonical chain (through finality proof verification). By verifying finality first and writing headers afterwards, we ensure that only "known good" headers are written to the base pallet.

Additionally, by keeping the base pallet (more or less) as is, we ensure that applications building on the bridge can continue to have access to all headers from the canonical chain.

The proposed API for the pallet is as follows:

// The ancestry proof should contain headers from (`last_finalized_header`, `finalized_header`]
fn submit_finality_proof(justification: GrandpaJustification, ancestry_proof: Vec<Header>);

If this call is successful, the headers from the ancestry proof would be written to the base pallet.
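
A minimal sketch of how such a call might be structured; the helper names (verify_justification, ensure_contiguous_chain, BasePallet::import_header) are hypothetical and for illustration only, not the actual pallet API:

// Sketch only: helper names are hypothetical, not the actual pallet API.
fn submit_finality_proof(
    justification: GrandpaJustification,
    ancestry_proof: Vec<Header>,
) -> Result<(), Error> {
    // The last header in the proof is the newly finalized header.
    let finalized_header = ancestry_proof.last().ok_or(Error::EmptyProof)?;

    // 1. Verify the GRANDPA justification against the current authority set.
    verify_justification(&justification, finalized_header)?;

    // 2. Check that the ancestry proof forms a contiguous chain from the last
    //    finalized header up to (and including) the new one.
    ensure_contiguous_chain(&last_finalized_header(), &ancestry_proof)?;

    // 3. Only now write the "known good" headers into the base pallet.
    for header in ancestry_proof {
        BasePallet::import_header(header)?;
    }
    Ok(())
}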

Optionally, if the proofs don't fit into a single block, we can also have a "chunk"-based API to allow relayers to submit proofs across several blocks:

// The `merkle_root` would be the root of a trie containing all the headers you plan to submit
fn signal_long_proof(deposit: T::Balance, finality_proof: GrandpaJustification, merkle_root: Vec<u8>);

// The chunk submitted here would be the leaves of the merkle trie
fn submit_chunk(ancestry_chunk: Vec<Header>);

After a call to signal_long_proof() you'd be able to call submit_chunk() as many times as required to complete the proof. We may want to restrict others from submitting proofs during this period, and add timeouts in case we don't receive a chunk, or don't complete the proof, within a certain amount of time. If we time out, or the proof is invalid, the submitter loses their deposit.
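
To make the deposit and timeout bookkeeping concrete, here is a rough sketch of the state such a pallet might keep between the two calls; all field and type names are illustrative assumptions, not a worked-out design:

// Sketch only: a single storage value holding the in-progress long proof;
// all field and type names are illustrative.
struct PendingProof<AccountId, Balance, BlockNumber> {
    submitter: AccountId,                  // account that placed the deposit
    deposit: Balance,                      // slashed on timeout or invalid proof
    finality_proof: GrandpaJustification,  // justification for the final header
    merkle_root: Vec<u8>,                  // root of the trie of headers being submitted
    chunk_deadline: BlockNumber,           // block by which the next chunk must arrive
}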

adoerr (Contributor) commented Jan 7, 2021

Could submit_chunk() be abused for some sort of DoS attack? Basically, calling submit_chunk() without any prior signal_long_proof(), forcing the pallet to repeatedly engage in some sort of costly housekeeping activities.

HCastano (Contributor, Author) commented Jan 7, 2021

Could submit_chunk() be abused for some sort of DoS attack? Basically, calling submit_chunk() without any prior signal_long_proof(), forcing the pallet to repeatedly engage in some sort of costly housekeeping activities.

I think that upon calling signal_long_proof() we could write the origin of the call to storage. When submit_chunk() is called, we could check that the origin is the same as the account that placed the deposit.
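
A sketch of what that check could look like inside submit_chunk(), assuming a hypothetical pending_long_proof() storage getter that returns the state sketched above:

// Sketch only: storage getter and error names are hypothetical.
fn submit_chunk(origin: AccountId, ancestry_chunk: Vec<Header>) -> Result<(), Error> {
    // Reject chunks unless a long proof was signalled first...
    let pending = pending_long_proof().ok_or(Error::NoPendingProof)?;

    // ...and only accept them from the account that placed the deposit.
    if origin != pending.submitter {
        return Err(Error::NotTheSubmitter);
    }

    // Verify `ancestry_chunk` against `pending.merkle_root` and record progress here.
    Ok(())
}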

tomusdrw (Contributor) commented Jan 8, 2021

LGTM! There are some concerns with the long_proof/chunk-based API, but we shouldn't focus on it initially. I really hope that something like #263 will resolve that kind of issue for us. I can imagine that headers can be retroactively imported after the fact if that's desired, i.e. we have the finality proof plus an ancestry check via MMR, and only then do we allow importing the missing headers preceding finality.

submit_chunk should contain a Merkle proof of the header(s) instead of simply the leaves, but as said earlier, the main focus should initially be on the happy path.

HCastano (Contributor, Author) commented:

Reposting @svyatonik's comment from #629 here since this is a better place for this discussion.

Code changes lgtm in context of introducing separate pallet, but I have a couple of questions about where it all heads now. Why would you need this separate finality-verifier pallet at all (and this change particularly)? #628 says that there are problems with pallet-substrate-bridge && to work around these problems you're going to introduce additional pallet. What is going to happen with substrate-pallet calls? Will they still be accessible? I guess the answer is no - you can't leave this hole open, right? So I guess the current idea is to disable substrate pallet calls and expose finality-verifier pallet calls instead? But in this case - why have two pallets, one of which only provides storage and second only provides calls to update this storage?

Now to answer some of your questions.

What is going to happen with substrate-pallet calls? Will they still be accessible? I guess the answer is no - you can't leave this hole open, right?

This is correct, those calls would be removed. The only way to write to the pallet would be through the Finality Verifier.

But in this case - why have two pallets, one of which only provides storage and second only provides calls to update this storage?

This is a good point. Initially, I think it makes sense to have two separate pallets. This allows us to explore a new approach without removing any functionality from the system as a whole. If we do deem this approach to be successful we can look into refactoring the pallets.

tomusdrw (Contributor) commented:

If we do deem this approach to be successful we can look into refactoring the pallets.

Yes, long term it should be a single pallet; right now let's try to minimize the amount of changes, especially with the audit looming.

tomusdrw (Contributor) commented:

For the 2-phase fallback process, it's possible to jump 250 headers at once in ancestry even without using MMR (see #263). We can use frame_system storage holding last BlockHashCount headers, and with every jump back from the finality proof, we take a storage proof and skip all 249 headers we don't really need for anything.
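
A sketch of how the target chain might check such a storage proof; verify_storage_read and block_hash_storage_key are stand-ins for a real trie read-proof checker and the usual frame_system key construction, not actual APIs:

// Sketch only: `verify_storage_read` stands in for a real read-proof checker,
// and `block_hash_storage_key` for the usual frame_system key layout
// (twox128("System") ++ twox128("BlockHash") ++ hashed block number).
fn check_ancestry_via_storage_proof(
    new_header: &Header,          // finality target; carries the source chain's state root
    best_finalized: &Header,      // header the bridge currently considers final
    storage_proof: Vec<Vec<u8>>,  // proof of System::BlockHash(best_finalized.number)
) -> Result<(), Error> {
    let key = block_hash_storage_key(best_finalized.number);
    let value = verify_storage_read(&new_header.state_root, storage_proof, &key)?
        .ok_or(Error::MissingStorageValue)?;

    // If the source chain's storage at `new_header` maps our best finalized
    // number to our best finalized hash, then `new_header` extends the chain
    // we already know, and all the headers in between can be skipped.
    if value != best_finalized.hash().as_ref() {
        return Err(Error::AncestryMismatch);
    }
    Ok(())
}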

HCastano (Contributor, Author) commented Feb 1, 2021

For the 2-phase fallback process, it's possible to jump 250 headers at once in ancestry even without using MMR (see #263). We can use frame_system storage holding last BlockHashCount headers, and with every jump back from the finality proof, we take a storage proof and skip all 249 headers we don't really need for anything.

So I'm trying to make sure I understand you correctly here. What you're saying is that we can use the BlockHash<T> storage map (whose size is capped at BlockHashCount) to create a storage proof on the source chain. This proof can then be submitted to the target chain instead of an ancestry proof, to prove that finality_target and finality_target - BlockHashCount are indeed part of the source chain's DB (and thus related).

Is this the right idea?

tomusdrw (Contributor) commented Feb 3, 2021

Is this the right idea?

Yup, it's exactly that.

HCastano (Contributor, Author) commented Feb 4, 2021

One of the problems we've been running into with the current design is handling long chains of finalized headers (e.g. if GRANDPA has been offline for a while and finalizes a chain of 30,000 headers in one go).

The first way we thought of handling this was the multi-stage protocol proposed in the initial issue. After talking with @tomusdrw, @andresilva, and @adoerr today, we came to an agreement that the multi-stage API proposed here would be complicated to implement and hard to secure.

The next solution was to use sparse ancestry proofs. This can be done either with MMRs or with storage proofs, as proposed by Tomek in an earlier comment. The problem with these sparse proofs is that, since we wouldn't be importing every single header into the base pallet, we couldn't guarantee that all authority set transitions happened correctly.

Consider the following scenario:

The source chain has best_finalized as H1. Let's also say that H5 schedules and enacts the next authority set change (set 1 -> set 2). Without importing H5, the target chain would not know that there are new authorities signing off on blocks starting at H5.

A set of malicious set 1 authorities could collude and finalize H10'. This would have a "valid" justification as far as the target chain is concerned, so it could be imported successfully, thus tricking the bridge.

Since the authorities are not bonded on the bridge, and there's no way to report this misbehaviour to GRANDPA, there are no downsides for the authorities involved in this attack.

To mitigate this, we will limit the finality_target submitted to the bridge to the next header which schedules a change (in our earlier example, this would be H5). This header is guaranteed to have been finalized by GRANDPA.

We do need to make some changes to the ConsensusLog of this header so that the next set_id is included. By having this as part of the digest, we can check that set_ids are increasing monotonically, and that authority sets are thus transitioning correctly.
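
A sketch of what that check might look like when importing the scheduling header, assuming the ConsensusLog is extended as described so that the digest also carries the next set_id; the storage items and helper names here are illustrative, not existing APIs:

// Sketch only: assumes a ScheduledChange digest extended with `next_set_id`,
// as proposed above; storage item and helper names are illustrative.
fn note_scheduled_change(header: &Header) -> Result<(), Error> {
    if let Some(change) = find_scheduled_change(header) {
        let current_set_id = CurrentAuthoritySetId::get();

        // The set id must increase monotonically (by exactly one), otherwise a
        // stale authority set could produce "valid looking" justifications for
        // a fork, as in the H10' scenario above.
        if change.next_set_id != current_set_id + 1 {
            return Err(Error::InvalidAuthoritySetId);
        }

        // Record the new authorities; they take over for blocks at and after
        // the enactment point.
        NextAuthoritySet::put((change.next_authorities, change.next_set_id));
    }
    Ok(())
}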

tomusdrw (Contributor) commented Feb 5, 2021

The final API should look like this:

enum AncestryProof {
  /// Full header-chain between current best finalized header (`current_best`) and the new one.
  Chain(Vec<Header>),
  /// A runtime storage proof of `frame_system::block_hash(current_best.number)`.
  ///
  /// The storage proof is verified against the new header that is being imported.
  /// If the storage value matches the hash of our best finalized header, we can be sure
  /// that we are extending the right chain.
  StorageProof(Vec<u8>),
  /// Merkle Mountain Range Proof
  /// 
  /// Reserved for future use when MMR comes along.
  MMRProof(Vec<u8>),
}

trait GrandpaLightClient {
    fn append_header(
        header: Header,
        justification: GrandpaJustification,
        ancestry_proof: AncestryProof,
    );
}

I've depicted this as an enum just to make it easier to understand and to show that the API stays the same regardless of the proof being used, but the pallet will most likely only support one method at a time (this simplifies weights significantly).

The pallet logic should basically:

  1. Verify the justification & ancestry proof.
  2. Check that the justification is signed by the validator set it knows of.
  3. If the header contains a ScheduledChanges digest item, it should store that and perform the set transition once a block beyond the enactment point is imported (note that we might still import blocks before the enactment, and they should be verified against the current validator set).

Since GRANDPA is guaranteed to always create a justification for the ScheduledChanges block, this will work just fine and no extra measures are necessary.

Later on, once Andre implements those changes, we would additionally verify that the validator-set-id is incrementing, but note that we can easily implement everything already and it will work just fine.
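
A minimal sketch of how steps 1-3 above could fit together inside append_header; the helper names are illustrative, not the actual pallet API:

// Sketch only: all helper names are illustrative.
fn append_header(
    header: Header,
    justification: GrandpaJustification,
    ancestry_proof: AncestryProof,
) -> Result<(), Error> {
    // 1. The new header must descend from the current best finalized header.
    verify_ancestry(&ancestry_proof, &header, &best_finalized())?;

    // 2. The justification must be signed by the authority set we know of.
    let (authorities, set_id) = current_authority_set();
    verify_justification(&justification, &header, &authorities, set_id)?;

    // 3. If the header schedules an authority set change, store it; the actual
    //    transition happens once a block past the enactment point is imported,
    //    while earlier blocks are still checked against the current set.
    if let Some(change) = find_scheduled_change(&header) {
        schedule_authority_set_change(change);
    }

    // Only a fully verified header becomes the new best finalized header.
    update_best_finalized(header);
    Ok(())
}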

tomusdrw (Contributor) commented:

Closing, since it's already implemented.
