Fast Finality in Filecoin (FIP-0086) #809
Replies: 16 comments 64 replies
-
This looks promising. Thanks for the research, conceptualization and communication work here, @hmoniz! Could you elaborate on the following topics? Some or all of these might be specified in the paper, but I think it's worth hoisting them to the discussion for easier accessibility and alignment.
|
Beta Was this translation helpful? Give feedback.
-
Thank you team for getting this up! Bring my question from another private forum here: could you please shed some light on why the existing EC requires 900 epoch for finality? Im sure there are many challenges/issues to “simply” bring that number down, and would like to understand a bit more there! |
Beta Was this translation helpful? Give feedback.
-
Thanks for getting this started! I'm excited for Granite, and think this proposal is well-motivated. @adlrocha was kind enough to give me a walkthrough of Granite a while ago, and I think it makes sense. I'm looking forward to feedback on the high-level idea from more people (in particular, anyone who thinks this is a bad idea). I'm also looking forward to flesh out the details of how we would introduce this to Filecoin. I think some important blanks to fill include:
|
Beta Was this translation helpful? Give feedback.
-
One thing we should discuss is that of a catch up mechanism in Granite (disregarding long-range attacks prevention as this FIP is not intending to solve that). That is, what a participant does when it disconnects for some time and reconnects later. I see 3 options: 1- Add to block headers a new parameter to the latest finalized block (locally seen), that includes some verifiable proof of finality. An example of a proof of finality is a list of the signed DECIDE messages signed by enough participants to account for 2/3s of the power in the power table. That is prohibitively expensive though, and threshold aggregation with weights is non-trivial. BLS aggregation can provide verifiability in about 500 Bytes with the current number of participants, for example. It can make support for committees more difficult though (see for example this WIP doc). 2- Implement a catchup procedure that navigates from the oldest power table known before the participant went offline, waits for the participants in that power table to re-send to the participant catching up signed DECIDE messages accounting for 2/3s of the power in the power table, and iterate through Granite decisions until reaching the head. This can be optimized by requesting only a constant number of participants (as they must locally have the required DECIDE messages). The drawback of this is that it is not verifiable with only on-chain information. 3- A sort of mix of 1 and 2. Have periodic verifiable 'checkpoints' that rest on-chain, and in between checkpoints rely on the catchup procedure (or directly wait for next checkpoint when catching up if they are frequent enough). |
Beta Was this translation helpful? Give feedback.
-
Is the current proposal succinctly provable? That is, we need some way to, given some proven granite consensus "block" at time T, validate some granite consensus "block" at tome T+1 without access to the rest of the Filecoin blockchain. From the sound of it, granite doesn't cover this and doesn't produce any provable outputs (and requires access to, e.g., the Filecoin power table to validate). That helps achieve fast finality, but doesn't solve the overall goal of fast cross-chain communication. |
Beta Was this translation helpful? Give feedback.
-
Is there a succinct comparison between granite and other existing / deployed finality gadgets from other ecosystems? |
Beta Was this translation helpful? Give feedback.
-
I have been reading and evaluating the Granite paper and have a range of points and concerns. I understand the paper is presented as a consensus algorithm independent from Filecoin, but the position for all of my comments is motivated by its use by Filecoin, and I think that perspective is all that this forum need concern ourselves with. I couldn't find a great structure for my comments, so please forgive the form of a long list. I am not a consensus researcher. There's a good chance that some of my comments are rooted in some misunderstanding of common background knowledge or similar. Please do point out when that is the case. But please also understand that most of the stakeholders in Filecoin protocol, including most of the core devs and technical contributors, are also not consensus researchers. Any eventual complete proposal for protocol change needs to be understandable by me and others, so my misunderstandings should also point to where explanation and documentation will eventually be needed in order for the community to evaluate a proposal. That said, I hope my relative naiveté is of some advantage here in thinking about things from a different point of view. Some of the below might be appropriate to break out into top-level comments so we can have a threaded discussion, but I can't tell which ones yet. Required properties, goals etcSomewhere between the opening discussion above and the Granite paper is a missing piece of alignment about what requirements we have for a finality protocol. "Fast" is an obvious goal, but I mean what properties we want of the integrated protocol. We need these to be articulated so we can evaluate a protocol proposal against them. E.g.
Beyond requirements, there are also desired properties. Latency under certain circumstances is one. Multiple posts above have pointed out a desire for the finality to be externally verifiable, especially in smart contracts, without requiring evaluation of the entire EC blockchain. A solution to this would probably answer questions about catch up, indefinite rebroadcasting etc. My impression is that there is some attempt to ignore this desire, but I caution that this may amount to merely postponing the debate until you've already done a lot of work that might need to be redone, vs gaining better alignment up front and then doing the work once. Weights are definitely needed, probably committees tooThe paper is mostly written for an environment of a small, fixed number of equally weighted participants, but this is not the Filecoin environment. Participation must be weighted by the power table. The section toward the end that states some adjustments for weighting is a very difficult way of understanding the resulting protocol. I think it's incomplete. For example it mentions replacing comparisons of counts of nodes >= sums of weights, but does not mention how to adapt counts of nodes == sums of weights; but equality conditions are used e.g. in message validation for PREPARE. An exact match of weight may not be possible. The 6-page separation makes errors like this inevitable. We'll need a description of the protocol as Filecoin is actually supposed to execute it. Similarly the quadratic message complexity of full participation is probably not ok. We should design for at least 100x participation than todays levels. So we need committees and weights. The paper considers as a "patch" to the initial protocol, but this makes it hard to grok. We need a clear description of the protocol assuming weighted committees from the start. Analysis of security and performance needs to include the uncertainty of weighted committee selection. What are nodes deciding on?The Granite paper talks about deciding on an abstract value How are nodes supposed to agree on which epoch they're attempting to finalise? If they don't agree on this, the chances of getting a majority to agree on the same tipset seem diminished. Is that what CONVERGE is for? If so, should we limit its values just to epochs? Alternatively, if nodes diverge on the epoch they select, can we use blockchain structure to drive the finality for the shortest agreed chain? I.e. a vote for finalising some tipset is implicitly a vote to finalise any prefix of that chain too. Or is there some out-of-band mechanism like a clock that dictates what epoch nodes should select when proposing? Message validation and monotonicityI am currently a bit confused about the role of message validity checking. "Ensure that messages from faulty participants conform to the protocol" is an odd goal. Do we not assume that a certain fraction participants are correct (non-faulty)? If that's assumed, why do we also have to check their work? Ultimately, if we have signed messages from a majority of power COMMITing to the same value, what is the message validation protecting us from? Perhaps all I need here is explanation of the design (why this is the simplest way of achieving some goal), but it seems complicated. Regardless, I think we need to express and prove some criteria for the message validation step. It seems likely that monotonicity is such a criteria: no message received can invalidate a message previously considered valid. Can we prove that? It took me quite a while looking at the validation for PREPARE to convince myself it might be monotonic (it is at least subjective, and depends on message ordering. A node can validate multiple PREPARE messages with different values because it can observe different majorities for the PROPOSE that lead to it over time, and another node might never consider the same set of values valid). Why would a node ever vote for someone else's proposal?Our context is validating a blockchain. Each node has a clear view of the chain that it considers correct and, as far as I can see, no benefit from ever voting to finalise a different chain. Nodes might have different views, and a node might accept that a majority of others have finalised a chain other than its heaviest. But I don't see why a node should ever vote for anything other than its heaviest chain. Say we've solved the "which epoch" problem. This means each node has exactly one tipset that it considers the right answer. It doesn't matter what anyone else proposes, it should never vote for anything else (except a null in order to "agree to disagree" and start over). The goal of this consensus algorithm is to confirm that a majority of power has all seen the same heaviest chain. If they haven't, the algorithm should not decide anything. I consider it an anti-goal for the algorithm to attempt to finalise something in this case. This is a different context from other uses of BFT consensus algorithms, including when they are a root chain. In that context, I am concerned with all the steps in Granite where a sets its I can see there might be some confusion if we reconsider the "which epoch" problem, e.g in CONVERGE. Here, if we're working with (epoch, tipset) pairs, it is ok for a node to go with another's proposal for which epoch to finalise, but not ok for it to vote for any tipset at that epoch other than its heaviest. The artificial separation of the abstract consensus algorithm from the integration into Filecoin might be hurting us here. For example, nodes could CONVERGE only on the epoch first (and take lowest ticket); or they could broadcast a set of (epoch, tipset) pairs and then converge on the mode that includes its own vote; or propose one but consider it a vote in favour of any parents in the chain, etc. Secret, weighted committees are incompatible with evaluating majorityVarious points in the main protocol and message validation evaluate whether a count of messages exceeds This seems like a tricky one. I can see some practical benefits to secret committees, but I can't see how they can ever decide something safely and finally. We may have to give up the practical benefits. Rebroadcasting forever?There's a throwaway sentence about rebroadcasting messages to ensure they are received by participants, but this seems insufficiently specified (especially if making claims about message complexity). Rebroadcast for how long? I don't think practical protocol can be rebroadcasting history forever. Instead, participants who miss the original broadcast probably need to catch up somewhere targeted. This raises the question of validation of this consensus algorithm for a fresh node starting up. How does it reach a notion of finality? It can scan the EC chain and (depending on unspecified details) perhaps read a finalised tipset CID out of a recent block header, but then it would be trusting the EC chain to be declaring its own finality or something – how can it discover that a majority of power on the previously finalised chain voted for this finalised tipset? Of course this is recursive, but all the message history is gone. A finalising blockchain?In some of the literature, the finalising module is itself a blockchain, but with a finalising consensus algorithm (such as Granite). It thus has its own chain of historical blocks and messages/inputs to those blocks, including a commitment to the power/stake table with which to validate the votes. This model presents a natural catch-up mechanism that would also be much more efficient that fully validating the EC chain. It's also potentially a solution to finality that can be verified in smart contracts inside other blockchains and low-trust systems, which would make the interop/bridging use cases driving this in the first place that much more secure. This structure is somewhat independent of what algo the nodes use to decide on the block at each round, and Granite or a variant could probably be used. But I think there's an important gap to be filled in here regarding how instances of Granite are put together into a verifiable commitment to finality among Filecoin nodes. |
Beta Was this translation helpful? Give feedback.
-
After 5 weeks of intensive refinement of the specification (big shout out to @anorth @matejpavlovic @ranchalp @mb1896 @arajasek @jsoares) - it is time for another round of public discussion. We are inviting community comments and discussion on our latest drafts Fast Finality in Filecoin (F3) (fka Finalizer, from the beginning of this thread) high-level design document, as well as documents linked therein, and notably GossiPBFT (fka Granite) implementation of F3. In the following days, the community can expect a formal FIP, yet we would like to solicit feedback as early as possible since the designs largely stabilized. F3 specification explains the high level design of fast finality in Filecoin and interaction with Expected Consensus. Interested readers should start from this document. The document discusses:
GossiPBFT implementation of F3 is a core design document of the consensus implementation of fast finality. We refined the previous proposal considerably, tailoring it to Filecoin and EC along the way. The end result can be seen as a variant of the celebrated Castro/Liskov PBFT protocol, maintaining its key invariants and common-case message pattern, without many of its limitations. Namely GossiPBFT:
We presents two versions of GossiPBFT: 1) base one with evidences embedded to messages (ala PBFT OSDI'99) which can be made really efficient by using BLS/Schorr signature aggregation, and 2) a version without evidences (Appendix A) that incurs a tradeoff of more elaborate background message validation and a stronger assumption on GossipSub. In the meantime - implementation efforts have already kicked-off. Have a sneak peak at @anorth 's simulator (WIP) (https://github.com/anorth/f3sim/). We also have a WIP PlusCal/TLA+ GossiPBFT specification which is linked from the GossiPBFT design doc. |
Beta Was this translation helpful? Give feedback.
-
The mention of power leakage attacks leads me to the following question. Given the current 900 epoch finality, technically the network has the ability to detect and potentially react to sudden and malicious power overtake (increase total power to 1/3 QAP within a short period of time) by one actor within 900 epochs. Responses could look like an emergency upgrade that reverts the malicious power overtake. Similarly, implementations like lotus have flags I think the implementation mitigation pathway (emergency rollback mechanism) in the new EC+F3 world should be discussed before it is finalized in production mainnet. cc @arajasek |
Beta Was this translation helpful? Give feedback.
-
Re: EC as the fallback in case of the F3 failures Assuming with the introduction of F3, how could services that depend on/refer to finalities (taking certain finality period in execution logic, i.e: exchanges, bridges), detect and quickly react to the slower finality latency (increase the confirmation waiting period) when instant finality (f3) fails? |
Beta Was this translation helpful? Give feedback.
-
F3 has been merged as (draft) FIP-0086 (https://github.com/filecoin-project/FIPs/blob/master/FIPS/fip-0086.md), and we're targeting deployment in nv23. Please continue to provide your input here or in #fil-fast-finality. |
Beta Was this translation helpful? Give feedback.
-
The current FIP states:
First, I assume that "delivered by" should have been "delivered to". But even in that case, this is not a safe assumption in the presence of network partitions and denial of service attacks against GossipSub. It's obviously impossible to maintain liveness under such conditions, but F3 still needs to be safe. Can we clarify how F3 depends on this assumption? |
Beta Was this translation helpful? Give feedback.
-
Some threads for some technical review and considerations of the changes proposed in #998 (rebroadcast and queues). The existing FIP skipped over the question of what to do when a message is received entirely. It just assumed they were all there for use in predicates like checking quorum. What the implementation actually does, which is necessary for any reasonable efficiency, is immediately process all messages that are received and maintain state corresponding to the aggregate decisions that they add up to. (So there is no need to subsequently reprocess any messages to calculate things like whether some value has quorum at some phase.) #998 introduces an explicit queue for the first time, but this is a long way different from what is implemented or would be good to implement. I don't think specifying it this way in the FIP is a good idea because the FIP and implementation will be divergent in ways that are hard to mentally translate. Because the messages themselves are not retained, we can't implement the trim function as it is given in the proposed changes. Most of the FIP describes the algorithm in terms of predicates over sets of messages. The code doesn't directly implement these sets, but what it does implement is essentially a direct incremental computation of these predicates. But I think we can make things simpler anyway. There's no practical problem with retaining messages from past rounds. The only thing we need to bound is spammable messages for future rounds. Only messages without justification are spammable. Given a rebroadcast fallback (such as that proposed), we can probably do fine with a crude dropping of messages from far-future rounds. But we chould also explore making no messages spammable, by requiring evidence that a strong quorum has reached that round (whether or not they agree on value). So:
|
Beta Was this translation helpful? Give feedback.
-
#998 again. The location of the call to I think this is mainly an issue of clarity and engineering rather than a fundamental protocol issue. This would all be much clearer if
Structuring it like this would remove a bunch of pseudocode and prose that is attempting to establish these conditions from the place that it's proposed. It would also match closely how it could be implemented. As for the point above, I don't think the FIP should describe it in the current way because the implementation will need to be quite different, rendering analysis and audits of the spec far less useful as assurance that the algorithm as implemented is correct. |
Beta Was this translation helpful? Give feedback.
-
We're having an issue where the network never seems to reach a decision and we believe that this might be due to the lack of rebroadcast of quality messages. Specifically:
As long as the highest-ticket participant keeps participating, we'll never reach a decision because we have a weak quorum that will always vote for P.
|
Beta Was this translation helpful? Give feedback.
-
Proposed improvement to handle missed quality messages (addresses part of #809 (comment)):
We could consider inferring quality messages from, e.g., converge messages but... I'm not 100% convinced that's secure due to how swaying works. It's also probably over-complicating things. |
Beta Was this translation helpful? Give feedback.
-
FIP: https://github.com/filecoin-project/FIPs/blob/master/FIPS/fip-0086.md
Project brief: https://docs.google.com/document/d/10IE6hfK16dbrH9lPWlPS7vGcFRRTAtYzjXEEeYhdkek/edit#heading=h.gigkakuxvvlw
Problem
Currently, in the Filecoin mainnet, a tipset is considered final after 900 epochs. This value corresponds to around 7.5 hours; each epoch lasts 30 seconds. This long time to finality hinders the user experience considerably. It limits applications on Filecoin built with FVM and IPC. Exchanges must have a very long confirmation period for users to manage their FIL assets. Bridges are also notoriously affected by the long wait times in transferring assets.
Goal
ConsensusLab would like to propose a mechanism for the fast finalization of tipsets called the finalization module (FM). During regular network operation, we expect the FM to finalize tipsets within their epoch of proposal. This single-epoch finalization of tipsets is a significant improvement from the current 900-epoch finalization delay.
Background
Expected Consensus (EC) is the current mechanism by which participants in the Filecoin network reach an agreement on tipsets. A tipset is a set of blocks with the same epoch and the same set of parents.
EC is a longest-chain protocol (more accurately, a heaviest-chain protocol) in which each participant builds the chain speculatively as it receives blocks from the network. The protocol elects a set of network participants (i.e., storage providers) to become proposers on each epoch. Each proposer can build a new block and broadcast it to the network.
Additionally, each participant executes a continuous task colloquially called the EC syncer. The syncer receives blocks from the network, performs validation checks, and adds the blocks that pass validation to the chain. In the figure below, we depict this syncing mechanism in a simplified way with one block per tipset. On the left side of the figure, we have the syncer receiving blocks from the network and adding them to the chain. On the right side of the figure, we have the chain where each tipset points to its parent tipset.
If two or more blocks/tipsets of the same epoch have a different set of parent tipsets, this creates a fork in the chain. We can see in the figure that there are two different paths from epoch 104 to epoch 101, resulting in a fork. Forks are resolved using a fork choice rule, a deterministic algorithm that assigns a weight to each tipset in the chain and returns the heaviest path from some tipset, called the head, to the genesis tipset. We refer to this path as the canonical chain. In the figure, the tipset 𝒉 is the head, and the path from 𝒉 all the way to the genesis block is the canonical chain.
If a new block, however, were to be appended to the alternative path (represented by the dashed lines), this path could become the heaviest and, consequently, the canonical chain. This change is a reorganization - a different path to the genesis tipset becomes the canonical chain due to becoming heavier than any other path. We say a path is finalized when a different path can't become the canonical chain.
In EC and, generally, longest-chain protocols, the probability of a path from some tipset 𝒉 to the genesis tipset becoming finalized increases with the number of descendent tipsets of 𝒉. A tipset “inherits” the weight of its descendants and thus becomes harder for a different path to overcome that weight. Over time, that probability becomes high enough that the tipset is considered final for all practical purposes. In the Filecoin network, a tipset is considered final after 900 epochs (or, equivalently, 7.5 hours) from its proposal.
Solution
We propose to solve the problem of the long finalization delay by introducing a finalization module (FM). The figure below shows how the FM augments the existing system to finalize tipsets with a very low delay.
The FM consists of two components:
A Byzantine fault-tolerant consensus protocol called Granite.
The Finalizer. A straightforward algorithm (despite the ominous name) that uses Granite to finalize tipsets. Finalizing a tipset also finalizes the path from that tipset to the genesis.
We can see from the figure that FM requires the chain to expose only two operations: (1) read the current head, and (2) mark a tipset as finalized. Notably, the FM does not interact directly with EC. The only changes required in EC are adjusting the fork choice rule to account for the tipsets marked as finalized by the FM, and removing the 900-epoch “lookback” finalization rule.
Granite
Granite is a protocol that solves the consensus problem in its classic formulation from the scientific literature. Each participant inputs a proposal value, and the protocol outputs one of the input values as the final decision value. We emphasize final because, unlike a longest-chain protocol such as EC, the output of Granite is immutable.
Granite exists in partially synchronous and asynchronous variants. We have a working paper that specifies the partially synchronous variant in detail. We discuss here some of the features that make Granite a desirable protocol for Filecoin:
The protocol has optimal resilience, i.e., it tolerates f Byzantine participant failures up to just less than a third of the total number of participants n, i.e., n ≥ 3f + 1.
Unlike PBFT, HotStuff, Tendermint, and many other protocols in this space, Granite is a leaderless protocol. This property makes it resistant to denial of service attacks because no designated participant represents the weakest link.
Participants can have different weights, which aligns with how storage providers have different amounts of storage committed to the network. A participant's weight in the execution of the protocol can be proportional to their share of committed storage. As such, we can simply use the power table to define the membership in Granite.
At an acceptable cost of resilience, Granite can employ self-selecting committees for scalability such that, regardless of the total number of participants, each communication step only has a constant number of participants broadcasting a message. In the context of the Filecoin network and its peer-to-peer overlay, this technique reduces communication complexity per participant from O(n * log n) to O(log n).
Even when employing constant-sized committees, Granite is resistant to denial of service attacks due to the identity of the committee members remaining secret until after they take action.
Low latency → fast finality. During periods of good communication, Granite will finish in 2-3 communication steps. We expect this to happen well within the 30-second epoch duration. We also define good communication very conservatively, such that we expect the network to provide good communication most of the time. Thus, we expect the FM to finalize a tipset in the same epoch the tipset is proposed.
Finalizer
The Finalizer is a simple algorithm that employs Granite to finalize Filecoin tipsets. It executes as a separate task in a loop. Each iteration of the loop consists of five steps:
The figure below shows the chain after an iteration of the Finalizer in which Granite outputs tipset 𝑏. Notice that upon finalizing 𝑏, we also eliminate the fork from the chain.
Future Opportunities
The introduction of the FM also paves the way for future improvements to the network. In particular:
Removal of tipsets. Tipsets are a consequence of EC and its nature as a longest-chain protocol. With the FM, we can move towards single-block epochs, simplifying network operation and the implementation of clients and removing delayed execution.
Shorter block time. In EC, the long block time (i.e., 30-second epochs) is used as a stabilization mechanism to ensure that participants have more or less the same view of the chain. Due to its final output, we can leverage the Granite protocol to reduce block time significantly, particularly in its asynchronous variant.
Feedback
We welcome feedback from the community and look forward to stimulating discussion.
Beta Was this translation helpful? Give feedback.
All reactions