diff --git a/FIPS/fip-0086.md b/FIPS/fip-0086.md index b17408af..9d06f8df 100644 --- a/FIPS/fip-0086.md +++ b/FIPS/fip-0086.md @@ -26,7 +26,7 @@ We specify a mechanism for fast finality with the F3 component. F3 is expected t ## Change Motivation * The long time to finality on Filecoin mainnet restricts or severely affects applications built on Filecoin (e.g., IPC, Axelar, Wormhole, Glif, …). -* Even though applications on Filecoin can set a lower finalization time than the built-in 900 epochs, delayed finalization for important transactions will require tens of minutes with a longest-chain protocol like Filecoin's Expected Consensus (EC). +* Even though applications on Filecoin can set a lower finalization time than the built-in 900 epochs, delayed finalization for important transactions will require multiple epochs with Filecoin's Expected Consensus (EC). * Long finalization times also affect exchanges, by imposing a long confirmation period (often more than 1 hour) for users managing their FIL assets, and bridges, which face extended wait times for asset transfers. * Bridging to other systems is not currently fast, safe, and verifiable. @@ -40,7 +40,7 @@ We specify a mechanism for fast finality with the F3 component. F3 is expected t ![](../resources/fip-0086/chain.png) -Two or more tipsets of the same epoch with different parent tipsets (like tipsets $C$ and $C'$ above) form a _fork_ in the chain. Forks are resolved using a _fork choice rule_, a deterministic algorithm that, given a blockchain data structure, returns the heaviest tipset, called the _head_. We refer to the path from genesis to the head as the _canonical chain_. Participants may have different views of the blockchain, resulting in other canonical chains. For example, if a participant $p_1$ is not (yet) aware of tipset $D$, it would consider $C$ the heaviest tipset with the canonical chain $[G A B C]$. 
Another participant $p_2$ aware of tipset $D$ will consider $[G A C' D]$ to be the canonical chain. Once $p_1$ becomes aware of tipset $D$, it will update its canonical chain to $[G A C' D]$ - this is called _reorganization_. We say a tipset is _finalized_ when a reorganization involving that tipset is impossible, i.e., when a different path that does not contain the tipset cannot become the canonical chain. +Two or more tipsets of the same epoch with different ancestor tipsets (like tipsets $C$ and $C'$ above) form a _fork_ in the chain. Forks are resolved using a _fork choice rule_, a deterministic algorithm that, given a blockchain data structure, returns the heaviest tipset, called the _head_. We refer to the path from genesis to the head as the _canonical chain_. Participants may have different views of the blockchain, resulting in other canonical chains. For example, if a participant $p_1$ is not (yet) aware of tipset $D$, it would consider $C$ the heaviest tipset with the canonical chain $[G A B C]$. Another participant $p_2$ aware of tipset $D$ will consider $[G A C' D]$ to be the canonical chain. Once $p_1$ becomes aware of tipset $D$, it will update its canonical chain to $[G A C' D]$ - this is called _reorganization_. We say a tipset is _finalized_ when a reorganization involving that tipset is impossible, i.e., when a different path that does not contain the tipset cannot become the canonical chain. In EC and, generally, longest-chain protocols, the probability of a path from some tipset $h$ to the genesis tipset becoming finalized increases with the number of descendant tipsets of $h$. This happens because the weight of the heaviest chain increases the fastest, as the tipsets with the most new blocks are appended to it (assuming that honest participants form a majority). In the particular case of EC, in each epoch, most created blocks are expected to come from honest participants and thus extend the heaviest chain. 
Consequently, it becomes progressively harder for a different path to overcome that weight. Over time, the probability of a tipset never undergoing a reorganization becomes high enough that the tipset is considered final for all practical purposes. In the Filecoin network, a tipset that is part of the heaviest chain is considered final after 900 epochs (or, equivalently, 7.5 hours) from its proposal. @@ -188,9 +188,9 @@ EC needs to be modified to accommodate the finalization of tipsets by F3. **This The current EC fork-choice rule selects, from all the known tipsets, the tipset backed by the most weight. As the F3 component finalizes tipsets, the fork-choice rule must ensure that the heaviest finalized chain is always a prefix of the heaviest chain, preventing reorganizations of the already finalized chain. -We achieve this by adjusting the definition of weight for a finalized prefix: the heaviest finalized chain is the one that **matches exactly the tipsets finalized by F3**, in that a tipset $t'$ that is a superset of finalized tipset $t$ in the same epoch is not heavier than $t$ itself, despite it being backed by more EC power. +We achieve this by adjusting the fork choice rule in the presence of a finalized prefix: the fork choice rule selects the heaviest chain out of all chains with a prefix that **matches exactly all tipsets finalized by F3**, in that a tipset $t'$ that is a superset of finalized tipset $t$ in the same epoch is not heavier than $t$ itself, despite it being backed by more EC power. -This redefinition of the heaviest chain is consistent with the abstract notion of the heaviest chain being backed by the most power because a finalized tipset has been backed by a super-majority of participants in GossiPBFT. In contrast, any non-finalized block in the same epoch is only backed in that epoch by the EC proposer. 
+This redefinition of the fork choice rule is consistent with the abstract notion of the heaviest chain being backed by the most power because a finalized tipset has been backed by a super-majority of participants in GossiPBFT. In contrast, any non-finalized block in the same epoch is only backed in that epoch by the EC proposer. We illustrate the updated rule in the following figure, where blocks in blue are finalized blocks, and all blocks are assumed to be proposed by a proposer holding only one EC ticket (the weight of a chain is the number of blocks): @@ -326,7 +326,8 @@ A set of messages $M$ that does not contain equivocating messages is called _cle * `LowestTicketProposal(M)` * Let $M$ be a clean set of CONVERGE messages for the same round. * If $M = ∅$, the predicate returns $baseChain$. - * If $M \ne ∅$, the predicate returns $m.value$, such that $m$ is the message in $M$ with the lowest ticket ($m.ticket$). + * If $M \ne ∅$, the predicate returns $m.value$ and $m.evidence$, such that $m$ is the message in $M$ with the lowest ticket ($m.ticket$). M will never be empty as the participant must at least deliver its own CONVERGE message. + * `Aggregate(M)` * Where $M$ is a clean set of messages of the same type $T$, round $r$, and instance $i$, with $v=\texttt{StrongQuorumValue}(M)≠nil$. 
``` @@ -343,145 +344,168 @@ We illustrate the pseudocode for GossiPBFT below, consisting of 3 steps per roun GossiPBFT(instance, inputChain, baseChain, participants) → decision, PoF: \*participants is implicitly used to calculate quorums -1: round ← 0; -2: decideSent ← False; -3: proposal ← inputChain; \* holds what the participant locally believes should be a decision +1: round ← 0 +2: decideSent ← False +3: proposal ← inputChain \* holds what the participant locally believes should be a decision 4: timeout ← 2*Δ -5: ECCompatibleChains ← all prefixes of proposal, not lighter than baseChain -6: value ← proposal \* used to communicate the voted value to others (proposal or 丄) -7: evidence ← nil \* used to communicate optional evidence for the voted value +5: value ← proposal \* used to communicate the voted value to others (proposal or 丄) +6: evidence ← nil \* used to communicate optional evidence for the voted value +7: C ← {baseChain} -8: while (not decideSent) { +8: while (NOT decideSent) { 9: if (round = 0) 10: BEBroadcast ; trigger (timeout) -11: collect a clean set M of valid QUALITY messages +11: collect a clean set M of valid QUALITY messages from this instance until HasStrongQuorum(proposal, M) OR timeout expires -12: let C={prefix : IsPrefix(prefix,proposal) and HasStrongQuorum(prefix,M)} -13: if (C = ∅) -14: proposal ← baseChain \* no proposals of high-enough quality -15: else -16: proposal ← heaviest prefix ∈ C \* this becomes baseChain or sth heavier -17: value ← proposal - -18: if (round > 0) \* CONVERGE -19: ticket ← VRF(Randomness(baseChain) || instance || round) -20: value ← proposal \* set local proposal as value in CONVERGE message -21: BEBroadcast ; trigger(timeout) -22: collect a clean set M of valid CONVERGE msgs from this round +12: C ← C ∪ {prefix : IsPrefix(prefix,proposal) and HasStrongQuorum(prefix,M)} +13: proposal ← heaviest prefix ∈ C \* this becomes baseChain or sth heavier +14: value ← proposal + +15: if (round > 0) \* CONVERGE +16: 
ticket ← VRF(Randomness(baseChain) || instance || round) +17: value ← proposal \* set local proposal as value in CONVERGE message +18: BEBroadcast ; trigger(timeout) +19: collect a clean set M of valid CONVERGE msgs from this round and instance until timeout expires -23: value ← LowestTicketProposal(M) \* leader election -24: if value ∈ ECCompatibleChains \* see also lines 55-56 -25: proposal ← value \* we sway proposal if the value is EC compatible -26: else -27: value ← 丄 \* vote for not deciding in this round - -28: BEBroadcast ; trigger(timeout) -29: collect a clean set M of valid msgs \* match PREPARE value against local proposal until IsStrongQuorum(M) OR timeout expires -30: if IsStrongQuorumPower(M) \* strong quorum of PREPAREs for local proposal -31: value ← proposal \* vote for deciding proposal (COMMIT) -32: evidence ← Aggregate(M) \* strong quorum of PREPAREs is evidence -33: else -34: value ← 丄 \* vote for not deciding in this round -35: evidence ← nil - -36: BEBroadcast ; trigger(timeout) -37: collect a clean set M of valid COMMIT messages from this round +20: prepareReadyToSend ← False +21: while (not prepareReadyToSend){ +22: value, evidence ← LowestTicketProposal(M) \* leader election +23: if (evidence is a strong quorum of PREPAREs AND mightHaveBeenDecided(value, r-1)): +24: C ← C ∪ {value} +25: if (value ∈ C) +26: proposal ← value \* we sway proposal if the value is incentive compatible (i.e., in C) +27: prepareReadyToSend ← True \* Exit loop +28: else +29: M = {m ∈ M | m.value != value AND m.evidence.value != evidence.value} \* Update M for next iteration } + +30: BEBroadcast ; trigger(timeout) \* evidence is nil in round=0 +31: collect a clean set M of valid PREPARE messages from this round and instance + until (HasStrongQuorumValue(M) AND StrongQuorumValue(M) = proposal) + OR (timeout expires AND Power(M)>2/3) +32: if (HasStrongQuorumValue(M) AND StrongQuorumValue(M) = proposal) \* strong quorum of PREPAREs for local proposal +33: value ← 
proposal \* vote for deciding proposal (COMMIT) +34: evidence ← Aggregate(M) \* strong quorum of PREPAREs is evidence +35: else +36: value ← 丄 \* vote for not deciding in this round +37: evidence ← nil + +38: BEBroadcast ; trigger(timeout) +39: collect a clean set M of valid COMMIT messages from this round and instance until (HasStrongQuorumValue(M) AND StrongQuorumValue(M) ≠ 丄) - OR (timeout expires AND IsStrongQuorum(M)) -38: if (HasStrongQuorumValue(M) AND StrongQuorumValue(M) ≠ 丄) \* decide -39: evidence ← Aggregate(M) -39: BEBroadcast -40: decideSent ← True \* break loop, wait for other DECIDE messages -41: if (∃ m ∈ M: m.value ≠ 丄) \* m.value was possibly decided by others -42: proposal ← m.value; \* sway local proposal to possibly decided value -43: evidence ← m.evidence \* strong PREPARE quorum is inherited evidence -44: else \* no participant decided in this round -45: evidence ← Aggregate(M) \* strong quorum of COMMITs for 丄 is evidence -46: round ← round + 1; -47: timeout ← updateTimeout(timeout, round) -48: } \*end while - -49: collect a clean set M of valid DECIDE messages + OR (timeout expires AND Power(M)>2/3) +40: if (HasStrongQuorumValue(M) AND StrongQuorumValue(M) ≠ 丄) \* decide +41: evidence ← Aggregate(M) +42: BEBroadcast +43: decideSent ← True \* break loop, wait for other DECIDE messages +44: if (∃ m ∈ M: m.value ≠ 丄 s.t. 
mightHaveBeenDecided(m.value, r)) \* m.value was possibly decided by others +45: C ← C ∪ {m.value} \* add to candidate values if not there +46: proposal ← m.value; \* sway local proposal to possibly decided value +47: evidence ← m.evidence \* strong PREPARE quorum is inherited evidence +48: else \* no participant decided in this round +49: evidence ← Aggregate(M) \* strong quorum of COMMITs for 丄 is evidence +50: round ← round + 1; +51: timeout ← updateTimeout(timeout, round) +52: } \*end while + +53: collect a clean set M of valid DECIDE messages until (HasStrongQuorumValue(M)) \* collect a strong quorum of decide outside the round loop -50: return (StrongQuorumValue(M), Aggregate(M)) \* terminate the algorithm with a decision +54: return (StrongQuorumValue(M), Aggregate(M)) \* terminate the algorithm with a decision +``` +``` +\* decide anytime +55: upon reception of a valid AND not decideSent +56: decideSent ← True +57: BEBroadcast | valid CONVERGE -AND (∃ M: IsStrongQuorum(M) AND m.evidence=Aggregate(M) | evidence is a strong quorum - AND ((∀ m' ∈ M: m'.step = COMMIT AND m'.value = 丄) | of COMMIT msgs for 丄 - OR (∀ m' ∈ M: m'.step = PREPARE AND | or PREPARE msgs for - m'.value = m.value)) | CONVERGE value - AND (∀ m' ∈ M: m'.round = m.round-1) | from previous round to CONVERGE - AND (∀ m' ∈ M: m'.instance = m.instance))) | from the same instance as CONVERGE - - -OR (m = | valid COMMIT - AND (∃ M: IsStrongQuorum(M) AND m.evidence=Aggregate(M) | evidence is a strong quorum - AND ∀ m' ∈ M: m'.step = PREPARE | of PREPARE messages - AND ∀ m' ∈ M: m'.round = m.round | from the same round as COMMIT - AND ∀ m' ∈ M: m'.instance = m.instance | from the same instance as COMMIT - AND ∀ m' ∈ M: m'.value = m.value) | for the same value as COMMIT, or - OR (m.value = 丄) AND m.evidence=nil) | COMMIT is for 丄 with no evidence - -OR (m = | valid DECIDE - AND (∃ M: IsStrongQuorum(M) AND m.evidence=Aggregate(M) | evidence is a strong quorum - AND ∀ m' ∈ M: m'.step = COMMIT | of 
COMMIT messages - AND ∀ m', m'' ∈ M: m'.round = m''.round | from the same round as each other - AND ∀ m' ∈ M: m'.instance = m.instance | from the same instance as DECIDE - AND ∀ m' ∈ M: m'.value = m.value) | for the same value as DECIDE +ValidEvidence(m= ): + +if (step = PREPARE and round = 0) | in the first round, + AND (evidence = nil) | evidence for PREPARE + return True | is nil + +if (step = COMMIT and value = 丄) | a COMMIT for 丄 carries no evidence + AND (evidence = nil) + return True + +if (evidence.instance != instance) | the instance of the evidence must be the + return False | same as that of the message + + | valid evidence for +return (step = CONVERGE OR (step = PREPARE AND round>0) | CONVERGE and PREPARE in + AND (∃ M: Power(M)>⅔ AND evidence=Aggregate(M) | round>0 is strong quorum + AND ((∀ m' ∈ M: m'.step = COMMIT AND m'.value = 丄) | of COMMIT msgs for 丄 + OR (∀ m' ∈ M: m'.step = PREPARE AND | or PREPARE msgs for + m'.value = value)) | CONVERGE value + AND (∀ m' ∈ M: m'.round = round-1))) | from previous round + + +OR (step=COMMIT | valid COMMIT evidence + AND (∃ M: Power(M)>⅔ AND evidence=Aggregate(M) | is a strong quorum + AND ∀ m' ∈ M: m'.step = PREPARE | of PREPARE messages + AND ∀ m' ∈ M: m'.round = round | from the same round + AND ∀ m' ∈ M: m'.value = value)) | for the same value, or + +OR (step = DECIDE | valid DECIDE evidence + AND (∃ M: Power(M)>⅔ AND evidence=Aggregate(M) | is a strong quorum + AND ∀ m' ∈ M: m'.step = COMMIT | of COMMIT messages + AND ∀ m' ∈ M: m'.round = round | from the same round + AND ∀ m' ∈ M: m'.value = value)) | for the same value ``` ### Evidence verification complexity @@ -500,7 +524,7 @@ GossiPBFT ensures termination provided that (i) all participants start the insta [Given prior tests performed on GossipSub](https://research.protocol.ai/publications/gossipsub-v1.1-evaluation-report/vyzovitis2020.pdf) (see also [here](https://gist.github.com/jsoares/9ce4c0ba6ebcfd2afa8f8993890e2d98)), we expect that sent 
messages will reach almost all participants within $Δ=3s$, with a majority receiving them even after $Δ=2s$. However, if several participants start the instance $Δ + ε$ after some other participants, termination is not guaranteed for the selected timeouts of $2*Δ$. Thus, we do not rely on an explicit synchrony bound for correctness. Instead, we increase the estimate of Δ locally within an instance as rounds progress without decision. -The synchronization of participants is performed in the call to $\texttt{updateTimeout(timeout, round)}$ (line 47), and works as follows: +The synchronization of participants is performed in the call to $\texttt{updateTimeout(timeout, round)}$ (line 51), and works as follows: * Participants start an instance with $Δ=2s$. * Participants set their timeout for the current round to $Δ*1.3^{r}$ where $r$ is the round number ($r=0$ for the first round). @@ -667,9 +691,9 @@ Because of changes to the EC fork choice rule, this FIP requires a network upgra - EB: Successful catch-up to currently executed instance and decision in it. Make sure new participants catch up timely with the rest (as progress will happen while they catch up). * Significant network delays: - Participants holding ½ QAP start the instance 10 seconds after the other half. - - EB: After round 5, participants are synchronized by drand and decide in this round. + - EB: Participants eventually decide. - All messages are delayed by 10 seconds. - - EB: After some round >5, participants can decide. + - EB: Participants eventually decide. * Tests on message validity: - Invalid messages should be discarded; this includes: * Old/decided instance @@ -729,7 +753,7 @@ The modifications proposed in this FIP have far-reaching implications for the se * **Censorship.** F3 and GossiPBFT are designed with censorship resistance in mind. 
The updated fork choice rule means that an adversary controlling more than ⅓ QAP can try to perform a censorship attack if honest participants start an instance of GossiPBFT proposing at least two distinct inputs. While this attack is theoretically possible, it is notably hard to perform on F3 given the QUALITY step of GossiPBFT and other mitigation strategies specifically put in place to protect against this. We strongly believe that, even against a majority adversary, the mitigations designed will prevent such an attack. * **Liveness.** Implementing F3 introduces the risk that an adversary controlling at least ⅓ QAP prevents termination of a GossiPBFT instance. In that case, the F3 component would halt, not finalizing any tipset anymore. At the same time, EC would still operate, outputting tipsets and considering them final after 900 epochs. The liveness of the system is thus not affected by attacks on the liveness of F3. -* **Safety.** Implementing F3 ensures the safety of finalized outputs during regular or even congested networks against a Byzantine adversary controlling less than ⅓ QAP. For stronger adversaries, F3 provides mitigations to prevent censorship attacks, as outlined above. If deemed necessary, the punishment and recovery from coalitions in the event of an attempted attack on safety can be explored in future FIPs. Note that safety is already significantly improved by F3 compared to the status quo: F3 provides safety of finalized outputs two orders of magnitude faster than the current estimate of 900 epochs during regular network operation. +* **Safety.** Implementing F3 ensures the safety of finalized outputs during regular or even congested networks against a Byzantine adversary controlling less than ⅓ QAP. For stronger adversaries, F3 provides mitigations to prevent censorship attacks, as outlined above. 
If deemed necessary, the punishment and recovery from coalitions in the event of an attempted attack on safety can be explored in future FIPs. Note that the (safe) finality time is already significantly improved by F3 compared to the status quo: F3 provides safety of finalized outputs two orders of magnitude faster than the current parameter of 900 epochs during regular network operation. * **Denial-of-service (DoS).** The implementation of F3 preserves resistance against DoS attacks currently ensured by Filecoin, thanks to the fully leaderless nature of GossiPBFT and to the use of a VRF to self-assign tickets during the CONVERGE step. * **Committees.** This FIP proposes to have all participants run all instances of GossiPBFT. While this ensures optimal resilience against a Byzantine adversary, it can render the system unusable if the number of participants grows too large. While we are still evaluating the maximum practical number of participants in F3, it is expected to be at least one order of magnitude greater than the current number of participants in Filecoin. This introduces an attack vector: if the scalability limit is 100,000 participants, a participant holding as little as 3% of the current QAP can perform a Sybil attack to render the system unusable, with the minimum QAP required per identity. As a result, the implementation should favor the messages of the more powerful participants if the number of participants grows too large. Given that favoring more powerful participants discriminates against the rest, affecting decentralization, amending F3 to use committees in the event of the number of participants exceeding the practical limit will be the topic of a future FIP, as well as the analysis of optimized message aggregation in the presence of self-selected committees.
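
The ⅓ QAP bounds in the Liveness and Safety items above are tied to the strong-quorum threshold (`Power(M)>⅔`) used in the pseudocode. The following is a minimal Python sketch of that threshold check, not the reference implementation. The names `is_strong_quorum`, `strong_quorum_value`, and `power_of`, as well as the representation of a message as a `(sender, value)` pair, are assumptions made for illustration only.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterable, Optional, Tuple


def is_strong_quorum(power: int, total_power: int) -> bool:
    """Return True when `power` is strictly more than 2/3 of `total_power`."""
    # Integer arithmetic avoids floating-point rounding at the 2/3 boundary.
    return 3 * power > 2 * total_power


def strong_quorum_value(
    messages: Iterable[Tuple[str, Hashable]],
    power_of: Dict[str, int],
) -> Optional[Hashable]:
    """Return the value backed by more than 2/3 of total power, or None.

    `messages` is assumed to be a clean set of (sender, value) pairs.
    As a defensive measure (an assumption of this sketch), only the first
    message per sender is counted and unknown senders are ignored.
    """
    total_power = sum(power_of.values())
    backing = defaultdict(int)
    counted = set()
    for sender, value in messages:
        if sender in counted or sender not in power_of:
            continue
        counted.add(sender)
        backing[value] += power_of[sender]
    # At most one value can exceed 2/3 of the total power.
    for value, power in backing.items():
        if is_strong_quorum(power, total_power):
            return value
    return None


# Hypothetical power table: p1 holds 40% of the total power.
power_of = {"p1": 40, "p2": 30, "p3": 30}

# p1 and p2 back X with 70 of 100 units of power: a strong quorum.
print(strong_quorum_value([("p1", "X"), ("p2", "X"), ("p3", "Y")], power_of))

# If p1 (holding at least 1/3) stays silent, the rest cannot form a quorum.
print(strong_quorum_value([("p2", "X"), ("p3", "X")], power_of))
```

The second call illustrates the liveness risk described above: once a participant holding at least ⅓ of the power withholds its messages, the remaining participants cannot reach a strong quorum on their own.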