---
fip: "0086"
title: Fast Finality in Filecoin (F3)
author: Steven Allen (@stebalien), Jie Hou (@mb1896), Henrique Moniz (@hmoniz), Alex North (@anorth), Matej Pavlovic (@matejpavlovic), Aayush Rajasekaran (@arajasek), Alejandro Ranchal-Pedrosa (@ranchalp), Jorge M. Soares (@jsoares), Jakub Sztandera (@Kubuxu), and Marko Vukolic (@vukolic)
discussions-to: https://github.com/filecoin-project/FIPs/discussions/809
status: Draft
type: Technical
category: Core
created: 2023-12-18
---
# Fast Finality in Filecoin (F3)
## Simple Summary
Filecoin clients currently consider blocks irreversible after 900 epochs, hindering applications requiring low latency. This FIP specifies Fast Finality in Filecoin (F3), extending the protocol with a new component that reduces finalization time from 7.5 hours to tens of seconds.
## Abstract
The current Filecoin consensus mechanism only provides probabilistic finality. To simplify client implementations and provide some form of determinism, the protocol also includes a soft finality threshold, whereby miners at round $N$ reject all blocks that fork off before $N-900$. This finalization delay of 900 epochs (7.5 hours) hinders user experience and limits applications built on Filecoin.
We specify a mechanism for fast finality with the F3 component. F3 is expected to finalize tipsets within tens of seconds during regular network operation, compared to the current 900-epoch finalization delay. It does so by leveraging GossiPBFT, an optimally resilient partially synchronous BFT consensus protocol, which runs in parallel with the current protocol, taking EC tipsets as input and providing finalized prefixes as output. The fork choice rule of EC is modified to ensure it never selects a chain other than the F3-finalized chain.
## Change Motivation
* The long time to finality on Filecoin mainnet restricts or severely affects applications built on Filecoin (e.g., IPC, Axelar, Wormhole, Glif, …).
* Even though applications on Filecoin can adopt a finalization threshold lower than the built-in 900 epochs, confirming important transactions with sufficient confidence still requires waiting multiple epochs under Filecoin's Expected Consensus (EC).
* Long finalization times also affect exchanges, by imposing a long confirmation period (often more than 1 hour) for users managing their FIL assets, and bridges, which face extended wait times for asset transfers.
* Bridging to other systems is not currently fast, safe, and verifiable.
## Specification
### Background
[Expected Consensus (EC)](https://spec.filecoin.io/algorithms/expected_consensus/) is the current mechanism by which participants in the Filecoin network reach an agreement on tipsets. A tipset is a set of blocks with the same epoch and the same set of parents. EC is a longest-chain protocol (more accurately, a heaviest-chain protocol) in which each participant independently builds the chain as it receives blocks from the network. Time is divided into slots of 30 seconds, called _epochs_. In each epoch, the protocol elects a set of network participants (i.e., storage providers) to become block proposers. Each proposer can construct a new block and broadcast it to the network. On reception, each participant appends the block to its local view of the blockchain. Each tipset has a _weight_ corresponding to the total number of blocks in the path between the genesis and the tipset (the actual weight function is slightly more complex in reality, but this approximation is sufficient for this document). An example blockchain data structure is shown below, indicating the weight of each tipset in parentheses.
![](../resources/fip-0086/chain.png)
Two or more tipsets of the same epoch with different ancestor tipsets (like tipsets $C$ and $C'$ above) form a _fork_ in the chain. Forks are resolved using a _fork choice rule_, a deterministic algorithm that, given a blockchain data structure, returns the heaviest tipset, called the _head_. We refer to the path from genesis to the head as the _canonical chain_. Participants may have different views of the blockchain, resulting in different canonical chains. For example, if a participant $p_1$ is not (yet) aware of tipset $D$, it would consider $C$ the heaviest tipset with the canonical chain $[G A B C]$. Another participant $p_2$ aware of tipset $D$ will consider $[G A C' D]$ to be the canonical chain. Once $p_1$ becomes aware of tipset $D$, it will update its canonical chain to $[G A C' D]$; this is called _reorganization_. We say a tipset is _finalized_ when a reorganization involving that tipset is impossible, i.e., when a different path that does not contain the tipset cannot become the canonical chain.
In EC and, generally, longest-chain protocols, the probability of a path from some tipset $h$ to the genesis tipset becoming finalized increases with the number of descendant tipsets of $h$. This happens because the weight of the heaviest chain increases the fastest, as the tipsets with the most new blocks are appended to it (assuming that honest participants form a majority). In the particular case of EC, in each epoch, most created blocks are expected to come from honest participants and thus extend the heaviest chain. Consequently, it becomes progressively harder for a different path to overcome that weight. Over time, the probability of a tipset never undergoing a reorganization becomes high enough that the tipset is considered final for all practical purposes. In the Filecoin network, a tipset that is part of the heaviest chain is considered final after 900 epochs (or, equivalently, 7.5 hours) from its proposal.
### F3 Overview and Interaction with EC
We propose implementing fast finality in Filecoin by introducing an F3 component, which works alongside EC in Filecoin client nodes (participants).
The participants in F3 are the storage providers (SPs) of the Filecoin network. The participation of each SP is weighted according to its quality-adjusted power (QAP), which is a function of the storage power that the SP has committed to the network. This information is maintained in a built-in actor called the _power actor_ (f04). Only SPs whose power exceeds the minimum threshold required to participate in EC can participate in F3.
In short, each participant $p$ in the Filecoin network (i.e., a storage provider with power) runs F3 in a loop and feeds its input from EC. F3 returns a finalized prefix chain to EC, which updates the canonical chain to extend this F3-finalized prefix. More precisely, in each loop iteration $i$ (corresponding to the i-th instance of F3):
- **EC/F3 interface:** Participant $p$ feeds its current canonical chain $canonical$ and its previously F3-finalized chain, which we call the $baseChain$, to F3. The $baseChain$ defines the power table **used by F3** and seeds the randomness used to configure the F3 instance, while $p$'s current canonical chain is the chain $p$ proposes to be finalized. Periodically, $p$ also feeds to F3 tipset updates from EC that extend $p$'s canonical chain as EC delivers these tipsets.
- **F3 consensus:** F3 establishes network-wide consensus on finalized tipsets and produces a _Proof of Finality (PoF)_ for a tipset finalized in instance $i$, $t_i$. Every _PoF_ output by F3 is signed by ≥ ⅔ of the total QAP, i.e., a super-majority of the power table vouching that honest participants with more than ⅓ QAP consider tipset $t_i$ final.
- **EC:** Participant $p$ updates its local EC chain and commits not to reorganize the finalized tipset $t_i$, i.e., the tipset must stay in $p$'s EC canonical chain for good. Apart from this change, EC continues operating as it currently does. In particular, EC still does a 900-epoch lookback for its power table (which can therefore differ from the one used by F3) and continues operating “normally” if F3 assumptions are violated and F3 halts, with the combined EC/F3 protocol favoring availability over consistency (in CAP theorem parlance).
- **F3 synchronization:** Participants disseminate information about finalized tipset $t_i$ and its proof $PoF_i$ to all other participants, light clients, and smart contracts. The main goal of F3 synchronization is to allow an external party (or a lagging F3 participant) to fetch a sequence of messages that prove the finality of some recent final tipset. The sequence demonstrates a chain of eligible participants deciding on a final tipset and, thus, the eligible power table for the next round. Verifying the finality of a tipset from genesis does not require validating the EC chain.
![](../resources/fip-0086/flow.png)
### F3 Adversarial Model
Honest participants follow the protocol at all times. Byzantine participants deviate arbitrarily from the protocol. Rational participants deviate from the protocol to pursue a greater (expected) pay-off according to their utility function. We consider a Byzantine adversary able to control and orchestrate up to less than ⅓ QAP.
We further adapt our protocols to align the actions of honest and rational participants by ensuring that rational participants never contribute to finalizing a chain they do not locally possess. Without this property, if a participant $p$ were to finalize a chain it did not have locally, then $p$ would not be able to mine in EC until it retrieved the required blocks and thus, in the meantime, could not obtain the cryptoeconomic rewards that EC offers. To this end, F3 provides _EC incentive compatibility_.
F3 and its implementation GossiPBFT provide additional robustness for strong adversaries that control more than ⅓ QAP, such as:
* Censorship resistance against a coalition of rational participants trying to leak power, without providing firm guarantees.
* Resilience to long-range attacks by an adversary controlling up to less than ⅔ QAP of any old power table.
### Best-effort Broadcast and Gossipsub
F3 uses GossipSub (like Filecoin EC) to disseminate protocol messages, implementing a broadcast channel among participants. We assume and model GossipSub to satisfy the following properties of the _best-effort broadcast primitive_ (denoted _BEBroadcast_ hereafter):
* If $p$ and $p'$ are honest, then every message broadcast by $p$ is eventually delivered by $p'$.
* No message with sender $p$ is delivered unless it was previously broadcast by $p$.
### Partially Synchronous Model and Δ-Synchrony
We say the system is $\Delta$-_synchronous_ if the combined computation and communication delay between any two honest participants is smaller than $\Delta$. We assume that, under $\Delta$-synchrony, any message broadcast by an honest participant is delivered by every honest participant within a time bound of $\Delta$.
In practice, if GossipSub is used for _BEBroadcast_, $\Delta$ is effectively the assumed upper bound on GossipSub latency (e.g., a few seconds).
For termination, we assume a classical partially synchronous model with an unknown bound on communication delays among non-Byzantine participants.
### Properties of F3
F3 is similar to classical consensus but not identical to it. In a nutshell, the reasons why EC and Filecoin should not rely on classical consensus properties (as implemented by existing off-the-shelf BFT consensus protocols) for fast finality are twofold: EC incentive compatibility and resilience to Filecoin power leakage attacks in the case of a strong adversary.
With this in mind, the F3 component interface and properties are given below.
> **Interface and properties of F3 at participant p:**
>
> `F3.invoke (int i, chain canonical, chain baseChain)` **returns** (we say finalizes) `(int i, chain h, proof_of_finality PoF)`
>
> **Properties:**
>
> **Agreement.** If two honest participants finalize $(i,h,\ast)$ and $(i,h',\ast)$, then $h = h'$.
>
> **Validity.** If an honest participant finalizes $(i,h,\ast)$, then $h$ is a prefix of the canonical input chain of some honest participant $p'$ in instance $i$.
>
> **Proof of Finality.** If an honest participant finalizes $(i,h,PoF)$, then $PoF$ is signed by a ⅔ QAP majority corresponding to the $\texttt{PowerTable}(baseChain)$ input in instance $i$ of some honest participant $p'$.
>
> **Progress.** If the system is $\Delta$-synchronous, i.e.,
> * all honest participants can communicate within a known time bound $\Delta$, and
> * no honest participant invokes $\texttt{F3}(i, \ast, \ast)$ later than $\Delta$ after another participant invokes $\texttt{F3}(i, \ast, \ast)$,
>
> Let $c$ be the heaviest common prefix of the inputs of all honest participants in instance $i$. Then, if an honest participant finalizes $(i,c',\ast)$, $c$ is a prefix of $c'$ with probability > 0.5.
>
> **Termination.** If the system is $\Delta$-synchronous, every call to F3 eventually returns with probability 1.
>
> **EC Incentive compatibility.** An honest participant never contributes to gathering ⅔ QAP (super-majority) of votes for a chain $c$ which its local EC instance did not already deliver unless it already observes evidence of such a super-majority for chain $c$.
>
> `F3.ECupdate (int i, chain c)`: Participant $p$ updates F3 instance $i$ with chain $c$, which its EC instance locally delivered (relevant to EC incentive compatibility)
We later show the specific estimate chosen for $\Delta$ to optimize termination (See [here](#Synchronization-of-Participants-in-the-Current-Instance)).
### Consensus Interface
The GossiPBFT consensus protocol is the main dependency of the F3. We will treat GossiPBFT as a black box here and defer the explanation of its internals to the [respective section](#GossiPBFT-Consensus) of this FIP. For now, it suffices to say that it is a Byzantine fault-tolerant consensus protocol that provides the typical properties of (1) agreement (no two honest participants decide differently), (2) validity (the decided value must pass a validity predicate), and (3) termination (every honest participant eventually decides).
We denote the invocation of a consensus instance from a participant as follows:
```
GossiPBFT(i, proposal, baseChain, participants) → decision, PoF
```
A participant that invokes the $\texttt{GossiPBFT()}$ function starts a consensus instance identified by the sequence number $i$ and with an input value $proposal$. The $baseChain$ parameter represents the default value (agreement on not finalizing any new tipset). The $participants$ parameter is composed of the SPs that participate in this instance along with their respective individual weights (i.e., quality-adjusted power). The next section explains how the $participants$ are obtained from the power table. The invocation eventually returns a $decision$ value and a proof of finality $PoF$ that proves $decision$ is indeed the output of this consensus instance. In any invocation from F3, $decision$ is a finalized chain prefix in the form of a tipset. $PoF$ can be verified by using the $\texttt{Verify}()$ function that we explain below:
```
Verify(PoF, decision, participants) → boolean
```
This function uses the $PoF$ to verify that $decision$ was the output of some instance of consensus executed among the $participants$.
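For illustration only, the interface above could be expressed in Go roughly as follows, reusing the `ECTipset` and `PowerTable` structures defined in the message-format section below; the `Chain`, `PoF`, and `Consensus` names are placeholders rather than part of this specification.
```go
// Illustrative sketch only; not a normative API. ECTipset and PowerTable
// are the structures defined in the message-format section below.
type Chain = []ECTipset // a chain prefix; the first entry is the base tipset
type PoF = []byte       // opaque proof-of-finality material (aggregate signature)

// Consensus abstracts a single GossiPBFT instance as used by the F3 loop.
type Consensus interface {
    // Run executes instance i with the local proposal and the previously
    // finalized baseChain, blocking until it returns a decision and its PoF.
    Run(i uint64, proposal, baseChain Chain, participants PowerTable) (decision Chain, proof PoF, err error)

    // Verify checks that proof attests that decision was the output of an
    // instance of consensus executed among participants.
    Verify(proof PoF, decision Chain, participants PowerTable) bool
}
```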
### Power Table and Consensus Participants
The _power table_ in the power actor maintains, among other things, the quality-adjusted power of each SP. Each tipset $t$ in the chain corresponds to a particular version of the power table denoted $\texttt{PowerTable}(t)$ that results from executing all messages in the chain from genesis to $t$. The participants of an instance of GossiPBFT are determined by the power table resulting from the tipset finalized by the previous instance. More rigorously, the input parameter $participants$ of an instance $i$ is obtained from $\texttt{PowerTable}(decision_{i-1})$, where $decision_{i-1}$ is the tipset output by instance $i-1$. If $i$ is the first consensus instance, $decision_{i-1}$ is the genesis tipset.
We assume that $\texttt{PowerTable}(t)$ ignores any unnecessary data in the power table and returns the set consisting of each participant's identity and weight.
### F3 Pseudocode
We now present the pseudocode of the F3 algorithm:
```
// finalizedTipsets is a list of records with tipset and PoF fields representing
// all finalized tipsets with their respective proof of finality.
// It is initialized with the genesis tipset by definition, which needs no proof.
finalizedTipsets[0].tipset ← genesis
finalizedTipsets[0].PoF ← nil
i ← 1
while True:
    participants ← PowerTable(finalizedTipsets[i-1].tipset)
    proposal ← ChainHead()
    finalizedTipset, PoF ← GossiPBFT(i, proposal, finalizedTipsets[i-1], participants)
    finalizedTipsets[i].tipset ← finalizedTipset
    finalizedTipsets[i].PoF ← PoF
    i ← i + 1
```
Each participant keeps track of the finalized tipsets and respective proofs of finality in a data structure named $finalizedTipsets$. It contains one $finalizedTipsets[i]$ entry per consensus instance where $finalizedTipsets[i].tipset$ and $finalizedTipsets[i].PoF$ denote, respectively, the tipset and PoF output by consensus instance $i$. A participant considers a tipset finalized if it is included in $finalizedTipsets$ or is an ancestor of some tipset included in $finalizedTipsets$.
The algorithm starts by initializing $finalizedTipsets[0]$ with the genesis tipset. This tipset is pre-defined as finalized and does not require a PoF. Then, in each iteration $i$ of the loop, it takes the following steps:
1. Obtain the set of participants of instance $i$ from the power table determined by the previously finalized tipset.
2. Call the $\texttt{ChainHead}()$ function that returns the head tipset of the locally observed chain constructed by EC and sets the $proposal$ variable to the returned value. We assume that such a function is available to the finalizer.
3. Execute the instance $i$ of GossiPBFT consensus.
4. Add the returned tipset from the consensus execution to the $finalizedTipsets$ list.
To prevent useless instances of GossiPBFT from repeatedly deciding on the same $baseChain$ when no new epoch has yet provided a new proposal, we make the $\texttt{ChainHead}()$ function blocking: it does not return a proposal until the current chain head differs from the head of the finalized chain.
### Changes to EC: Fork Choice Rule
EC needs to be modified to accommodate the finalization of tipsets by F3. **This is the only change to EC this FIP introduces.**
The current EC fork-choice rule selects, from all the known tipsets, the tipset backed by the most weight. As the F3 component finalizes tipsets, the fork-choice rule must ensure that the heaviest finalized chain is always a prefix of the heaviest chain, preventing reorganizations of the already finalized chain.
We achieve this by adjusting the fork choice rule in the presence of a finalized prefix: the fork choice rule selects the heaviest chain out of all chains whose prefix **matches exactly all tipsets finalized by F3**. In particular, a tipset $t'$ that is a superset of a finalized tipset $t$ in the same epoch is not considered heavier than $t$ itself, despite being backed by more EC power.
This redefinition of the fork choice rule is consistent with the abstract notion of the heaviest chain being backed by the most power because a finalized tipset has been backed by a super-majority of participants in GossiPBFT. In contrast, any non-finalized block in the same epoch is only backed in that epoch by the EC proposer.
We illustrate the updated rule in the following figure, where blocks in blue are finalized blocks, and all blocks are assumed to be proposed by a proposer holding only one EC ticket (the weight of a chain is the number of blocks):
![](../resources/fip-0086/fork.png)
The current EC fork-choice rule would select the tipset $\lbrace D_0, D_1\rbrace$ as the head of the heaviest chain. However, the heaviest finalized tipset is $\lbrace C_3\rbrace$, which is not an ancestor of $\lbrace D_0, D_1\rbrace$. Therefore, the new fork choice rule selects $\lbrace D_3\rbrace$ as the head of the heaviest chain. The reason why $D_4$ is not selected is that its parent tipset does not exactly match the finalized tipset $\lbrace C_3\rbrace$, but a superset of it, i.e. $\lbrace C_3, C_4\rbrace$.
In the event that F3 halts, the fork choice rule allows EC to continue to progress, building a chain of unbounded length, per the current rules, on the last F3-finalized prefix.
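The following Go sketch illustrates the adjusted rule; it is not normative, and the simplified `Tipset` type and helpers here are illustrative stand-ins for the real EC data structures.
```go
// Illustrative-only sketch of the F3-aware fork choice rule.
// A Tipset is reduced to the fields needed by the rule.
type Tipset struct {
    Epoch  uint64
    Key    string  // canonical tipset key
    Parent *Tipset // parent tipset (nil for genesis)
    Weight uint64  // EC weight of the chain ending at this tipset
}

// exactlyExtends reports whether the chain ending at head contains, at each
// finalized epoch, exactly the finalized tipset (a superset at the same
// epoch does not qualify).
func exactlyExtends(head *Tipset, finalized []*Tipset) bool {
    byEpoch := make(map[uint64]string, len(finalized))
    for _, f := range finalized {
        byEpoch[f.Epoch] = f.Key
    }
    matched := 0
    for t := head; t != nil; t = t.Parent {
        if key, ok := byEpoch[t.Epoch]; ok {
            if t.Key != key {
                return false
            }
            matched++
        }
    }
    return matched == len(finalized)
}

// ForkChoice returns the heaviest candidate head whose chain exactly extends
// the F3-finalized prefix, or nil if none qualifies.
func ForkChoice(heads []*Tipset, finalized []*Tipset) *Tipset {
    var best *Tipset
    for _, h := range heads {
        if !exactlyExtends(h, finalized) {
            continue
        }
        if best == nil || h.Weight > best.Weight {
            best = h
        }
    }
    return best
}
```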
### Bootstrapping
Integrating F3 into Filecoin follows the usual path for Filecoin upgrades. One epoch, $upgradeEpoch$, will be defined as the target epoch upon which participants upgrade Filecoin. Then, every participant starts the first instance of F3 with the tipset at the $upgradeEpoch$ minus the 900-epoch lookback as the head tipset of the first $baseChain$, which is assumed by design to be common to all participants.
### GossiPBFT Consensus
This section provides the specification of GossiPBFT, the consensus protocol that is iteratively instantiated by the F3 loop and returns a decision (in the form of an F3-finalized chain) and a PoF per instance.
GossiPBFT is a Byzantine fault-tolerant consensus protocol with optimal resilience, i.e., it tolerates up to less than ⅓ QAP being controlled by a Byzantine adversary. Each instance of the protocol has a known set of participants. Each participant inputs a proposal value, and the protocol outputs one of the input values as the final decision value. We emphasize _final_ because, unlike in a longest-chain protocol, the output of GossiPBFT is permanent.
GossiPBFT was designed with the Filecoin network in mind and presents a set of features that make it desirable in that context:
* Participants can have different weights, which aligns with how storage providers have different amounts of storage committed to the network. A participant's weight in executing the protocol is proportional to their share of QAP.
* The protocol has optimal resilience, i.e., it tolerates a Byzantine adversary controlling up to less than ⅓ QAP.
* Unlike PBFT, HotStuff, Tendermint, and many other protocols in this space, GossiPBFT is a leaderless protocol. This property makes it resistant to denial of service attacks because no designated participant represents the weakest link.
* Low latency. During periods of synchrony and honest proposers, GossiPBFT will finish in three communication steps. We expect this to happen within tens of seconds.
* GossiPBFT has been tailored with Filecoin and open blockchains in mind, with strong resilience against censorship attacks and incentive compatibility for rational participants.
* GossiPBFT is tailored to using a broadcast communication primitive, GossipSub, which Filecoin already uses.
* GossiPBFT's internal invariants, on which the protocol's correctness is based, are very similar to those of the seminal PBFT protocol, making correctness easier to establish.
#### Message format, signatures, and equivocation
Messages include the following fields: $\langle Sender, Signature, MsgType, Value, Instance, [Round, Evidence, Ticket] \rangle$. As $Round$, $Evidence$, and $Ticket$ are fields that not all message types require, their default value is used when a message type does not require them (i.e., $0$, $nil$, and `[]byte{}`, respectively). We refer to a field of a message $m$ as $m.field$:
```go
// A message in the GossiPBFT protocol.
// The same message structure is used for all rounds and phases.
// Note that the message is self-attesting so no separate envelope or signature is needed.
// - The signature field fixes the included sender ID via the implied public key;
// - The signature payload includes all fields a sender can freely choose;
// - The ticket field is a signature of the same public key, so also self-attesting.
type GossiPBFTMessage struct {
    // ID of the sender/signer of this message (a miner actor ID).
    Sender ActorID
    // Vote is the payload that is signed by the signature.
    Vote Payload
    // Signature by the sender's public key over serialized vote payload.
    Signature []byte
    // VRF ticket for CONVERGE messages (otherwise empty byte array).
    Ticket Ticket
    // Evidence that justifies this message, or nil.
    Evidence *Evidence
}

type Evidence struct {
    // Vote is the payload that is signed by the signature.
    Vote Payload
    // RLE+ serialization of the indexes in the base power table of the signers.
    Signers []byte
    // BLS aggregate signature of signers over the vote.
    Signature []byte
}

// Fields of the message that make up the signature payload.
type Payload struct {
    // GossiPBFT instance number.
    Instance uint64
    // GossiPBFT round number.
    Round uint64
    // GossiPBFT step number.
    Step uint8
    // Chain of ECTipset objects proposed/voted for finalisation.
    // Always non-empty; the first entry is the base tipset finalised in the previous instance.
    Value []ECTipset
}

type ECTipset struct {
    Epoch       uint64   // The Filecoin epoch
    TipsetKey   []byte   // A canonical concatenation of the block-header CIDs in a tipset.
    PowerTable  CID      // A blake2b-256 CID of the cbor-encoded power-table.
    Commitments [32]byte // Root of a Merkle tree containing additional commitments.
}

// Table of nodes with power > 0 and their public keys.
// This is expected to be calculated from the EC chain state and provided to GossiPBFT.
// Ordered by (power descending, participant ascending).
type PowerTable = []PowerTableEntry

type PowerTableEntry struct {
    ParticipantID ActorID
    Power         BigInt
    Key           BLSPublicKey
}
```
All messages broadcast by a participant have their participant ID in the sender field and contain a digital signature by that participant $(m.Signature)$ over `"GPBFT:filecoin:":15 || Step:1 || Round:8 || Instance:8 || MerkelizeValue(Value):32` (where `Round` and `Instance` are big-endian encoded). The protocol assumes aggregatable signatures (e.g., BLS, Schnorr), resilient to [rogue public key attacks](https://crypto.stanford.edu/~dabo/pubs/papers/BLSmultisig.html) (see [Boneh, Drijvers, and Neven](https://eprint.iacr.org/2018/483.pdf) construction).
The receiver of a message only considers messages with valid signatures and discards all other messages as invalid. We sometimes omit the sender IDs and signatures in further descriptions for better readability.
Two (or more) messages $m1$ and $m2$ are called _equivocating messages_ if $m1.Sender=m2.Sender \land m1.Instance=m2.Instance \land m1.Value ≠ m2.Value \land m1.MsgType=m2.MsgType \land$ (if applicable) $m1.Round=m2.Round$. We call $m1.Sender$ an _equivocating sender_.
A set of messages $M$ that does not contain equivocating messages is called _clean_. Participants discard all equivocating messages when forming clean sets. In fact, in the version of GossiPBFT proposed in this document, only one of the equivocating messages needs to be stored (thanks to the evidence explicitly validating a message).
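As an illustration (not part of the specification), a clean set could be maintained as follows, keyed by sender, instance, round, and step, reusing the `GossiPBFTMessage` and `ECTipset` structures defined above and the standard library `bytes` package.
```go
// Illustrative sketch of a clean message set: at most one message is kept
// per (sender, instance, round, step); a second message with the same key
// but a different value is an equivocation and is discarded.
type msgKey struct {
    Sender   ActorID
    Instance uint64
    Round    uint64
    Step     uint8
}

type CleanSet struct {
    byKey map[msgKey]*GossiPBFTMessage
}

func NewCleanSet() *CleanSet {
    return &CleanSet{byKey: make(map[msgKey]*GossiPBFTMessage)}
}

// Add stores m unless it duplicates or equivocates with a stored message.
// It returns false iff m is an equivocation.
func (s *CleanSet) Add(m *GossiPBFTMessage) bool {
    k := msgKey{m.Sender, m.Vote.Instance, m.Vote.Round, m.Vote.Step}
    prev, seen := s.byKey[k]
    if !seen {
        s.byKey[k] = m
        return true
    }
    return sameValue(prev.Vote.Value, m.Vote.Value) // equal value: duplicate, not equivocation
}

// sameValue compares two proposed chains by their tipset keys (uses "bytes").
func sameValue(a, b []ECTipset) bool {
    if len(a) != len(b) {
        return false
    }
    for i := range a {
        if !bytes.Equal(a[i].TipsetKey, b[i].TipsetKey) {
            return false
        }
    }
    return true
}
```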
##### `MerkelizeValue`
Instead of encoding `Payload.Value` as CBOR and signing the result, we convert it into a merkle-tree and format the individual `ECTipset` objects in a way that's easy to parse within both snarks and EVM-based blockchains.
Specifically, we encode each `ECTipset` by concatenating:
1. The `Epoch`, encoded as a big-endian 64bit value (8 bytes).
2. The `Commitments` hash as-is (32 bytes).
3. The tipset _CID_, a blake2b CID of the `TipsetKey` encoded as a CBOR byte array (38 bytes). The resulting CID is the "tipset CID". Importantly, its hash digest is the digest returned by the `BLOCKHASH` instruction in Filecoin's EVM implementation.
4. The `PowerTable` CID.
Then, we build a [merkle tree](#merkle-tree-format) from these encoded `ECTipset` values. Only the final merkle tree root gets signed.
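For illustration, the leaf encoding could look as follows; `tipsetCID` is an assumed helper computing the blake2b CID of the CBOR-encoded tipset key (step 3), and the power table CID is assumed to expose its raw bytes via `Bytes()`. This sketch is not normative.
```go
// Illustrative sketch of the per-tipset leaf encoding used by MerkelizeValue:
// Epoch (8 bytes, big-endian) || Commitments (32 bytes) || tipset CID || power table CID.
// Uses the standard library encoding/binary package.
func encodeTipsetLeaf(t ECTipset) []byte {
    var epoch [8]byte
    binary.BigEndian.PutUint64(epoch[:], t.Epoch) // 1. epoch, big-endian
    leaf := append([]byte{}, epoch[:]...)
    leaf = append(leaf, t.Commitments[:]...)       // 2. commitments root, as-is
    leaf = append(leaf, tipsetCID(t.TipsetKey)...) // 3. blake2b CID of the CBOR-encoded tipset key (assumed helper)
    leaf = append(leaf, t.PowerTable.Bytes()...)   // 4. power table CID bytes (assumed accessor)
    return leaf
}
```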
#### Predicates and functions used in the protocol
* `Power(p | P) ∈ [0x0000, 0xffff]`
* Returns the relative QAP of participant $p$ in EC as an integer in the range `[0x0000, 0xffff]`, defined by the power table corresponding to $baseChain$.
* If $p$ is a set of participants, the returned value is $\sum_{p\prime \in p} \texttt{Power}(p\prime | P)$. That is, the sum of the $\texttt{Power}$ of the participants in the set.
* Otherwise, the returned value is a fraction of the total QAP of all participants in the power table scaled to an integer in the range $[0, 2^{16})$, rounded down ($\lfloor (2^{16}-1) \times \texttt{QAP}(p) / \texttt{QAP}(P) \rfloor$).
* `IsStrongQuorum(p | P)`
* Returns $True$ if $\texttt{Power}(p) \ge \lceil ⅔\texttt{Power}(P) \rceil$ where $p$ is a subset of the participants and $P$ is the set of all members in the power table.
* $\texttt{Power}(P)$ will likely be less than $\mathrm{0xffff}$ due to compounded rounding errors.
* `isPrefix(a,b)`
* Returns $True$ if chain $a$ is a prefix of chain $b$. (Each chain is also a prefix of itself.)
* `HasStrongQuorum(prefix,M)`
* Where $M$ is a clean set of messages of the same type and the same round.
* Let $P$ be the set of participants who are senders of messages in $M$ such that their message contains a value with $prefix$ as a prefix. More precisely:
```
Let P={p : ∃ m∈ M: m.sender=p AND isPrefix(prefix,m.value)}
```
then the predicate returns $True$ iff $\texttt{IsStrongQuorum}(P)$.
* `HasStrongQuorumValue(M)`
* Where $M$ is a clean set of messages of the same type and the same round
* The predicate returns $True$ iff there is a value $v$, such that there is a set $P$ of participants who are senders of messages in $M$ such that their message value is exactly $v$ and $IsStrongQuorum(P)$. More precisely:
```
HasStrongQuorumValue(M) = ∃ v: IsStrongQuorum({p : ∃ m∈ M: m.sender=p AND m.value=v})
```
* `StrongQuorumValue(M)`
* Returns $v$ if $\texttt{HasStrongQuorumValue}(M)$ holds, $nil$ otherwise.
* `GreatestTicketProposal(M)`
* Let $M$ be a clean set of CONVERGE messages for the same round.
* If $M = ∅$, the predicate returns $baseChain$.
* If $M \ne ∅$, the predicate returns $m.value$ and $m.evidence$, such that $m$ is the message in $M$ with the greatest ticket after adjusting for the power of the sender ($m.ticket*m.sender.power$). M will never be empty as the participant must at least deliver its own CONVERGE message.
* `Aggregate(M)`
* Where $M$ is a clean set of messages of the same type $T$, round $r$, and instance $i$, with $v=\texttt{StrongQuorumValue}(M)≠nil$.
```
Let M' ← {m ∈ M : m.value = StrongQuorumValue(M)}
```
* Returns a tuple $\langle participants, aggSig \rangle$ where $participants$ is the set of senders of messages in $M'$ (in some compact/compressed representation, i.e., a bitmask with optional run-length encoding) and $aggSig$ is an aggregate signature (BLS) across all of those participants on $(m.i||m.T||m.r||m.v)$ for some $m ∈ M'$.
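The `Power` and `IsStrongQuorum` predicates above can be sketched as follows; this is an illustration of the scaling arithmetic only, assuming QAP values are available as `math/big` integers.
```go
// Illustrative sketch of the scaled power arithmetic (not normative).
const maxScaledPower = 0xffff

// scaledPower returns floor((2^16 - 1) * qap / totalQAP), i.e. a participant's
// QAP as a fraction of the table total, scaled to [0, 0xffff].
func scaledPower(qap, totalQAP *big.Int) uint16 {
    n := new(big.Int).Mul(qap, big.NewInt(maxScaledPower))
    n.Div(n, totalQAP)
    return uint16(n.Uint64())
}

// isStrongQuorum reports whether the scaled powers in subset sum to at least
// ceil(2/3 * tableTotal), where tableTotal is the sum over the whole table
// (typically slightly below 0xffff due to rounding).
func isStrongQuorum(subset []uint16, tableTotal uint32) bool {
    var sum uint32
    for _, p := range subset {
        sum += uint32(p)
    }
    return sum >= (2*tableTotal+2)/3 // ceil(2*tableTotal/3)
}
```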
#### GossiPBFT pseudocode (main algorithm)
We illustrate the pseudocode for GossiPBFT below, consisting of 3 steps per round (QUALITY/CONVERGE, PREPARE, COMMIT) and an additional step outside the round loop (DECIDE). The $Sender$, $Signature$, and $Instance$ fields are omitted from messages for better readability. See also the simplified [PlusCal/TLA+ specification](https://github.com/filecoin-project/tla-f3).
```
GossiPBFT(instance, inputChain, baseChain, participants) → decision, PoF:
\*participants is implicitly used to calculate quorums
1: round ← 0
2: decideSent ← False
3: proposal ← inputChain \* holds what the participant locally believes should be a decision
4: timeout ← 2*Δ
5: value ← proposal \* used to communicate the voted value to others (proposal or 丄)
6: evidence ← nil \* used to communicate optional evidence for the voted value
7: C ← {baseChain}
8: while (NOT decideSent) {
9: if (round = 0)
10: BEBroadcast <QUALITY, value, instance>; trigger (timeout)
11: collect a clean set M of valid QUALITY messages from this instance
until HasStrongQuorum(proposal, M) OR timeout expires
12: C ← C ∪ {prefix : IsPrefix(prefix,proposal) and HasStrongQuorum(prefix,M)}
13: proposal ← heaviest prefix ∈ C \* this becomes baseChain or something heavier
14: value ← proposal
15: if (round > 0) \* CONVERGE
16: ticket ← VRF(Randomness(baseChain) || instance || round)
17: value ← proposal \* set local proposal as value in CONVERGE message
18: BEBroadcast <CONVERGE, value, instance, round, evidence, ticket>; trigger(timeout)
19: collect a clean set M of valid CONVERGE msgs from this round and instance
until timeout expires
20: prepareReadyToSend ← False
21: while (not prepareReadyToSend){
22: value, evidence ← GreatestTicketProposal(M) \* leader election
23: if (evidence is a strong quorum of PREPAREs AND mightHaveBeenDecided(value, r-1)):
24: C ← C ∪ {value}
25: if (value ∈ C)
26: proposal ←value \* we sway proposal if the value is incentive compatible (i.e., in C)
27: prepareReadyToSend ← True \* Exit loop
28: else
29: M = {m ∈ M | m.value != value AND m.evidence.value != evidence.value} \* Update M for next iteration }
30: BEBroadcast <PREPARE, value, instance, round, evidence>; trigger(timeout) \* evidence is nil in round=0
31: collect a clean set M of valid PREPARE messages from this round and instance
until (HasStrongQuorumValue(M) AND StrongQuorumValue(M) = proposal)
OR (timeout expires AND Power(M)>2/3)
32: if (HasStrongQuorumValue(M) AND StrongQuorumValue(M) = proposal) \* strong quorum of PREPAREs for local proposal
33: value ← proposal \* vote for deciding proposal (COMMIT)
34: evidence ← Aggregate(M) \* strong quorum of PREPAREs is evidence
35: else
36: value ← 丄 \* vote for not deciding in this round
37: evidence ← nil
38: BEBroadcast <COMMIT, value, instance, round, evidence>; trigger(timeout)
39: collect a clean set M of valid COMMIT messages from this round and instance
until (HasStrongQuorumValue(M) AND StrongQuorumValue(M) ≠ 丄)
OR (timeout expires AND Power(M)>2/3)
40: if (HasStrongQuorumValue(M) AND StrongQuorumValue(M) ≠ 丄) \* decide
41: evidence ← Aggregate(M)
42: BEBroadcast <DECIDE, StrongQuorumValue(M), instance, evidence>
43: decideSent ← True \* break loop, wait for other DECIDE messages
44: if (∃ m ∈ M: m.value ≠ 丄 s.t. mightHaveBeenDecided(m.value, r)) \* m.value was possibly decided by others
45: C ← C ∪ {m.value} \* add to candidate values if not there
46: proposal ← m.value; \* sway local proposal to possibly decided value
47: evidence ← m.evidence \* strong PREPARE quorum is inherited evidence
48: else \* no participant decided in this round
49: evidence ← Aggregate(M) \* strong quorum of COMMITs for 丄 is evidence
50: round ← round + 1;
51: timeout ← updateTimeout(timeout, round)
52: } \*end while
53: collect a clean set M of valid DECIDE messages
until (HasStrongQuorumValue(M)) \* collect a strong quorum of decide outside the round loop
54: return (StrongQuorumValue(M), Aggregate(M)) \* terminate the algorithm with a decision
```
```
\* decide anytime
55: upon reception of a valid <DECIDE, instance, v, evidence> AND not decideSent
56: decideSent ← True
57: BEBroadcast <DECIDE, v, instance, evidence>
58: go to line 53.
```
The helper function `mightHaveBeenDecided` returns False if, given the already delivered messages, the participant knows for a fact that no correct participant could have decided the given value in the given round, even in the presence of an equivocating adversary controlling less than ⅓ of the QAP; it returns True otherwise:
```
59: mightHaveBeenDecided(value, round):
60: M ← { m | m.step = COMMIT AND m.round = round AND m is valid}
61: M' ← { m | m ∈ M AND m.value = value }
62: if Power(M') + (1-Power(M)) < ⅓: \* value cannot have been decided
63: return False
64: return True
```
Note that the selection of the heaviest prefix in line 13 does not need access to the tipsets' EC weights, as only prefixes that extend each other can gather a strong quorum in QUALITY. In other words: if there are two tipsets $t_1, t_2$ that each gather a strong quorum of QUALITY, then either the chain that has $t_1$ as head tipset is a prefix of the chain that has $t_2$ as head, or vice versa (since the adversary controls < ⅓ of the QAP). As a result, selecting the heaviest prefix is as simple as selecting the prefix with the greatest number of blocks, while checking, as a sanity check, that all proposed prefixes that gather a strong quorum in QUALITY extend each other.
Implementations may optimise this algorithm by treating the reception of an aggregated signature over some (MsgType, Instance, Round, Value) tuple, carried as evidence of a message, as if the same tuple of values had been received directly from each participant in the aggregate. This may be used to effectively skip a partially-complete step. In the particular case of a DECIDE message, which carries evidence of a strong quorum of COMMITs for the same round and value, a participant immediately sends its own DECIDE for the same value (copying the evidence) and exits the loop at line 8.
Also, concurrently, we expect the participant to feed GossiPBFT chains that are incentive-compatible with EC. To this end, we restrict the set $C$ of candidate values that the participant contributes to deciding to only values that either (i) pass the QUALITY step or (ii) may have been decided by other participants (hence the function `mightHaveBeenDecided`).
#### Valid messages and evidence
The $\texttt{Valid}()$ predicate (referred to in lines 11, 19, 31, 39, 53, and 55) is defined below.
```
Valid(m): | For a message m to be valid,
if m.signature does not verify | m must be properly signed.
return False |
if m.instance != instance | m must match the instance ID.
return False |
if m.sender has zero power or is unknown: | The sender must have power and be known.
return False |
if NOT (m.value = 丄 | m's value must extend baseChain
OR baseChain.isPrefix(m.value)) | or be 丄
return False |
if m.step != CONVERGE AND m.ticket != nil | ticket must be nil if not CONVERGE
return False |
if m.step = QUALITY AND | A QUALITY message must
(m.round != 0 OR m.value = 丄) | have round 0 and non-丄 value.
return False |
if m.step = CONVERGE AND | A CONVERGE message must
(m.round = 0 OR m.value = 丄 OR | have non-zero round, non-丄 value,
m.ticket does not pass VRF validation) | and a valid ticket.
return False |
if m.step = DECIDE AND | A DECIDE message must
(m.round != 0 OR m.value = 丄) | have round 0 and non-丄 value.
return False |
if m.step IN {QUALITY} | If the step never requires evidence
if m.evidence != nil: | (QUALITY), evidence should not be present.
return False |
else |
return ValidEvidence(m) |
return True |
```
The $\texttt{ValidEvidence}()$ predicate is defined below. Note that QUALITY messages do not need evidence.
```
ValidEvidence(m= <step, value, instance, round, evidence, ticket>):
if ( step = PREPARE and round = 0) | in the first round,
AND (evidence = nil) | evidence for PREPARE
return True | is nil
if (step = COMMIT and value = 丄) | a COMMIT for 丄 carries no evidence
AND (evidence = nil)
return True
If (evidence.instance != instance) | the instance of the evidence must be the
return False | same as that of the message
| valid evidences for
return (step = CONVERGE OR (step = PREPARE AND round>0) | CONVERGE and PREPARE in
AND (∃ M: Power(M)>⅔ AND evidence=Aggregate(M) | round>0 is strong quorum
AND ((∀ m’ ∈ M: m’.step = COMMIT AND m’.value = 丄) | of COMMIT msgs for 丄
OR (∀ m’ ∈ M: m’.step = PREPARE AND | or PREPARE msgs for
m’.value = value)) | CONVERGE value
AND (∀ m’ ∈ M: m’.round = round-1))) | from previous round
OR (step=COMMIT | valid COMMIT evidence
AND (∃ M: Power(M)>⅔ AND evidence=Aggregate(M) | is a strong quorum
AND ∀ m’ ∈ M: m’.step = PREPARE | of PREPARE messages
AND ∀ m’ ∈ M: m’.round = round | from the same round
AND ∀ m’ ∈ M: m’.value = value)) | for the same value, or
OR (step = DECIDE | valid DECIDE evidence
AND (∃ M: Power(M)>⅔ AND evidence=Aggregate(M) | is a strong quorum
AND ∀ m’ ∈ M: m’.step = COMMIT | of COMMIT messages
AND ∀ m’ ∈ M: m’.round = round | from the same round
AND ∀ m’ ∈ M: m’.value = value)) | for the same value
```
#### Evidence verification complexity
Note that during the validation of each CONVERGE, COMMIT, and DECIDE message, two signatures need to be verified: (1) the signature of the message contents, produced by the sender of the message (first check above) and (2) the aggregate signature in $m.evidence$ ($\texttt{ValidEvidence}(m)$ above). The former can be verified using the sender's public key (stored in the power table). The latter, being an aggregated signature, requires aggregating public keys of all the participants whose signatures it combines. Notice that the evidence of each message can be signed by a different set of participants, which requires aggregating a linear number of public BLS keys (in system size) per evidence verification.
However, a participant only needs to verify the evidence once for each *unique* message (in terms of (step, round, value)) received. Moreover, verification can further be reduced to those messages a participant uses to make progress (such as advancing to the next step or deciding), ignoring the rest as they are not needed for progress. This reduces the number of evidence verifications in each step to at most one.
#### Randomness
An instance of GossiPBFT requires a seed σ that must be common among all participants due to the dependency on a VRF in the CONVERGE step. As only one random seed is required per GossiPBFT instance, and there is only one GossiPBFT instance per epoch (see below), the random seed for GossiPBFT can be sourced from drand's output for that epoch in EC.
### Synchronization of Participants in the Current Instance
GossiPBFT ensures termination provided that (i) all participants start the instance at most $\Delta$ apart, and (ii) the estimate on Δ is large enough for $\Delta$-synchrony in some round (either because of the real network delay recovering from asynchrony or because of the estimate increasing).
[Given prior tests performed on GossipSub](https://research.protocol.ai/publications/gossipsub-v1.1-evaluation-report/vyzovitis2020.pdf) (see also [here](https://gist.github.com/jsoares/9ce4c0ba6ebcfd2afa8f8993890e2d98)), we expect that sent messages will reach almost all participants within $Δ=3s$, with a majority receiving them already within $Δ=2s$. However, if several participants start the instance $Δ + ε$ after some other participants, termination is not guaranteed for the selected timeouts of $2*Δ$. Thus, we do not rely on an explicit synchrony bound for correctness. Instead, we increase the estimate of Δ locally within an instance as rounds progress without a decision.
The synchronization of participants is performed in the call to $\texttt{updateTimeout(timeout, round)}$ (line 51), and works as follows:
* Participants start an instance with $Δ=2s$.
* Participants set their timeout for the current round to $Δ*1.3^{r}$ where $r$ is the round number ($r=0$ for the first round).
Additionally, as an optimization, participants may continue to the subsequent step once the execution of the current step has been determined, without waiting for the timeout. For example, if a participant receives QUALITY messages from all participants, it can proceed to the next step without waiting for the timeout. More generally, if the remaining valid messages to be received cannot change the execution of the step, regardless of the values contained in the messages, then a participant may continue to the next step.
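As a sketch of this schedule (illustrative only, using the 2s base estimate and the 1.3 growth factor described above):
```go
// Illustrative sketch of the per-round timeout schedule: a base estimate Δ of
// 2s, grown by a factor of 1.3 per round without a decision.
// Uses the standard library time and math packages.
const baseDelta = 2 * time.Second

func roundTimeout(round int) time.Duration {
    return time.Duration(float64(baseDelta) * math.Pow(1.3, float64(round)))
}
```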
### Synchronization of Participants across Instances
The $finalizedTipsets$ data structure is disseminated among the participants and other observers of the Filecoin network in the same manner as the chain itself. In particular, newly joined and temporarily disconnected participants and observers wishing to download and verify the chain can obtain the $finalizedTipsets$ data from other participants or observers.
To verify a finality decision, assuming prior knowledge of an already-final base, a client needs:
* The power table entries from the base
* A set of signed DECIDE messages for that base that agree on the same value, from distinct senders comprising more than ⅔ QAP in the base
* The public keys for those signers
A finality certificate brings them together.
```
type FinalityCertificate struct {
    // Instance, Value as for GossiPBFTMessage (DECIDE)
    Instance int
    Value    []ECTipset
    // Indexes in the base power table of the certifiers (bitset)
    Signers []byte
    // Aggregated signature of the certifiers
    Signature []byte
    // Changes to the base power table implied by the certified tipset
    PowerTableDelta []PowerTableDelta
}

type PowerTableDelta struct {
    // Participant with changed power
    ParticipantID ActorID
    // Change in power from base (signed)
    PowerDelta BigInt
    // New signing key if relevant (else empty)
    SigningKey Key
}
```
A finality certificate is analogous to a full block in the EC block exchange protocol.
#### Exchange protocol
As with EC block exchange, participants follow **a simple request/response protocol to provide a contiguous sequence of finality certificates**. Note that the sequence of certificates is traversed backward, matching block exchange. A finality certificate contains the tipset and its PoF ($Signature$). PoFs may vary across participants, but all PoFs must be for the same tipset at the same instance number $i$.
```
type Request struct {
    // Highest GossiPBFT instance number, to fetch first.
    LastInstance int
    // Number of certificates to fetch backward from LastInstance, inclusive.
    Length int
}

type Response struct {
    // OK, or some error code
    Status int
    // Sequence of certificates, in reverse order, beginning with the last requested.
    Certificates []FinalityCertificate
}
```
#### Certificate verification
The client's algorithm to verify a finality certificate is roughly:
* Check that the instance number and base tipset follow from the previous certificate
* Load the power and public key for each sender from the base power table
* Verify that the sum of the power of signers exceeds ⅔ of the total power in the table
* Compute the _GossiPBFTMessage_ corresponding to a DECIDE by each sender
* Verify the BLS aggregate signature against the list of messages and keys
Further, a client prepares to verify the next finality certificate by:
* Recording the decided tipset as the new base
* Applying power table deltas to the base power table
Certificate verification must bottom out in some genesis certificate, which is assumed to be trusted and does not need to be verified. The introduction of the F3 protocol will require such a genesis to bootstrap the protocol, and this genesis state is also needed to root a certificate chain.
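The steps above might look roughly as follows in Go. This is an illustrative sketch, not a normative implementation: `signerIndexes`, `decideSigningPayload`, and `blsVerifyAggregate` are assumed helpers (bitset decoding, the DECIDE signing payload described in the signing-format section, and BLS aggregate verification), and power values are treated as `math/big` integers.
```go
// Illustrative sketch of verifying a finality certificate against a known,
// already-verified base power table. Checking that the instance number and
// base tipset follow from the previous certificate is omitted here.
// Uses the standard library fmt and math/big packages.
func VerifyCertificate(cert FinalityCertificate, base PowerTable) error {
    totalPower := new(big.Int)
    for _, entry := range base {
        totalPower.Add(totalPower, entry.Power) // Power treated as *big.Int for illustration
    }
    signerPower := new(big.Int)
    var keys []BLSPublicKey
    for _, idx := range signerIndexes(cert.Signers) { // assumed helper: decode the signer bitset
        if idx >= len(base) {
            return fmt.Errorf("signer index %d out of range", idx)
        }
        signerPower.Add(signerPower, base[idx].Power)
        keys = append(keys, base[idx].Key)
    }
    // Strong quorum: signers must hold more than 2/3 of the total power.
    if new(big.Int).Mul(signerPower, big.NewInt(3)).Cmp(new(big.Int).Mul(totalPower, big.NewInt(2))) <= 0 {
        return fmt.Errorf("certificate signers do not form a strong quorum")
    }
    // All signers signed the same DECIDE payload for this instance and value.
    payload := decideSigningPayload(uint64(cert.Instance), cert.Value) // assumed helper, sketched later
    if !blsVerifyAggregate(keys, payload, cert.Signature) {            // assumed helper
        return fmt.Errorf("invalid aggregate signature")
    }
    return nil
}
```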
#### Power table deltas
**A verifier needs each signer's power and public key** (GossiPBFT also needs these for its execution). These are provided by the power table associated with each finalized tipset.
Assuming some genesis tipset with a known power table, a _GossiPBFTMessage_ can provide a commitment to (CID of) the resulting power table but doesn't provide the actual entries. A finality certificate includes these as a delta from the previous power table. A verifier can compute the power table for the subsequent certificate by applying these deltas to its base power table.
Note that the power table deltas don't need to be signed. All DECIDE messages include a commitment to the CID of the resulting power table. Hence, a verifier only needs to confirm that the power table CID resulting from its computation matches the one committed to by the signed messages. A finality certificate with incorrect power table deltas cannot be used as a base to verify subsequent instances.
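A sketch of applying the deltas, again illustrative only: power values and signing keys are treated as big integers and byte slices, and the final pruning, re-sorting, and CID check are only indicated in comments.
```go
// Illustrative sketch of deriving the next base power table from the current
// one plus the certificate's deltas.
func applyPowerDeltas(base PowerTable, deltas []PowerTableDelta) PowerTable {
    next := append(PowerTable{}, base...) // copy the base table
    index := make(map[ActorID]int, len(next))
    for i, e := range next {
        index[e.ParticipantID] = i
    }
    for _, d := range deltas {
        if i, ok := index[d.ParticipantID]; ok {
            next[i].Power = new(big.Int).Add(next[i].Power, d.PowerDelta) // Power/PowerDelta as *big.Int for illustration
            if len(d.SigningKey) > 0 {                                    // keys treated as byte slices
                next[i].Key = d.SigningKey
            }
        } else {
            next = append(next, PowerTableEntry{d.ParticipantID, d.PowerDelta, d.SigningKey})
        }
    }
    // A real implementation would drop zero-power entries, re-sort by
    // (power descending, participant ascending), and check that the CID of
    // the result matches the power table CID committed in the decided tipset.
    return next
}
```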
#### Verification by Filecoin node (“fast catch-up”)
A new Filecoin validating node can listen to GossiPBFT gossip to learn about the current instance number. It can then issue exchange requests to other nodes to fetch (in parallel) the complete sequence of finality certificates back to their genesis. **These certificates can be verified without reference to the EC chain.**
At the same time, the node can sync the headers (and only the headers) of the EC chain and fetch a snapshot of the state at or before the last finalized epoch. It can then verify that:
* The EC block headers are correctly computed from their ancestors (except for the state root).
* The snapshot state root CID matches that in the block headers.
* The block headers are final.
This produces a fully trusted chain sync without revalidating the EC chain (which takes days) or fetching the message history. Today, many nodes simply trust that the snapshot they obtain is on the “right” chain; **this fast catch-up is, therefore, both faster and more secure (no eclipse) than revalidating the EC chain.**
#### Verification by light client
A light client can also verify the most recent finalized tipset without validating the EC chain execution. It can fetch just one epoch of block headers and a subset of the state tree to verify some specific state or produce a valid transaction against that state.
#### Verification by smart contract
The finality certificates can be submitted (in order) to a smart contract. The smart contract must maintain the power table of the last finalized tipset in its state.
A smart contract will never accept a finality certificate earlier than those it has already validated. It will be safe from long-range attacks as long as it is kept reasonably up-to-date with finality decisions as F3 progresses. Like a new client, a new smart contract must be carefully initialized to ensure it follows the right certificate chain.
Only one such smart contract is needed on a single blockchain execution environment, which can provide the resulting final tipset information to others.
### Merkle Tree Format
This protocol uses merkle trees as vector commitments to commit to both (a) the tipsets in an instance and (b) the values committed to in each tipset. We use a single binary merkle tree format in both cases.
Specifically, we use a balanced binary merkle tree where:
1. The hash function is `keccak256` (same as Ethereum) for maximum EVM compatibility.
2. Each internal node is prefixed with a single `0x0` byte before hashing (`keccak256([0] + left + right)`).
3. Each leaf is prefixed with a single `0x1` byte before hashing (`keccak256([1] + value)`).
4. Missing values and empty sub-trees hash to `[32]byte{}` (32 `0x0` bytes).
In our tree, we prove that some value `v` is committed to at index `i` in the merkle tree rooted at `r` by providing the "uncles" (the branches not taken) of the merkle path from `v` to `r` (bottom to top). This algorithm is best described in [code](../resources/fip-0086/merkletree.go).
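The following sketch illustrates these rules; the linked Go implementation is the reference, and details such as the exact tree shape for non-power-of-two leaf counts should be taken from there. It assumes `golang.org/x/crypto/sha3` for Keccak-256.
```go
// Illustrative sketch of the merkle tree rules above (keccak256, 0x00-prefixed
// internal nodes, 0x01-prefixed leaves, zero hashes for empty subtrees).
// The linked merkletree.go is the reference; this is not normative.
func merkleRoot(values [][]byte) [32]byte {
    size := 1
    for size < len(values) {
        size *= 2 // pad the leaf count up to a power of two
    }
    return subtreeRoot(values, 0, size)
}

func subtreeRoot(values [][]byte, start, width int) [32]byte {
    if start >= len(values) {
        return [32]byte{} // rule 4: empty subtree / missing value
    }
    if width == 1 {
        return keccak256(append([]byte{1}, values[start]...)) // rule 3: leaf
    }
    left := subtreeRoot(values, start, width/2)
    right := subtreeRoot(values, start+width/2, width/2)
    return keccak256(append([]byte{0}, append(left[:], right[:]...)...)) // rule 2: internal node
}

func keccak256(data []byte) (out [32]byte) {
    h := sha3.NewLegacyKeccak256() // golang.org/x/crypto/sha3
    h.Write(data)
    copy(out[:], h.Sum(nil))
    return out
}
```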
## Design Rationale
Many possibilities were considered for faster finality. We justify the architecture of F3 as the best candidate to achieve fast finality in Filecoin by comparing it with other options:
* _Improvements to Filecoin's EC_: Longest chain protocols rely on the increasing probability over time that some state is consistent across all participants. In all existing implementations of longest-chain protocols, finalization thus incurs at least tens of minutes.
* _Best-effort finality gadgets_: Ethereum's approach to faster finality involves the introduction of a novel finality gadget. However, this finality gadget requires at least two epochs for finalization, translating to more than 30 seconds in Filecoin. In contrast, GossiPBFT can achieve sub-epoch finalization and is only limited by the network speed. Moreover, Ethereum's protocol is yet to be proven correct. In fact, multiple attacks have been described in the literature that show how an adversary can keep the system from ever finalizing blocks, even after fixes.
* _Sub-sample voting_: Avalanche's sub-sample voting copes with arbitrarily large groups of participants by repeatedly sub-sampling constant-size subsets. Unfortunately, Avalanche's strong network assumptions make the protocol vulnerable to an adversary exploiting network delays. Moreover, Avalanche ensures correctness against an adversary controlling at most ⅕ of participants, versus ⅓ targeted by GossiPBFT. Even when GossiPBFT becomes amenable to committees, for reasonable committees of 600 participants, GossiPBFT already tolerates Avalanche's ⅕ of total participants with significantly weaker network assumptions.
* _Replacing EC with BFT_: Replacing EC with a BFT protocol (like GossiPBFT) would imply a huge code refactoring, leading to more bugs and significantly more time to develop and productionize the Filecoin upgrade. Furthermore, the combination of EC and F3 is more redundant, ensuring that the system continues operating in the unlikely case that F3 halts.
### Smart Contract Validation
Several design choices were made to reduce the cost of smart-contract validation, although further improvements are required before it's practical.
#### `ECTipset` Contents
The `ECTipset` contains 4 values:
1. The epoch.
2. The tipset key itself.
3. A CID of the power table.
4. The root of the commitments merkle tree.
We included the epoch directly in the `ECTipset` object itself because we expect proofs of the form "event X happened before/after/at epoch Y" to be common in cross-chain interactions.
The tipset key is required as it is the main value F3 is finalizing. While off-chain light clients will use the tipset key, cross-chain bridges likely won't, which is why we _hash_ the tipset key before signing (so we won't have to send the full tipset key to bridge contracts, saving gas).
We include a blake2b hash of the power table to aid in syncing the F3 chain. In theory, we could have added it to the commitment merkle tree and left it out of the root `ECTipset` object, but being able to validate a chain of finality certificates without any additional merkle proofs is convenient, and an extra 32 bytes is not an unreasonable cost (512 gas on Ethereum).
Finally, we include the root hash of the commitments merkle-tree. This tree is _empty_ at the moment but will eventually include:
1. Commitments to snark-friendly representations of the power table.
2. Events, likely indexed by emitter (useful for cross-chain messaging).
And probably more. Importantly, the network can easily commit to additional values here without altering the `ECTipset` structure (and potentially breaking bridges).
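Putting the fields above together, a sketch of the structure might look as follows; the field names and Go types here are illustrative, not the canonical go-f3 definitions.

```go
package f3sketch

import "github.com/ipfs/go-cid"

// ECTipset is a sketch of the structure described above.
type ECTipset struct {
	// Epoch is embedded directly so "event X happened before/after/at epoch Y"
	// proofs need no extra merkle proofs.
	Epoch int64
	// Key is the tipset key being finalized; it is hashed before signing so
	// bridge contracts never need the full key.
	Key []byte
	// PowerTable is a (blake2b) CID of the power table, aiding F3 chain sync.
	PowerTable cid.Cid
	// Commitments is the root of the (currently empty) commitments merkle tree.
	Commitments [32]byte
}
```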
#### Wire Format v. Signing Format
We encode both the `Payload` and the `ECTipset`s one way for signing and another way when sending them over the network.
1. Even though we end up signing a root of a merkle-tree of the tipsets, we have to send the full list of tipsets over the network as the protocol needs a way to find common prefixes.
2. Even though we end up signing a CID of the tipset key, we have to send the full tipset key over the wire because the Filecoin chainsync protocol names tipsets by tipset key, not tipset CID/hash. By sending the full key, a client can learn about _new_ tipsets from F3 if it doesn't hear about them from its peers.
On the other hand, when signing, we tend to sign hashes and merkle-trees so we can easily omit unnecessary data from proofs sent across bridges.
##### Certificate/Decision Signing Format
Decision payloads are designed to be easy to work with in the EVM (32-byte alignment) and within zkSNARKs (small with fixed-length fields). Instead of naively signing a CBOR-encoded value, we sign `Payload` objects such that decisions are encoded as follows:
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 |
|-----------|---|---|---|---|---|---|---|---|---|---|----|----|----|----|----|-----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|----|
| **00-31** | G | P | B | F | T | : | f | i | l | e | c | o | i | n | : | 0x5 | . | . | . | . | . | . | . | . | x | x | x | x | x | x | x | x |
| **32-63** | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y | y |
Where `.` is a zero byte, `xxx` is the instance (a 64-bit big-endian integer), and `yyy` is the root hash of the tipset merkle tree.
Importantly, the instance and the merkle tree root are 32-byte aligned, and the first 24 bytes (0-23) are constant. A sketch of this layout appears after the note below.
However, note that:
1. The BLS curve we use for this signature is not zkSNARK friendly.
2. The function we use to hash the payload onto a BLS curve point uses blake2s (also not snark friendly).
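As a sketch (not the normative encoder), the 64-byte buffer shown in the table could be assembled as follows; nothing beyond what the table shows (for example, the meaning of the `0x5` byte) is asserted here.

```go
package f3sketch

import "encoding/binary"

// decideSigningPayload lays out the 64-byte signing buffer exactly as in the
// table above: a constant prefix in bytes 0-23 ("GPBFT:filecoin:", the 0x5
// byte, and eight zero bytes), the instance as a 64-bit big-endian integer in
// bytes 24-31, and the tipset merkle-tree root in bytes 32-63.
func decideSigningPayload(instance uint64, tipsetsRoot [32]byte) [64]byte {
	var buf [64]byte
	copy(buf[0:15], "GPBFT:filecoin:")
	buf[15] = 0x05
	// bytes 16-23 are already zero
	binary.BigEndian.PutUint64(buf[24:32], instance) // 32-byte aligned
	copy(buf[32:64], tipsetsRoot[:])                 // 32-byte aligned
	return buf
}
```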
##### ECTipset Signing Format
ECTipsets are encoded (for signing) as:
1. Two fixed-size values first: the epoch (8 bytes) and the commitments root (32 bytes). That way, EVM smart contracts can extract these values at fixed offsets without any parsing.
2. Two "variable length" (though fixed-size in practice) values: the tipset CID and the power table CID. Cross-chain bridges should generally ignore these values.
Additionally, the epoch comes _before_ the commitments because this lets a bridge arrange for padding such that both the epoch and the commitments root hash are 32-byte aligned. A sketch of this encoding follows.
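The sketch below follows the ordering described above; the big-endian epoch and the length-prefix framing of the two CID fields are assumptions made for illustration only.

```go
package f3sketch

import (
	"bytes"
	"encoding/binary"
)

// marshalECTipsetForSigning sketches the ordering described above: fixed-size
// fields first (so an EVM contract can read them at fixed offsets), then the
// two CIDs, which bridges generally ignore.
func marshalECTipsetForSigning(epoch uint64, commitments [32]byte, tipsetCID, powerTableCID []byte) []byte {
	var buf bytes.Buffer
	// Epoch at offset 0 (8 bytes), commitments root at offset 8 (32 bytes).
	var fixed [8]byte
	binary.BigEndian.PutUint64(fixed[:], epoch)
	buf.Write(fixed[:])
	buf.Write(commitments[:])
	// "Variable length" fields (fixed-size in practice); a bridge can ignore
	// everything past byte 40.
	for _, c := range [][]byte{tipsetCID, powerTableCID} {
		var l [4]byte
		binary.BigEndian.PutUint32(l[:], uint32(len(c))) // assumed framing
		buf.Write(l[:])
		buf.Write(c)
	}
	return buf.Bytes()
}
```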
#### Merkle Trees
We put both the tipsets and the commitments inside tipsets into merkle trees to keep proof-sizes (for bridges) small, even if we have a large instance with many tipsets.
Our merkle tree is heavily based on the [merkle tree from CometBFT][comet-merkle], but that one:
1. Uses sha256 (not EVM friendly).
2. Requires more complex math during validation. This lets it avoid a few hash operations in some cases, but that likely isn't worth the complexity.
3. Needs additional out-of-band information to be a _true_ vector commitment: optimization 2 in the CometBFT merkle tree means that a proof for item 7 of 7 can also be treated as a proof for item 6 of 6.
Most other well-known merkle tree formats (verkle trees, etc.) optimize for sparse trees, whereas ours are _full_ trees, so the extra complexity isn't worth it.
[comet-merkle]: https://github.com/cometbft/cometbft/blob/main/spec/core/encoding.md#merkle-trees
## Backwards Compatibility
The implementation of F3 involves three main tasks:
1. Implementing GossiPBFT in a modular way (no changes to existing codebase).
2. Re-formulating the fork-choice rule as the heaviest suffix of the heaviest finalized chain.
3. Implementing the F3 component, which reads the EC chain and marks blocks as final locally.
Because of changes to the EC fork choice rule, this FIP requires a network upgrade to take effect.
## Test Cases
1. Unit tests
2. Test propagation times during network load to set an estimate on $\Delta$
1. 3.5k participants sending messages of size >120B over GossipSub (measuring the time to receive the first message, a strong quorum of messages, and all messages).
2. The same test but for DECIDE messages (whose GossipSub topic also includes passive nodes, not just participants).
3. Protocol-specific tests:
- Correctness tests (synchrony is assumed for tests unless stated otherwise):
* Best case: all participants start with the same input.
- Expected Behavior (EB): all participants decide in round 1.
* No synchrony: all participants start with the same input but there is no synchrony at first (less than ⅔ QAP receive the QUALITY messages on time from the rest). Synchrony is restored only after QUALITY finishes.
- EB: all participants decide $baseChain$.
* No quality: participants holding <⅔ QAP start GossiPBFT with a chain $c$ and the rest with a chain $c'$, such that $c$ and $c'$ only share $baseChain$ as prefix.
- EB: all participants decide $baseChain$.
* Prefix quality: participants holding <⅔ QAP start GossiPBFT with a chain $c$ and the rest with a chain $c'$, such that $c$ and $c'$ share some prefix $c''$ that extends $baseChain$.
- EB: all participants decide $c''$.
* <a id="three-partitions">Decision of different participants in different rounds</a>: three partitions, $P$, $Q$, $T$, start with different input chains $c$, $c'$, $c''$, such that $c$ is a prefix of $c'$ and $c'$ is a prefix of $c''$ (they strictly extend each other, and none is $baseChain$). $Q$ holds >½ QAP, while $P$ and $T$ split the remaining QAP equally. There is no synchrony between $P$ and $T$ until the beginning of the second round.
- EB: In the first round, participants in $P$ vote for $c$, while the rest vote for $c'$ and decide $c'$. At the start of the second round, once synchrony is restored, $P$ also decides $c'$.
* Membership-change:
- New participants (>⅓ QAP) join the power table and catch up to the latest instance.
- EB: Successful catch-up to the currently executed instance and decision in this instance.
- New participants (much less than ⅓ QAP) join the power table and catch up to the latest instance.
- EB: Successful catch-up to the currently executed instance and decision in it. Make sure new participants catch up with the rest in a timely manner (as progress will happen while they catch up).
* Significant network delays:
- Participants holding ½ QAP start the instance 10 seconds after the other half.
- EB: Participants eventually decide.
- All messages are delayed by 10 seconds.
- EB: Participants eventually decide.
* Tests on message validity:
- Invalid messages should be discarded; this includes:
* Old/decided instance
* Invalid ticket
* Invalid signature
- Also: signature by a non-participant according to the corresponding power table.
* Invalid evidence
- Evidence containing invalid signature
- Evidence for a different message
- Evidence insufficient by QAP according to the corresponding power table.
* Invalid value
- Value not extending the corresponding baseChain
- Value not being acceptable (after QUALITY step)
- Tests under faults:
* Crashing:
- <⅓ QAP does not send any message from the beginning; the rest share the same input.
- EB: the common input is decided in the first round.
- \>⅔ QAP send QUALITY messages for some chain $c$, while the rest for some chain $c'$ ($c$ and $c'$ only share $baseChain$ as common prefix). After sending QUALITY messages, participants that sent $c$ and that hold <⅓ QAP crash.
- EB: $c$ is decided (participants that sent $c'$ swayed to $c$) in the first round.
- Same setup as the test with [three different partitions](#three-partitions), but participants holding less than ⅓ QAP in Q crash after sending QUALITY. One more participant (still in total <⅓ QAP), holding the greatest ticket, crashes in the second round right after sending its CONVERGE.
- EB: the chain voted by the crashed participant (i.e., $c'$) is decided.
* Crash-recovery:
- Repeat tests and their EB when, at a given moment, all participants crash and recover (can be done on smaller scales)
- Make tests for crashes at different points of the protocol through fuzzing
* Equivocating:
- Three partitions: $F$ holding <⅓ QAP, and $P$ and $Q$ holding the rest of the QAP (with $P$ holding more QAP than $Q$). In the first round, participants in $P$ do not receive messages from $Q$ on time, and vice versa. Participants in $F$ send QUALITY for $c$ to $P$ and QUALITY for $c'$ to $Q$, and do the same with PREPARE, PROPOSE, and COMMIT. Synchrony is restored in round 2.
- EB: in the first round, participants in $P$ decide $c$. Participants in $Q$ go to round 2, where they also decide $c$. Participants in $F$ are eventually detected as equivocating by $P$ and $Q$.
- Same setup as the test with [three different partitions](#three-partitions), but participants holding just less than ⅓ QAP in $Q$ equivocate after sending QUALITY by sending PREPARE and PROPOSE for $c$ to $P$ and for $c'$ to honest participants in $Q$ and $T$.
- EB: honest participants in $Q$ and $T$ decide $c'$ in the first round. Once synchrony is restored in the second round, participants in $P$ decide $c'$ too (possibly subject to the greatest ticket being held by a non-Byzantine participant).
- Flooding messages for future instances/rounds
- Participants holding <⅓ QAP keep sending QUALITY and PREPARE messages for future rounds/instances (as they do not require evidence), intending to flood memory.
- EB: the system can handle buffering a large load of future messages.
- Participants proposing chains that do not extend the base chain (EB for all of these: the messages are discarded as the value is invalid):
- Superset (the proposed chain builds on a tipset that contains the blocks of the $baseChain$ head tipset plus others, rather than exactly the head tipset)
- Subset (the proposed chain builds on a tipset that contains only some of the blocks of the $baseChain$ head tipset, rather than exactly the head tipset)
- Disjoint tipset
- Non-liveness of GossiPBFT for 900 epochs or more. The setup for this is that >⅓ QAP crashes.
- Expected behavior:
- EC finalizes tipsets
- Honest participants stop participating, so nothing further is finalized (handbrake)
4. Integration tests:
- Test the 0th instance: all participants start by assuming $baseChain$ is the tipset at epoch $upgradeEpoch - 900$.
- EB: F3 successfully bootstraps with a decision.
- Test under Byzantine behavior (<⅓ QAP) and network delays.
- Test upgrade: same as above, but performed on the calibration network (or a simulated upgrade), not in isolation (as above).
- F3 decides a chain in an instance that is not extended by the current EC chain
- EB: All participants change their heaviest chain to extend the new finalized head tipset.
5. Performance tests (realistic but different geographical and weight distributions and number of participants):
- Latency
- Throughput
- Practical limit on participation
## Security Considerations
The modifications proposed in this FIP have far-reaching implications for the security of the system. This FIP changes Filecoin at a fundamental level: from considering tipsets as final after some time to finalizing them after a quorum of participants reach an agreement. We list here the security considerations this modification entails.
* **Censorship.** F3 and GossiPBFT are designed with censorship resistance in mind. The updated fork choice rule means that an adversary controlling more than ⅓ QAP can try to perform a censorship attack if honest participants start an instance of GossiPBFT proposing at least two distinct inputs. While this attack is theoretically possible, it is notably hard to perform on F3 given the QUALITY step of GossiPBFT and other mitigation strategies specifically put in place to protect against it. We strongly believe that, even against a majority adversary, the mitigations designed will prevent such an attack.
* **Liveness.** Implementing F3 introduces the risk that an adversary controlling at least ⅓ QAP prevents termination of a GossiPBFT instance. In that case, the F3 component would halt, not finalizing any tipset anymore. At the same time, EC would still operate, outputting tipsets and considering them final after 900 epochs. The liveness of the system is thus not affected by attacks on the liveness of F3.
* **Safety.** Implementing F3 ensures the safety of finalized outputs during regular or even congested networks against a Byzantine adversary controlling less than ⅓ QAP. For stronger adversaries, F3 provides mitigations to prevent censorship attacks, as outlined above. If deemed necessary, the punishment and recovery from coalitions in the event of an attempted attack on safety can be explored in future FIPs. Note that the (safe) finality time is already significantly improved by F3 compared to the status quo: F3 provides safety of finalized outputs two orders of magnitude faster than the current parameter of 900 epochs during regular network operation.
* **Denial-of-service (DoS).** The implementation of the F3 preserves resistance against DoS attacks currently ensured by Filecoin, thanks to the fully leaderless nature of GossiPBFT and to the use of a VRF to self-assign tickets during the CONVERGE step.
* **Committees.** This FIP proposes to have all participants run all instances of GossiPBFT. While this ensures optimal resilience against a Byzantine adversary, it can render the system unusable if the number of participants grows too large. While we are still evaluating the maximum practical number of participants in F3, it is expected to be at least one order of magnitude greater than the current number of participants in Filecoin. This introduces an attack vector: if the scalability limit is 100,000 participants, a participant holding as little as 3% of the current QAP could render the system unusable by performing a Sybil attack, splitting its power across identities that each hold the minimum QAP required per identity. As a result, the implementation should favor the messages of the more powerful participants if the number of participants grows too large. Given that favoring more powerful participants discriminates against the rest, affecting decentralization, amending F3 to use committees if the number of participants exceeds the practical limit will be the topic of a future FIP, as will the analysis of optimized message aggregation in the presence of self-selected committees.
## Incentive Considerations
Participating in GossiPBFT only entails verifying $O(n)$ signatures and generating $O(1)$ signatures per participant. If not enough participants follow the protocol, the liveness of F3 will be affected. This means that the service offered is affected, but also that participants do not receive rewards from block proposals for the period in which they do not participate. Consequently, we do not believe additional incentives for participation are necessary, as the modifications in this FIP significantly improve the system and the additional computational and communication costs do not substantially alter the cost structure of running a Filecoin node.
Furthermore, incentivizing all messages and their verification is impossible: this would require consensus on which messages have been sent (which would entail even more messages, themselves unverified). Nevertheless, subsequent FIPs can introduce further incentives, such as rewarding participants whose signatures are listed in agreed-upon PoFs, or slashing/denylisting participants who sign equivocating messages.
## Product Considerations
1. Applications built on FVM and IPC do not need to wait for 900 epochs and can instead benefit from fast finalization times within the order of tens of seconds.
2. Other applications built on Filecoin (e.g., Glif, Axelar, Wormhole, exchanges) can also provide fast finality and low latency without compromising security.
3. Safe, verifiable, fast bridging to other networks becomes possible.
## Implementation
We refer to the stand-alone implementation in a simulated environment in the [go-f3](https://github.com/filecoin-project/go-f3) repository. Work is ongoing on a reference lotus implementation (to be provided soon).
## TODO
- [x] Complete benchmarking around message complexity and computational/networking requirements.
- [x] Decide between implicit and explicit evidence, taking into account benchmark results.
- [ ] Specify power table delay mechanism, local message queue, bounds, etc
- [ ] Re-specify timing of new instances after EC epochs
- [ ] Specify power table commitment
- [ ] Finalize and incorporate the WIP finality exchange protocol.
- [ ] Update and move proofs of correctness and other supplemental documentation into `resources`
## Copyright
Copyright and related rights waived via [CC0](https://creativecommons.org/publicdomain/zero/1.0/).