-
-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
consensus/clique: replace static 1/2 difficulties with dynamic 1-n scale #166
Conversation
@@ -51,29 +52,27 @@ type Snapshot struct { | |||
|
|||
Number uint64 `json:"number"` // Block number where the snapshot was created | |||
Hash common.Hash `json:"hash"` // Block hash where the snapshot was created | |||
Signers map[common.Address]struct{} `json:"signers"` // Set of authorized signers at this moment | |||
Signers map[common.Address]uint64 `json:"signers"` // Each authorized signer at this moment and their most recently signed block |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This actually becomes simpler, since we drop Recents
, but it may also be incompatible and require a chain reset.
{signer: "A", voted: "C", auth: false}, | ||
{signer: "B", voted: "C", auth: false}, | ||
{signer: "A", voted: "B", auth: false}, | ||
{signer: "A", voted: "D", auth: true}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test was identical to the previous. This is my best guess at the original intention, based on the comment.
Tally: make(map[common.Address]Tally), | ||
} | ||
for _, signer := range signers { | ||
snap.Signers[signer] = struct{}{} | ||
snap.Signers[signer] = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
0
means 'no blocks signed' - we have to be sure to handle this specially and not interpret it as 'signed the genesis block', so that the initial n/2
blocks can be signed. I'm still not certain if it's important to assign distinct difficulties for those initial blocks, but it would be trivial to use the old algorithm.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have a unit test for this case?
consensus/clique/clique.go
Outdated
) | ||
|
||
const ( | ||
checkpointInterval = 1024 // Number of blocks after which to save the vote snapshot to the database | ||
inmemorySnapshots = 128 // Number of recent vote snapshots to keep in memory | ||
inmemorySignatures = 4096 // Number of recent block signatures to keep in memory | ||
|
||
wiggleTime = 500 * time.Millisecond // Random delay (per signer) to allow concurrent signers | ||
recentSignerDelay = 1 * time.Second // Full delay for most recent eligible signer. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Raised since we are doing fractions of this value, instead of multiples (though perhaps this is unwise).
consensus/clique/clique.go
Outdated
// A difficulty <= limit would be too recent; limit+1 is the most recent eligible signer. | ||
// So by subtracting limit, limit+1 becomes 1, which is a full delay. | ||
fraction := diff - limit | ||
delay = recentSignerDelay / time.Duration(fraction) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks off-by-one and I need to revisit, but the general idea is to order these delays by difficulty, since they were random before. This fractional solution deals with the the scaled difficulties nicely, since the delay asymptotically approaches 0. However, I'm thinking that the maximum delay should still be based on the number of signers (like before), so they don't get so crammed together as we scale up. I'm not really sure of the importance though, since the distinct difficulties should resolve conflicts immediately.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Duh, these only apply to diff < n
, which is then shifted, so the scale isn't really relevant. I will rework this. We can distribute at most n/2
linearly into the range used before or something like that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reworked into the much simpler: delay = time.Duration(n-diff) * wiggleTime
, which is a maximum delay of n/10
seconds for the most recent eligible signer (with the current wiggleTime
of 200ms).
) | ||
|
||
const ( | ||
checkpointInterval = 1024 // Number of blocks after which to save the vote snapshot to the database | ||
inmemorySnapshots = 128 // Number of recent vote snapshots to keep in memory | ||
inmemorySignatures = 4096 // Number of recent block signatures to keep in memory | ||
|
||
wiggleTime = 500 * time.Millisecond // Random delay (per signer) to allow concurrent signers | ||
wiggleTime = 200 * time.Millisecond // Delay step for out-of-turn signers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Restored to a multiplier, but with a reduced value since we have faster blocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall this lgtm. I added mostly stylistic comments.
consensus/clique/clique.go
Outdated
return errUnauthorized | ||
signed, authorized := snap.Signers[signer] | ||
if !authorized { | ||
return fmt.Errorf("%s not authorized to sign", signer.Hex()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you change signed
to something more clear? e.g. lastSignedBlockNumber
. Right now signed
seems like it would be a bool
.
consensus/clique/clique.go
Outdated
return nil, errUnauthorized | ||
signed, authorized := snap.Signers[signer] | ||
if !authorized { | ||
return nil, fmt.Errorf("%s not authorized to sign", signer.Hex()) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update name of signed
variable here too.
consensus/clique/snapshot.go
Outdated
if signed > 0 { | ||
limit := uint64(len(snap.Signers)/2 + 1) | ||
if next := limit + signed; number < next { | ||
return nil, fmt.Errorf("%s not authorized to sign %d: signed %d, next eligible signature %d", signer.Hex(), number, signed, next) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we move the next
calculation to snap
so it's not duplicated and clearer? e.g. NextSignableBlockNumber(lastSignedBlockNumber uint64) uint64
consensus/clique/snapshot_test.go
Outdated
for name, tt := range tests { | ||
t.Run(name, tt.run) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Converting tests
from a slice to a map
is going to make tests run in a different order every time. Not the biggest deal but another alternative would be to put a name string
in the tests
struct{}
definition.
I'm beginning to think that letting the difficulty scale up unbounded could be problematic, but also that we can work around it, and cap difficulty to |
I've been resisting assigning simple difficulties from I will update the OP. |
0415bcd
to
c9e765f
Compare
lgtm 👍 |
c9e765f
to
2dd2bac
Compare
I am in favour of this code, though, I need a way to fork my existing network into accepting any new block validation algorithm. |
@rlegene You can try this (arguably a bug in the client - background here) and this (less significant, just makes a random choice a little more deterministic) - they only adjust how to handle same-difficulty blocks, and don't modify the protocol at all, so existing clients can upgrade without a fork. |
DO NOT MERGE - REQUIRES RESET
This PR proposes replacing the static clique difficulties with dynamic, scaled values derived from the last signed block.
The existing clique consensus protocol uses two static difficulties, 2 for the 'in-turn' signer, and 1 for 'out-of-turn' signers. This prioritizes in-turn signing over out-of-turn. However, it does not distinguish between 'out-of-turn' signers, and 'in-turn' is based only on the block number, with no consideration of recent history. This leads to a few problems:
n
nodes are signing (x
are down), a smoothn-x
round-robin is likely not possible, because when a node fills in out-of-turn, it may make itself ineligible (too recent) for it's own next in-turn block, requiring another out-of-turn signature. This effect can cascade or repeat depending on chance and the least common multiple ofn
andn-x
.These problems can be avoided by using a distinct, dynamic, scaled difficulty, based on the last block signed by each signer. From CalcDifficulty:
The most recent
n/2
signers are ineligible to sign, so this produces difficulties fromn/2+1
ton
, inclusive, with the 'in-turn' signer always having difficultyn
. This has several benefits which solve or reduce the aforementioned problems:n
nodes are signing, a gracefuln-x
round-robin schedule will still be prioritized, with random out-of-turn signatures only shifting or reordering the schedule.