-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you need to increase the version of the runtime,
Co-Authored-By: rphmeier <rphmeier@gmail.com>
Co-Authored-By: rphmeier <rphmeier@gmail.com>
Co-Authored-By: rphmeier <rphmeier@gmail.com>
b51cd5a
to
b9246b0
Compare
@rphmeier @andresilva how is this looking? |
I think it's ready for another review. I didn't approve the PR because most of the changes since the last review have been made by me and I'd like a sign off from @rphmeier. |
I'll give it another review but not sure I can get to it this week. I'll try on the weekend or Monday. |
// This can happen after a forced change (triggered by the finality tracker when finality is stalled), since | ||
// the voter will be restarted at the median last finalized block, which can be lower than the local best | ||
// finalized block. | ||
warn!(target: "afg", "Safety violation detected, tried to finalize {:?} while the current best finalized is {:?}", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure it's a safety violation to re-finalize blocks on the canonical chain -- perhaps we can only warn when it does that?
|
||
new_set = Some((median_last_finalized, AuthoritySet { | ||
current_authorities: change.next_authorities.clone(), | ||
set_id: self.set_id + 1, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm yet in-progress of trying to understand this PR in details - so probably I'll discover the answer for my question later. But here is the question: is this possible that there will be two adjacent forced-change periods where set_id
has increased by 2 without any justification (i.e. set A#1
has changed to set B#2
AND set B#2
was also unable to finalize something AND it has been changed to set C#3
AND only C#3
has generated new justification with set_id=3
).
I'm currently relying on the fact that set_id
is always accompanied by justification in the #1669 (in this check to be specific). It isn't critical for that code - just wanted to clarify if this is still a rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I believe the situation you described can happen.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@svyatonik worth noting that the offline change boots out light clients, so you can still rely on that assumption as long as you handle offline changes by shutting down the light client.
A checkpointing system is the only way to get light clients back onto the network.
@@ -292,31 +436,37 @@ mod tests { | |||
|
|||
#[test] | |||
fn changes_iterated_in_pre_order() { | |||
// TODO: include forced change and make sure it's iterated last |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Guess this TODO is already solved (below in the same test).
When this PR is approved please don't merge it right away since I'm still doing some local testing. |
Contesting the patch label -- this is probably a bit too unstable to backport to earlier testnets safely. |
d435d9f
to
cbb959f
Compare
I removed the patch label but we should still backport this and disable the finality tracker module (otherwise it will be harder to backport new fixes for grandpa). |
Co-authored-by: André Silva <andre.beat@gmail.com> * skeleton for finality tracker * dispatch events when nothing finalized for a long time * begin integrating finality tracker into grandpa * add delay field to pending change * add has_api_with function to sr_version for querying APIs * partially integrate new force changes into grandpa * implement forced changes * get srml-grandpa compiling * Update core/finality-grandpa/src/authorities.rs Co-Authored-By: rphmeier <rphmeier@gmail.com> * Update core/finality-grandpa/src/authorities.rs Co-Authored-By: rphmeier <rphmeier@gmail.com> * Update core/finality-grandpa/src/authorities.rs Co-Authored-By: rphmeier <rphmeier@gmail.com> * remove explicit dependence on CoreApi * increase node runtime version * integrate grandpa forced changes into node runtime * add some tests to finality-tracker * integrate finality tracking into node-runtime * test forced-change logic * test forced changes in the authority-set handler * kill some unneeded bounds in client * test forced-changes in finality-grandpa and fix logic * build wasm and finality-tracker is no-std * restart voter on forced change * allow returning custom error type from lock_import_and_run * extract out most DB logic to aux_schema and use atomic client ops * unify authority set writing * implement set pausing * bump runtime version * note on DB when we pause. * core: grandpa: integrate forced changes with multiple pending standard changes * core: grandpa: fix AuthoritySet tests * runtime: bump impl_version * core: clear pending justification requests after forced change import * srml: finality-tracker: use FinalizedInherentData * core: log requests for clearing justification requests * core, node: update runtimes * core: grandpa: fix tests * core: grandpa: remove todos and add comments * core: grandpa: use has_api_with from ApiExt * core: fix tests * core: grandpa: remove unnecessary mut modifier * core: replace PostImportActions bitflags with struct * core: grandpa: restrict genesis on forced authority set change * core: grandpa: add more docs * core: grandpa: prevent safety violations in Environment::finalize_block * core: grandpa: register finality tracker inherent data provider * core: grandpa: fix tests * node: update runtime blobs * core: grandpa: remove outdated todo * core: aura: fix typo in log message * core: grandpa: check re-finalization is on canonical chain * srml: finality-tracker: fix initialization * node: update runtime wasm * srml: finality-tracker: don't re-initialize config keys
Depends on #1808
Motivation
GRANDPA voting is completely off-chain, so detecting offline validators and changing sets is difficult. Furthermore, changing validator sets is usually done in conjunction with finality -- when finality is stalled, how are we supposed to switch to a new set?
The answer is to fall back onto the block production system: GRANDPA is an asynchronously safe finality gadget, while the block production system is typically safe under synchrony. So if the block authors believe that nothing has been finalized for a sufficiently long time, we can trigger a "forced" set change that allows full nodes to begin to follow the new best set.
By nature, this forks off light clients (including warp sync), who won't be able to function again until aware of a trusted checkpoint after the forced change.
Abstract
Alistair described a scheme for figuring out when finality is stalled on-chain in the GRANDPA paper: https://github.com/w3f/consensus/blob/master/pdf/grandpa.pdf
To paraphrase:
Every block producer puts the highest block number that they see as finalised in their block.
Then any participant sees that if there is an
n
such thatn + 100
n-100
ton
in their best chain have median at mostn-1050
andn
is the minimum that satisfied 1 and 2.Implementation notes
We implement this by adding an
srml-finality-tracker
which adds an inherent for block authors to report the finalized block, and records a rolling window of reports, tracking the median.Whenever the median is sufficiently low (step 2 of above),
srml-finality-tracker
signals the listeners.srml-grandpa
is one such listener: when it is notified of finality lagging, it issues a new forced authority set-change event.forced set changes are applied on block import, unlike regular set changes which are applied on finality.
Forced set changes are also applied after a delay, but upon witnessing the signal the old voters should stop voting, if they were at all. This PR also implements "pausing" of voters.
We introduce an
aux_schema
module for handling persistent data in a consolidated way.TODO:
AuthoritySet
struct