Offline fallback for GRANDPA #1619

rphmeier · 2019-01-30T03:07:08Z

Depends on #1808

Motivation

GRANDPA voting is completely off-chain, so detecting offline validators and changing sets is difficult. Furthermore, changing validator sets is usually done in conjunction with finality -- when finality is stalled, how are we supposed to switch to a new set?

The answer is to fall back onto the block production system: GRANDPA is an asynchronously safe finality gadget, while the block production system is typically safe under synchrony. So if the block authors believe that nothing has been finalized for a sufficiently long time, we can trigger a "forced" set change that allows full nodes to begin to follow the new best set.

By nature, this forks off light clients (including warp sync), which won't be able to function again until they learn of a trusted checkpoint after the forced change.

Abstract

In the GRANDPA paper, Alistair described a scheme for detecting on-chain when finality has stalled: https://github.com/w3f/consensus/blob/master/pdf/grandpa.pdf

To paraphrase:

Every block producer puts the highest block number that they see as finalised in their block.

Any participant then considers finality stalled if there is an n such that

1. their best chain is at least length n + 100,
2. the indicators of the last finalised block height of blocks n-100 to n in their best chain have median at most n-1050, and
3. n is the minimum that satisfies 1 and 2.
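For illustration, the conditions above can be sketched as a plain function over the per-block reports. This is not code from the PR: the names `stall_point`, `WINDOW` and `LAG` are hypothetical, the slice index is assumed to equal the block number on the best chain, and the upper median is assumed for even-length windows.

```rust
/// Number of blocks before `n` whose reports are inspected (value from the paraphrase).
const WINDOW: u64 = 100;
/// How far the median report must lag behind `n` (value from the paraphrase).
const LAG: u64 = 1050;

/// Median of a non-empty slice. Taking the upper median is an assumption.
fn median(values: &[u64]) -> u64 {
    let mut sorted = values.to_vec();
    sorted.sort_unstable();
    sorted[sorted.len() / 2]
}

/// `reports[i]` is the finalized height reported by the author of block `i`
/// on the best chain (assumed indexing). Returns the minimum `n` for which
/// conditions 1 and 2 both hold, or `None` if there is no such `n`.
fn stall_point(reports: &[u64]) -> Option<u64> {
    let chain_len = reports.len() as u64;
    // `n - LAG` and `n - WINDOW` must not underflow, so start at the larger of the two.
    let start = WINDOW.max(LAG);
    (start..)
        // Condition 1: the best chain is at least length n + WINDOW.
        .take_while(|n| n + WINDOW <= chain_len)
        // Condition 2: the median report over blocks n-WINDOW..=n is at most n - LAG.
        // `find` returns the first match, which gives condition 3 (minimality).
        .find(|&n| {
            let window = &reports[(n - WINDOW) as usize..=n as usize];
            median(window) <= n - LAG
        })
}
```

With every author reporting 0 on a chain of 1200 blocks, `stall_point` returns `Some(1050)`; on a chain where each author reports a height two blocks behind their own, it returns `None`.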

Implementation notes

We implement this by adding an srml-finality-tracker which adds an inherent for block authors to report the finalized block, and records a rolling window of reports, tracking the median.

Whenever the median is sufficiently low (step 2 of above), srml-finality-tracker signals the listeners.

srml-grandpa is one such listener: when it is notified of finality lagging, it issues a new forced authority set-change event.
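A minimal sketch of the tracker/listener relationship described above, written as ordinary Rust rather than as a runtime module. The trait `StallListener`, the struct `FinalityTracker`, its fields and the `Recorder` listener are all hypothetical names for illustration and do not mirror the actual srml-finality-tracker API.

```rust
use std::collections::VecDeque;

/// Hypothetical listener interface; the real module's trait may differ.
trait StallListener {
    fn on_stalled(&mut self, median: u64);
}

/// Keeps a rolling window of the finalized heights reported by block authors.
struct FinalityTracker {
    window_size: usize,
    /// How far the median may lag behind the current block before we signal.
    report_latency: u64,
    recent: VecDeque<u64>,
}

impl FinalityTracker {
    fn new(window_size: usize, report_latency: u64) -> Self {
        FinalityTracker { window_size, report_latency, recent: VecDeque::new() }
    }

    /// Record the report carried by block `number` and notify the listener
    /// if the median of the full window lags too far behind.
    fn note_report<L: StallListener>(&mut self, number: u64, finalized: u64, listener: &mut L) {
        if self.recent.len() == self.window_size {
            self.recent.pop_front();
        }
        self.recent.push_back(finalized);
        if self.recent.len() < self.window_size {
            return; // not enough reports yet
        }
        let mut sorted: Vec<u64> = self.recent.iter().copied().collect();
        sorted.sort_unstable();
        let median = sorted[sorted.len() / 2];
        if median + self.report_latency <= number {
            listener.on_stalled(median);
        }
    }
}

/// Stand-in for srml-grandpa: it only remembers the medians it was told about.
#[derive(Default)]
struct Recorder {
    signals: Vec<u64>,
}

impl StallListener for Recorder {
    fn on_stalled(&mut self, median: u64) {
        self.signals.push(median);
    }
}
```

In this sketch the listener is passed in explicitly on each call; a runtime module would more likely fix it through configuration, but that wiring is outside what the description above specifies.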

Forced set changes are applied on block import, unlike regular set changes, which are applied on finality.

Forced set changes are also applied after a delay, but upon witnessing the signal the old voters should stop voting, if they were at all. This PR also implements "pausing" of voters.
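To make the difference between the two kinds of change concrete, here is a simplified sketch. It is an assumption-laden model, not the PR's `AuthoritySet`: it tracks block numbers only (no hashes or forks), treats pausing as a single flag that is set when a forced change is signalled and cleared when it is applied, and every name (`ChangeKind`, `PendingChange`, `note_signal`, `on_import`, `on_finalized`) is hypothetical.

```rust
/// Whether a change waits for finality or is applied on import.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ChangeKind {
    Standard,
    Forced,
}

struct PendingChange {
    kind: ChangeKind,
    /// Block in which the change was signalled.
    signal_block: u64,
    /// Number of blocks to wait after the signal.
    delay: u64,
    next_authorities: Vec<&'static str>,
}

impl PendingChange {
    fn effective_number(&self) -> u64 {
        self.signal_block + self.delay
    }
}

struct AuthoritySet {
    set_id: u64,
    current_authorities: Vec<&'static str>,
    pending: Vec<PendingChange>,
    /// True while the current voters should refrain from voting.
    paused: bool,
}

impl AuthoritySet {
    fn new(authorities: Vec<&'static str>) -> Self {
        AuthoritySet { set_id: 0, current_authorities: authorities, pending: Vec::new(), paused: false }
    }

    /// Record a signalled change; a forced change pauses the old voters at once.
    fn note_signal(&mut self, change: PendingChange) {
        if change.kind == ChangeKind::Forced {
            self.paused = true;
        }
        self.pending.push(change);
    }

    /// Forced changes take effect when a block at or past their effective number is imported.
    fn on_import(&mut self, number: u64) -> bool {
        self.apply(ChangeKind::Forced, number)
    }

    /// Standard changes take effect only once such a block is finalized.
    fn on_finalized(&mut self, number: u64) -> bool {
        self.apply(ChangeKind::Standard, number)
    }

    fn apply(&mut self, kind: ChangeKind, number: u64) -> bool {
        let due = self
            .pending
            .iter()
            .position(|c| c.kind == kind && c.effective_number() <= number);
        match due {
            Some(index) => {
                let change = self.pending.remove(index);
                self.current_authorities = change.next_authorities;
                self.set_id += 1;
                if kind == ChangeKind::Forced {
                    self.paused = false; // the new set starts voting
                }
                true
            }
            None => false,
        }
    }
}
```

A forced change signalled at block 10 with a delay of 5 pauses voting immediately, is ignored by `on_finalized`, and is applied by `on_import(15)`.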

We introduce an aux_schema module for handling persistent data in a consolidated way.

TODO:

Add tests for finality-tracker
Add tests for srml-grandpa
Add tests for finality-grandpa AuthoritySet struct
Add tests for finality-grandpa's handling of forced changes
Determine whether this PR is breaksconsensus or breaksapi -- I have been careful to keep storage and APIs backwards compatible, but a careful review of the runtime by @gavofyork or @bkchr to determine this would be appreciated.

bkchr

I think you need to increase the version of the runtime.

core/finality-grandpa/src/authorities.rs

core/finality-grandpa/src/lib.rs

gavofyork · 2019-02-27T17:55:44Z

@rphmeier @andresilva how is this looking?

much changed

andresilva · 2019-02-27T18:00:02Z

I think it's ready for another review. I didn't approve the PR because most of the changes since the last review have been made by me and I'd like a sign off from @rphmeier.

rphmeier · 2019-02-27T20:31:04Z

I'll give it another review but not sure I can get to it this week. I'll try on the weekend or Monday.

rphmeier · 2019-02-28T18:24:24Z

core/finality-grandpa/src/environment.rs

+			// This can happen after a forced change (triggered by the finality tracker when finality is stalled), since
+			// the voter will be restarted at the median last finalized block, which can be lower than the local best
+			// finalized block.
+			warn!(target: "afg", "Safety violation detected, tried to finalize {:?} while the current best finalized is {:?}",


I'm not sure it's a safety violation to re-finalize blocks on the canonical chain -- perhaps we can only warn when it does that?
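One way to express the distinction raised here, as a sketch only: re-finalizing a block at or below the best finalized height is treated as harmless when it matches the canonical chain and as a violation otherwise. The function `classify_finalization`, the `FinalizeOutcome` enum and the map from block number to hash are hypothetical and are not taken from environment.rs.

```rust
use std::collections::HashMap;

#[derive(Debug, PartialEq)]
enum FinalizeOutcome {
    /// The block is above the best finalized block: finalize as usual.
    Finalize,
    /// The block is at or below the best finalized block and on the canonical chain.
    AlreadyFinalized,
    /// The block is at or below the best finalized block but not canonical.
    SafetyViolation,
}

/// `canonical` maps block numbers to the hash on the canonical chain (assumed lookup).
fn classify_finalization(
    canonical: &HashMap<u64, &str>,
    best_finalized: u64,
    number: u64,
    hash: &str,
) -> FinalizeOutcome {
    if number > best_finalized {
        return FinalizeOutcome::Finalize;
    }
    match canonical.get(&number) {
        Some(known) if *known == hash => FinalizeOutcome::AlreadyFinalized,
        _ => FinalizeOutcome::SafetyViolation,
    }
}
```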

svyatonik · 2019-03-01T08:09:37Z

core/finality-grandpa/src/authorities.rs

+
+				new_set = Some((median_last_finalized, AuthoritySet {
+					current_authorities: change.next_authorities.clone(),
+					set_id: self.set_id + 1,


I'm still in the process of trying to understand this PR in detail, so I'll probably discover the answer to my question later. But here is the question: is it possible that there will be two adjacent forced-change periods where set_id has increased by 2 without any justification (i.e. set A#1 is changed to set B#2, set B#2 is also unable to finalize anything and is changed to set C#3, and only C#3 generates a new justification with set_id=3)?

I'm currently relying on the fact that set_id is always accompanied by a justification in #1669 (in this check, to be specific). It isn't critical for that code; I just wanted to clarify whether this is still a rule.

Yes, I believe the situation you described can happen.

@svyatonik worth noting that the offline change boots out light clients, so you can still rely on that assumption as long as you handle offline changes by shutting down the light client.

A checkpointing system is the only way to get light clients back onto the network.

svyatonik · 2019-03-01T08:14:31Z

core/finality-grandpa/src/authorities.rs

@@ -292,31 +436,37 @@ mod tests {

 	#[test]
 	fn changes_iterated_in_pre_order() {
+		// TODO: include forced change and make sure it's iterated last


Guess this TODO is already solved (below in the same test).

andresilva · 2019-03-01T12:35:47Z

When this PR is approved please don't merge it right away since I'm still doing some local testing.

rphmeier · 2019-03-01T18:26:59Z

Contesting the patch label -- this is probably a bit too unstable to backport to earlier testnets safely.

andresilva · 2019-03-05T15:47:52Z

I removed the patch label but we should still backport this and disable the finality tracker module (otherwise it will be harder to backport new fixes for grandpa).

Co-authored-by: André Silva <andre.beat@gmail.com>
* skeleton for finality tracker
* dispatch events when nothing finalized for a long time
* begin integrating finality tracker into grandpa
* add delay field to pending change
* add has_api_with function to sr_version for querying APIs
* partially integrate new force changes into grandpa
* implement forced changes
* get srml-grandpa compiling
* Update core/finality-grandpa/src/authorities.rs Co-Authored-By: rphmeier <rphmeier@gmail.com>
* Update core/finality-grandpa/src/authorities.rs Co-Authored-By: rphmeier <rphmeier@gmail.com>
* Update core/finality-grandpa/src/authorities.rs Co-Authored-By: rphmeier <rphmeier@gmail.com>
* remove explicit dependence on CoreApi
* increase node runtime version
* integrate grandpa forced changes into node runtime
* add some tests to finality-tracker
* integrate finality tracking into node-runtime
* test forced-change logic
* test forced changes in the authority-set handler
* kill some unneeded bounds in client
* test forced-changes in finality-grandpa and fix logic
* build wasm and finality-tracker is no-std
* restart voter on forced change
* allow returning custom error type from lock_import_and_run
* extract out most DB logic to aux_schema and use atomic client ops
* unify authority set writing
* implement set pausing
* bump runtime version
* note on DB when we pause.
* core: grandpa: integrate forced changes with multiple pending standard changes
* core: grandpa: fix AuthoritySet tests
* runtime: bump impl_version
* core: clear pending justification requests after forced change import
* srml: finality-tracker: use FinalizedInherentData
* core: log requests for clearing justification requests
* core, node: update runtimes
* core: grandpa: fix tests
* core: grandpa: remove todos and add comments
* core: grandpa: use has_api_with from ApiExt
* core: fix tests
* core: grandpa: remove unnecessary mut modifier
* core: replace PostImportActions bitflags with struct
* core: grandpa: restrict genesis on forced authority set change
* core: grandpa: add more docs
* core: grandpa: prevent safety violations in Environment::finalize_block
* core: grandpa: register finality tracker inherent data provider
* core: grandpa: fix tests
* node: update runtime blobs
* core: grandpa: remove outdated todo
* core: aura: fix typo in log message
* core: grandpa: check re-finalization is on canonical chain
* srml: finality-tracker: fix initialization
* node: update runtime wasm
* srml: finality-tracker: don't re-initialize config keys

rphmeier added 8 commits January 29, 2019 14:56

skeleton for finality tracker

8f9808a

dispatch events when nothing finalized for a long time

4e28e99

begin integrating finality tracker into grandpa

37af080

add delay field to pending change

1185041

add has_api_with function to sr_version for querying APIs

1a48669

partially integrate new force changes into grandpa

f84de0e

implement forced changes

5b53bea

get srml-grandpa compiling

db20c36

rphmeier added A3-in_progress Pull request is in progress. No review needed at this stage. B0-patchthis labels Jan 30, 2019

rphmeier requested review from andresilva and gavofyork January 30, 2019 03:07

bkchr reviewed Jan 30, 2019

bkchr and others added 8 commits January 30, 2019 11:38

Update core/finality-grandpa/src/authorities.rs

9df9464

Co-Authored-By: rphmeier <rphmeier@gmail.com>

Update core/finality-grandpa/src/authorities.rs

cc5d70b

Co-Authored-By: rphmeier <rphmeier@gmail.com>

Update core/finality-grandpa/src/authorities.rs

3e78042

Co-Authored-By: rphmeier <rphmeier@gmail.com>

remove explicit dependence on CoreApi

68c94bf

increase node runtime version

1fbb26c

integrate grandpa forced changes into node runtime

ecc489b

add some tests to finality-tracker

a75d5fd

integrate finality tracking into node-runtime

b9246b0

rphmeier force-pushed the rh-grandpa-offline-fallback branch from b51cd5a to b9246b0 Compare January 30, 2019 16:52

Merge branch 'master' into rh-grandpa-offline-fallback

36dfdf7

rphmeier mentioned this pull request Jan 30, 2019

Transaction signing payload is hashed first #1624

Merged

andresilva mentioned this pull request Jan 31, 2019

Support multiple pending authority set changes #1497

Closed

rphmeier added 5 commits January 31, 2019 14:01

test forced-change logic

619c7d4

test forced changes in the authority-set handler

5e0a57a

kill some unneeded bounds in client

81ad7d7

test forced-changes in finality-grandpa and fix logic

a631a8d

Merge branch 'master' into rh-grandpa-offline-fallback: bump node version again.

1ce7d40

andresilva added 2 commits February 26, 2019 18:31

Merge branch 'master' into rh-grandpa-offline-fallback

0976224

node: update runtime blobs

de63d65

andresilva added A0-please_review Pull request needs code review. and removed A4-gotissues labels Feb 26, 2019

rphmeier commented Feb 28, 2019

svyatonik reviewed Mar 1, 2019

andresilva added 4 commits March 1, 2019 11:36

core: grandpa: remove outdated todo

b5df296

core: aura: fix typo in log message

9799356

core: grandpa: check re-finalization is on canonical chain

fbd995b

Merge branch 'master' into rh-grandpa-offline-fallback

815410d

rphmeier mentioned this pull request Mar 1, 2019

Optimize finality tracker to handle much longer period #1908

Closed

andresilva removed the B0-patchthis label Mar 5, 2019

andresilva added 4 commits March 5, 2019 14:53

srml: finality-tracker: fix initialization

d034c36

Merge branch 'master' into rh-grandpa-offline-fallback

a391918

node: update runtime wasm

93b98f7

srml: finality-tracker: don't re-initialize config keys

cbb959f

andresilva force-pushed the rh-grandpa-offline-fallback branch from d435d9f to cbb959f Compare March 5, 2019 15:28

andresilva approved these changes Mar 5, 2019

andresilva merged commit 4399e57 into master Mar 5, 2019

andresilva deleted the rh-grandpa-offline-fallback branch March 18, 2019 23:12

andresilva mentioned this pull request Sep 26, 2022

Member request polkadot-fellows/seeding#5

Merged
