-
Notifications
You must be signed in to change notification settings - Fork 2.6k
Conversation
/// applied in the runtime after those N blocks have passed. | ||
/// | ||
/// The consensus protocol will coordinate the handoff externally. | ||
pub trait Api<B: BlockT> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a convention to use a CAPS_CASE
version of the ApiTraitName
as an ApiId
("API identifier") that allows for extensible runtime versioning for each API:
pub mod id {
use super::ApiId;
pub const BLOCK_BUILDER: ApiId = *b"blkbuild";
pub const TAGGED_TRANSACTION_QUEUE: ApiId = *b"validatx";
pub const METADATA: ApiId = *b"metadata";
}
decl_apis! {
pub trait Core<Block: BlockT, AuthorityId> { ... }
pub trait Metadata<Data> { ... }
pub trait TaggedTransactionQueue<Block: BlockT> { ... }
pub trait BlockBuilder<Block: BlockT> ExtraClientSide <OverlayedChanges> { ... }
}
A terse name like Api
makes this impractical or error prone. I would suggest Grandpa
/GrandpaInterface
or some other such name to prevent clashes and confusion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some documentation on the decl_apis!
macro would be helpful; there is no way to know about this convention otherwise.
On the naming, I intended the name to be used like this:
extern crate grandpa_primitives as grandpa;
fn uses_api<T: grandpa::Api>(T) { }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gave this a first look (along with grandpa and the rest of the substrate grandpa code) and left some minor comments.
pub(crate) struct Status<H, N> { | ||
/// Whether internal changes were made. | ||
pub(crate) changed: bool, | ||
/// `Some` when underlying authority set has changed, containign the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
containing
change.effective_number(), | ||
change.canon_height.clone(), | ||
)) | ||
.unwrap_or_else(|i| i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to signal multiple changes on the same block? Because in this case the last change to be signaled would take precedence.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no, only once per block
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rphmeier are we ensuring that anywhere or is that just by convention? I couldn't find anything (in this PR) that rejects a change at a certain block if a change is already scheduled for that block. Maybe we should add such a restriction to prevent undefined behavior? (e.g. we created a change block at 10+30 and then one at 20+20 - which one is supposed to happen at block 40?)
} | ||
// Ok(None) can be returned when `block` is after `limit`. That might cause issues. | ||
// might be better to return the header itself in this (rare) case. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When block
is beyond limit
can't we vote on the limit
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's not a chain containing block
, so it wouldn't be correct under the contract of the function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and this would also only happen if honest authorities are not voting according to this same rule
/// Status of the set after changes were applied. | ||
pub(crate) struct Status<H, N> { | ||
/// Whether internal changes were made. | ||
pub(crate) changed: bool, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't we remove this and use the Option
below to track whether changes ocurred? The only case where changed
is true but new_set_block
is None
is when the block imported finalizes a change at a given height which is not part of the canon chain, the change is discarded but the authority set remains unchanged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if internal changes are made (such as pruning a non-canonicalized pending transition), we have to write to disk for correctness.
core/finality-grandpa/src/lib.rs
Outdated
{ | ||
self.call_api_at( | ||
&BlockId::hash(header.parent_hash().clone()), | ||
::fg_primitives::PENDING_CHANGE_CALL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fg_primitives::PENDING_CHANGE_CALL,
Also, do we need to filter prevotes and precommits for blocks after the first pending change limit? Honest voters shouldn't be casting those votes/commits. |
We could do this filtering but I am assuming that since honest voters won't cast such votes that it is unnecessary. |
core/client/src/client.rs
Outdated
@@ -911,7 +916,7 @@ impl<B, E, Block, RA> CallApiAt<Block> for Client<B, E, Block, RA> where | |||
B: backend::Backend<Block, Blake2Hasher>, | |||
E: CallExecutor<Block, Blake2Hasher> + Clone + Send + Sync, | |||
Block: BlockT<Hash=H256>, | |||
RA: CoreAPI<Block> | |||
RA: Send + Sync, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RA: Send + Sync, | |
RA: CoreAPI<Block>, |
This should not be changed!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why? everything compiles without?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it compiles without it, but it enforces that the CoreAPI
is implemented. The CoreAPI
is required to be implemented by each runtime. It internally also uses a function from the CoreAPI
, the initialise_block
function. This function is currently not called by the corresponding trait function, but putting the CoreAPI
as an requirement, makes sure the function is present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, i will re-introduce and add a comment to that effect.
|
||
/// Call the given api function with strong arguments at the given block | ||
/// and returns the decoded result. | ||
fn call_api_at_strong<In: Encode, Out: Decode>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This trait should just be used by the runtime api stuff and that does not require this function. Please remove it.
/// applied in the runtime after those N blocks have passed. | ||
/// | ||
/// The consensus protocol will coordinate the handoff externally. | ||
pub trait GrandpaApi<B: BlockT> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This declaration should be placed into the decl_runtime_apis!
macro.
When using the macro, please rename B: BlockT
to Block: BlockT
. That is currently a restriction of the macro.
core/finality-grandpa/src/lib.rs
Outdated
B: Backend<Block, Blake2Hasher> + 'static, | ||
E: CallExecutor<Block, Blake2Hasher> + 'static + Clone + Send + Sync, | ||
DigestFor<Block>: Encode, | ||
RA: Send + Sync, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RA: Send + Sync, | |
RA: GrandpaApi<Block>, |
core/finality-grandpa/src/lib.rs
Outdated
fn genesis_authorities(&self) -> Result<Vec<(AuthorityId, u64)>, ClientError> { | ||
use runtime_primitives::traits::Zero; | ||
|
||
self.call_api_at_strong( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be changed to self.runtime_api().grandpa_authorities(&BlockId::Number(NumberFor::<Block>::zero())
.
core/finality-grandpa/src/lib.rs
Outdated
fn scheduled_change(&self, header: &Block::Header) | ||
-> Result<Option<ScheduledChange<NumberFor<Block>>>, ClientError> | ||
{ | ||
self.call_api_at_strong( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be changed as above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Merge conflict needs resolving. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just some minor wording things.
@@ -0,0 +1,21 @@ | |||
[package] | |||
name = "substrate-fg-primitives" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we please use substrate-finality-grandpa-primitives
as the crate name? I know it is verbose, but fg
isn't really a commonly used shorthand (other than sr-
for e.g.) and it makes it harder to understand what the crate is about when included somewhere (extern crate substrate_finality_grandpa_primitves
is a lot clearer than extern crate substrate_fg_primitives
) . Also matching path and crate name is a helpful convention we use everywhere else...
use client::runtime_api::ApiId; | ||
|
||
/// ApiId for the GrandpaApi trait. | ||
pub const GRANDPA_API: ApiId = *b"fgrandpa"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any particular reason this has a prefix f
the other defined names don't have?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
to make it 8 characters: "finality-grandpa"
change.effective_number(), | ||
change.canon_height.clone(), | ||
)) | ||
.unwrap_or_else(|i| i); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rphmeier are we ensuring that anywhere or is that just by convention? I couldn't find anything (in this PR) that rejects a change at a certain block if a change is already scheduled for that block. Maybe we should add such a restriction to prevent undefined behavior? (e.g. we created a change block at 10+30 and then one at 20+20 - which one is supposed to happen at block 40?)
/// Apply or prune any pending transitions. Provide a closure that can be used to check for the | ||
/// finalized block with given number. | ||
/// | ||
/// When the set has changed, the return value will be `Ok(Some((H, N)))` which is the cnaonical |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*canonical
}; | ||
|
||
if let Err(e) = write_result { | ||
warn!(target: "finality", "Failed to write updated authority set to disk. Bailing."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so we have the trace-target "finality" and "afg" in the same struct. Maybe we want to stick with one?
|
||
if let Err(e) = write_result { | ||
warn!(target: "finality", "Failed to write updated authority set to disk. Bailing."); | ||
warn!(target: "finality", "Node is in a potentially inconsistent state."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about including the error in this warn already?
* Fix building master * Fix westend chainspec
Closes #875
The rough idea behind changing authority sets in GRANDPA is that at some point, we obtain agreement for some maximum block height that the current set can finalize, and once a block with that height is finalized the next set will pick up finalization from there.
Technically speaking, this would be implemented as a voting rule which says, "if there is a signal for a change in N blocks in block B, only vote on chains with length
NUM(B) + N
if they contain B". This conditional-inclusion logic is complex to compute because it requires looking arbitrarily far back in the chain.Instead, we keep track of a list of all signals we've seen so far, sorted ascending by the block number they would be applied at. We never vote on chains with number higher than the earliest handoff block number (this is
num(signal) + N
). When finalizing a block, we either apply or prune any signaled changes based on whether the signaling block is included in the newly-finalized chain.Implementation details:
This PR adds a
BlockImport
wrapper which queries the runtime to see if the digest contains any signals of set changes. All blocks imported must go through this wrapper for correctness.When voting on blocks, the voters won't vote beyond the height of the lowest pending change.
When finalizing blocks, we check to see if any changes are meant to be applied. If the change is applied, we restart the voter with the new authority set by exiting from the current voter early with an
ExitOrError
enum. The voter is wrapped in afuture::loop_fn
which starts a new voter on intentional exit, using the new authority set.Remaining work for this PR: