This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

Slash and prove membership of prior sessions #2970

Merged · 35 commits · merged Jul 8, 2019

Conversation

@rphmeier (Contributor) commented Jun 27, 2019

Fixes #1810

Background

We want Substrate chains to have a property known as "accountable safety": that misbehavior leads to a slash of bonded funds. The srml-staking library keeps those funds bonded for a certain number of eras.

Storing all historical validator sets and session keys for the bonding period would be expensive. So instead we only keep a trie root which can be used to prove historical ownership of keys. These trie roots are pruned after some time.

When we slash a validator, we also want to slash their nominators. Furthermore, we want to slash only the nominators backing the validator at the time of the misbehavior, and only for the amount they had nominated on the misbehaving validator at that time.

Implementation

The staking module decides how many prior sessions we keep, but prior sessions themselves are managed by a session::historical module. Pruning is O(n) in the number of sessions to prune.
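
As a rough illustration of that pruning cost, here is a minimal sketch; the map of historical roots and the retained-range bookkeeping are stand-ins for the module's storage items, not the actual API:

use std::collections::BTreeMap;

type SessionIndex = u32;
type Hash = [u8; 32];

// Prune all historical session roots below `up_to`, clamped to the retained
// range `[start, end)`. One removal per pruned session: O(n) in sessions pruned.
fn prune_up_to(
    roots: &mut BTreeMap<SessionIndex, Hash>,
    range: &mut (SessionIndex, SessionIndex),
    up_to: SessionIndex,
) {
    let (start, end) = *range;
    let up_to = up_to.min(end);

    if up_to < start {
        return; // nothing retained below `up_to`; out of bounds is harmless.
    }

    for i in start..up_to {
        roots.remove(&i);
    }

    *range = (up_to, end);
}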

What the session::historical trie roots reference is a mapping of (KeyTypeId, Vec<u8>) -> FullIdentification. It is populated with all of the session keys of that session and the FullIdentification of the owning validator.

The FullIdentification in practice is a staking::Exposure object that tells us who to slash when this key has misbehaved. This lets us slash the correct group of people for the right amounts.
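
For reference, the data referenced by each historical root is roughly of the following shape; this is a sketch, with field names mirroring the srml-staking types of this era rather than exact definitions:

// The per-session trie maps (KeyTypeId, raw session key bytes) to the owning
// validator's `FullIdentification`; for staking-backed chains that is an `Exposure`.

struct IndividualExposure<AccountId, Balance> {
    who: AccountId,  // a nominator backing the validator
    value: Balance,  // the amount that nominator had at stake on this validator
}

struct Exposure<AccountId, Balance> {
    total: Balance,                                       // total stake backing the validator
    own: Balance,                                         // the validator's own stake
    others: Vec<IndividualExposure<AccountId, Balance>>,  // nominators and their stakes
}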

I moved session key deduplication to use raw storage APIs so we could make sure all trie keys were prefixed with a constant. This means that they no longer interfere with the growth of the rest of the trie.
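
The raw keys end up looking roughly like the sketch below: a constant prefix followed by (at most) 32 bytes of the hash of the encoded key. The prefix constant and the hasher parameter here are placeholders, not the module's actual values; encoding is via the parity-codec Encode trait:

use parity_codec::Encode;

// Hypothetical stand-in for the module's constant prefix.
const DEDUP_KEY_PREFIX: &[u8] = b":session:dedup:"; // illustrative value only

fn dedup_storage_key<K: Encode>(key: &K, hash: impl Fn(&[u8]) -> [u8; 32]) -> Vec<u8> {
    key.using_encoded(|encoded| {
        let mut out = Vec::with_capacity(DEDUP_KEY_PREFIX.len() + 32);
        // The constant prefix keeps all deduplication entries clustered together,
        // away from the growth of the rest of the trie.
        out.extend_from_slice(DEDUP_KEY_PREFIX);
        // At most 32 bytes taken from the hash of the encoded key.
        out.extend_from_slice(&hash(encoded));
        out
    })
}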

There is a generic trait KeyOwnerProofSystem for key ownership proofs. Consensus modules should use something along these lines (pseudocode):

trait Trait {
    // If your key type implements `AsRef<[u8]>` _and_ the `AsRef` value is the same as the
    // encoded value, then use that instead of `Vec<u8>`.
    type KeyOwnerSystem: KeyOwnerProofSystem<(KeyTypeId, Vec<u8>)>;
}

impl<T: Trait> Module<T> {
    // called off-chain.
    fn generate_misbehavior_report_call() -> MyMisbehaviorReportCall {
        // invoke `T::KeyOwnerSystem::prove((key_type_id, key_data))` and put the proof in the call.
    }

    // called on-chain.
    fn check_misbehavior(call: MyMisbehaviorReportCall) {
        let session_key = call.misbehaving_key;
        let proof = call.membership_proof;

        let to_punish = match T::KeyOwnerSystem::check_proof((key_type_id, session_key), proof) {
            None => return,
            Some(x) => x,
        };

        // check that the key really did misbehave.

        // if so, invoke something like `slashing::slash(to_punish, misbehavior_id)`.
    }
}
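
For context, the KeyOwnerProofSystem trait used above has roughly this shape; it is inferred from the usage in the pseudocode, so treat the associated-type and method names as a sketch rather than the exact definition:

trait KeyOwnerProofSystem<Key> {
    /// The proof of key ownership, e.g. a trie membership proof plus the session it refers to.
    type Proof;
    /// What a successful check returns, e.g. the validator id together with its
    /// `FullIdentification` (a `staking::Exposure` in practice).
    type IdentificationTuple;

    /// Prove ownership of `key` in some prior session. Called off-chain.
    fn prove(key: Key) -> Option<Self::Proof>;
    /// Check a proof on-chain and return who to hold responsible, if the proof is valid.
    fn check_proof(key: Key, proof: Self::Proof) -> Option<Self::IdentificationTuple>;
}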

@rphmeier added the A3-in_progress label (Pull request is in progress. No review needed at this stage.) on Jun 27, 2019
@rphmeier added the A0-please_review label (Pull request needs code review.) and removed the A3-in_progress and A4-gotissues labels on Jul 3, 2019
@rphmeier requested a review from marcio-diaz on July 3, 2019 17:06
@rphmeier mentioned this pull request on Jul 5, 2019
fn get_raw(&self, _: usize) -> &[u8] { unsafe { &std::mem::transmute::<_, &[u8; 8]>(&self.0)[..] } }
fn get<T: Decode>(&self, _: usize) -> Option<T> { self.0.using_encoded(|mut x| T::decode(&mut x)) }
fn get_raw(&self, _: KeyTypeId) -> &[u8] { unsafe {
    std::slice::from_raw_parts(&self.0 as *const _ as *const u8, std::mem::size_of::<u64>())
} }

Member:

Ohh :D
At least format it better:

	fn get_raw(&self, _: KeyTypeId) -> &[u8] { 
		unsafe {
			std::slice::from_raw_parts(&self.0 as *const _ as *const u8, std::mem::size_of::<u64>())
		}
	}

@@ -36,6 +36,8 @@ pub use node_codec::NodeCodec;
pub use trie_db::{Trie, TrieMut, DBValue, Recorder, Query};
/// Various re-exports from the `memory-db` crate.
pub use memory_db::{KeyFunction, prefixed_key};
/// Various re-exports from the `hash-db` crate.
pub use hash_db::{HashDB as HashDBT};

Member:

Suggested change:
- pub use hash_db::{HashDB as HashDBT};
+ pub use hash_db::HashDB as HashDBT;

<CachedObsolete<T>>::remove(&ending);

if let Some((new_validators, old_exposures))
= <I as OnSessionEnding<_, _>>::on_session_ending(ending, applied_at)

Member:

let (new_validators, old_exposures) = <I as OnSessionEnding<_, _>>::on_session_ending(ending, applied_at);

// every session from `ending+1 .. applied_at` now has obsolete `FullIdentification`
// now that a new validator election has occurred.
// we cache these in the trie until those sessions themselves end.
for obsolete in (ending + 1) .. applied_at {
 <CachedObsolete<T>>::insert(obsolete, &old_exposures);
}
			
Some(new_validators)

Maybe this is a little bit easier to read?

Contributor Author:

let (new_validators, old_exposures) = <I as OnSessionEnding<_, _>>::on_session_ending(ending, applied_at)

I assume you wanted a ? there?

Member:

Ohh, yes.

rphmeier and others added 3 commits July 5, 2019 12:34
Co-Authored-By: Bastian Köcher <bkchr@users.noreply.github.com>

@dvc94ch (Contributor) left a comment:

LGTM

@cheme (Contributor) left a comment:

I left a few safety-check remarks (I do not know all of this code well).
Regarding the trie machinery, the mechanism seems pretty heavy, but I do not see anything in this implementation that would make a future switch to child_trie impossible; it really depends on the size of this trie (if it only has a few entries, the system here seems fine).

In srml/session/src/lib.rs, I wonder whether the DEDUP_KEY-prefixed storage couldn't simply be a traditional SRML storage mapping (the code seems pretty close to what the macro produces).

fn get_raw(&self, _: usize) -> &[u8] { unsafe { &std::mem::transmute::<_, &[u8; 8]>(&self.0)[..] } }
fn get<T: Decode>(&self, _: usize) -> Option<T> { self.0.using_encoded(|mut x| T::decode(&mut x)) }
fn get_raw(&self, _: KeyTypeId) -> &[u8] { unsafe {
    std::slice::from_raw_parts(&self.0 as *const _ as *const u8, std::mem::size_of::<u64>())
} }

Contributor:

Can this lead to issues in a scenario where we get a combination of wasm execution and native execution in a test case (for a native environment that is not little-endian; wasm is little-endian, though I am not sure anymore)?
Or is get_raw only used locally within one of those executions?

Contributor Author (@rphmeier), Jul 5, 2019:

well, not any more so than the old code

impl OpaqueKeys for UintAuthorityId {
fn count() -> usize { 1 }
type KeyTypeIds = std::iter::Cloned<std::slice::Iter<'static, KeyTypeId>>;

Contributor:

I wonder if a non-cloned iterator is possible here and could be more flexible? (Totally not sure, just wondering.)

Contributor Author (@rphmeier), Jul 5, 2019:

we could do an iter::Once chain but I don't see why it matters. it should be fast regardless

Contributor:

yes it is just the key type id, please forget about that

Member:

Could be replaced by std::iter::Copied

{
let i = Inner::find_author(digests)?;
ensure!(
Self::key_owner(id, key).map_or(true, |owner| &owner == who),

Contributor:

Does the &owner == who condition mean that we allow exchanging keys (probably a very corner case)?

Contributor Author:

what it means is that we're allowed to set key A without changing key B.

validators.get(i as usize).map(|k| k.clone())
}
}
// ensure keys are without duplication.

Contributor:

I understand this code as: update ownership only if the key changes.

Contributor Author:

right, needs to be moved up a bit

fn dedup_trie_key<T: Trait, K: Encode>(key: &K) -> [u8; 32 + DEDUP_KEY_LEN] {
key.using_encoded(|s| {
// take at most 32 bytes from the hash of the value.
let hash = <T as system::Trait>::Hashing::hash(s);

Contributor:

Do we need to hash the key (are ValidatorIds unique, or KeyTypeIds unique)?

Contributor Author:

well, strictly speaking we don't have to hash the key, but it may make it harder to bias the trie

Contributor:

yes really depends on the nature of the keys.

// now that a new validator election has occurred.
// we cache these in the trie until those sessions themselves end.
for obsolete in (ending + 1) .. applied_at {
<CachedObsolete<T>>::insert(obsolete, &old_exposures);

Contributor:

this is pretty heavy redundancy over old_exposures, but I have no idea whether the case where there are multiple indices is frequent. If it is frequent, maybe an intermediate mapping 'session -> range' and 'range -> cached obsolete' could be good.

Contributor Author:

see other comment -- in practice here it shouldn't be an issue because (ending + 1) .. applied_at will have only 1 item in the current runtime. The range thing is a bit harder to GC.

/// Mapping from historical session indices to session-data root hash.
HistoricalSessions get(historical_root): map SessionIndex => Option<T::Hash>;
/// Queued full identifications for queued sessions whose validators have become obsolete.
CachedObsolete get(cached_obsolete): map SessionIndex

Contributor:

If I understand correctly, CachedObsolete needs to hold the session data required to prove something: it contains the key/value pairs needed to rebuild the trie, either for building a proof (which may also use the current validators) or for verifying one.
So, as in my other comments, I expect a CachedObsolete entry to be rather big in content.

Contributor Author:

it should really only ever hold one or zero items, since we buffer at most 1 validator set in the current implementation. it only ever needs to hold something when we've done a validator election

@rphmeier (Contributor Author) commented Jul 5, 2019:

I do not see anything in this implementation that would make a future switch to child_trie impossible; it really depends on the size of this trie (if it only has a few entries, the system here seems fine).

we need trie proofs for child trie, basically. other than that we could migrate it over.

@cheme (Contributor) commented Jul 5, 2019:

Does https://github.com/paritytech/substrate/pull/2209/files#r300701049 address this need (before that change the proof used a parent + child query; after that PR the proof will be a parent proof and a child proof, in two steps)?

@rphmeier (Contributor Author) commented Jul 5, 2019:

I don't think it helps really. We'd need a method to generate merkle proofs of a child storage within the runtime.

@cheme (Contributor) commented Jul 5, 2019:

Oh yes, this does not create the proof from within the runtime; it would need some ext function, so certainly more work than it seems. It does not seem doable if we need to generate the proof for the current (uncommitted) block.

@rphmeier (Contributor Author) commented Jul 5, 2019:

It does not seem doable if we need to generate the proof for the current (uncommitted) block.

well, it would have to force a root computation. and we would then kill the child trie right after. but TBH the MemoryDB approach isn't that bad and will scale a little.

@andresilva (Contributor) left a comment:

changes lgtm, but I'm not very familiar with these modules.

let up_to = rstd::cmp::min(up_to, end);

if up_to < start {
return // out of bounds. harmless.

Contributor:

Suggested change:
- return // out of bounds. harmless.
+ return; // out of bounds. harmless.

Contributor Author:

it's fine to omit, no?

Contributor:

yeah nitpick, I just like to have statements terminate with ;

@gavofyork (Member):

compile error should go once #2819 is merged.

Labels: A0-please_review (Pull request needs code review.)
Projects: none
Development: successfully merging this pull request may close the issue "Slash for previous eras".
8 participants