Skip to content
This repository has been archived by the owner on Mar 24, 2023. It is now read-only.

Add state representation #32

Merged
merged 49 commits into from
Jun 16, 2020
Merged

Add state representation #32

merged 49 commits into from
Jun 16, 2020

Conversation

@adlerjohn adlerjohn added documentation Improvements or additions to documentation enhancement New feature or request labels May 19, 2020
@adlerjohn adlerjohn added this to the Pre-implementation draft milestone May 19, 2020
@adlerjohn adlerjohn self-assigned this May 19, 2020
| -------------------- | ------------------- | ---------------------------------------------------------------------------------------- |
| `balance` | `uint64` | Coin balance. |
| `isDelegating` | `bool` | Whether this account is delegating its stake or not. |
| `delegatedValidator` | [Address](#address) | _Optional._ The validator this is account is delegating to. |
Copy link
Member

@liamsi liamsi May 19, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why can't accounts delegate to several validators? Also maybe:

Suggested change
| `delegatedValidator` | [Address](#address) | _Optional._ The validator this is account is delegating to. |
| `delegatedValidator` | [Address](#address) | _Optional._ The validator this account is delegating to. |

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because that would add complexity (you'd have to hold a variable number of validators and the number of coins each of them is delegated), and if someone wants to delegate their coins to two validators they can just...split up their coins into two accounts.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scheme (which isn't obvious from the data structures, but will be written in the consensus rules) is: "all the coins in an account are either delegated or not; if they're delegated they must be delegated to a single validator."

Copy link
Member

@liamsi liamsi May 20, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a user (holding coins) and a UX perspective, splitting up coins into separate accounts whenever you want to delegate to more than one validator also adds complexity, but one the human (not on the software).

Note that in the few delegations based PoS systems that launched, you can usually delegate to one or more validators. This is also true in Cosmos / the cosmos-sdk. Most users I know delegate to several validators to keep the network decentralized (and to hedge their risks).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I don't see why this can't be abstracted away at the application layer, rather than having to be pushed to the user layer. Moving a portion of your coins to a new address then delegating them (as well as the reverse) can be done just as seamlessly in a wallet---one button press. Users don't even have to know it's happening! It's basically UTXO management; most Bitcoin wallets don't present to lay users each individual UTXO, just their total balance.

Allowing an arbitrary number of delegations per account means fraud proofs are more complex and expensive.

That being said, given that we're using the accounts data model for now, one downside of essentially emulating UTXOs for delegating stake is that the nonce of empty accounts still needs to be kept around forever (which reminds me, a nonce field needs to be added).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that it would actually be more intuitive from a UX perspective to require users to split their coins into different accounts to delegate to different validators. It's the logical purpose of accounts - the same way you might have multiple savings accounts for different purposes.

It seems more complicated from a UX perspective to have users enter each validator they want to delegate to and the amount in some kind of table.

Copy link
Member

@liamsi liamsi Jun 10, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the logical purpose of accounts - the same way you might have multiple savings accounts for different purposes.

But here you'd have multiple accounts for one purpose: namely delegating to validators. Having a different account per validator would be like opening a broker account per same type of investment (or one "account" per stock or bond you buy).

rationale/distributing_rewards.md Outdated Show resolved Hide resolved
specs/consensus.md Outdated Show resolved Hide resolved
specs/data_structures.md Outdated Show resolved Hide resolved
@adlerjohn adlerjohn linked an issue Jun 11, 2020 that may be closed by this pull request
@adlerjohn
Copy link
Member Author

Major changes since last review:

  1. Changed tables to code snippets for consensus rules.
  2. Use state subtrees to store accounts, active validator set, active validator count, and inactive validator set. The latter is used to make all delegation operations on inactive validators quite a bit cheaper (less hashing and guaranteed parallelization of hashing).

| ---------------------------- | ---------------- | ------ |
| `ACCOUNTS_SUBTREE_ID` | `StateSubtreeID` | `0x01` |
| `VALIDATORS_SUBTREE_ID` | `StateSubtreeID` | `0x02` |
| `VALIDATOR_COUNT_SUBTREE_ID` | `StateSubtreeID` | `0x03` |
Copy link
Member

@liamsi liamsi Jun 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be part of the validator subtree (e.g the left-most branch in the validator subtree contains the count)? Why do we need this again? Isn't that implicitly given by the number of leaves in the subtree that aren't default values (i.e. no active validators).

Copy link
Member Author

@adlerjohn adlerjohn Jun 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It could, but mangling multiple leaf types into a single tree (or subtree) makes things more complicated. E.g. you'd need subtree-specific logic if you want to parallelize updating the validator set and the validator count state. If they're in separate subtrees then you can have a single global whole-SMT-level process that handles updating subtrees in parallel.

The validator count is needed not for light nodes knowing they've downloaded the entire validator set, but for proving with a fraud proof that the number of active validators exceeds the maximum.

See other comment further down for some more on this.

| name | type | value |
| ---------------------------- | ---------------- | ------ |
| `ACCOUNTS_SUBTREE_ID` | `StateSubtreeID` | `0x01` |
| `VALIDATORS_SUBTREE_ID` | `StateSubtreeID` | `0x02` |
Copy link
Member

@liamsi liamsi Jun 13, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In tendermint, the header references current and next validators:
https://github.com/tendermint/tendermint/blob/206c814a8e64cb4b9eb2abbb2fdadc6933b28584/types/block.go#L352-L353

Should we have two subtrees for that? Or, further split the validator subtree? The number of validators should easily fit into that tree in any case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't needed with immediate execution. Regardless, the validator set for the current block is the next validator set of the previous block, so we don't need to maintain two trees. Only the next validator set is needed to be stored in the tree at any given time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only the next validator set is needed to be stored in the tree at any given time.

That makes sense. I'm wondering if there is a reason in tendermint for having both validator sets referenced. Might have to do with deferred execution (related tendermint/tendermint#2483) or maybe it is just for convenience 🤔

Copy link
Member

@liamsi liamsi Jun 16, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regardless, the validator set for the current block is the next validator set of the previous block,

How would a light client verify this property efficiently if only the next valset is stored & merkelized? Doesn't it need a commit to the next valst (of the previous block) that can be easyily verified on the current block (e.g. via a root included int the header) without recomputing the cur vals?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Transcluding conversation from Slack:

Under immediate execution, the next validator set is actually committed to in the last intermediate state root's active validator set subtree.

It might be worth it to have a dedicated field in the block header for this, and just making sure it matches up with the last intermediate state root's active validator subtree root.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@buchmann clarified here why both hashes are part of the header (and not just the state root/tree) in tendermint:
celestiaorg/celestia-core#3 (comment)

| `isDelegating` | `bool` | Whether this account is delegating its stake or not. Mutually exclusive with `isValidator`. |
| `delegationInfo` | [Delegation](#delegation) | _Optional_, only if `isDelegating` is set. Delegation info. |

In the accounts tree, accounts (i.e. leaves) are keyed by the [hash](#hashdigest) of their [address](#address).
In the accounts subtree, accounts (i.e. leaves) are keyed by the [hash](#hashdigest) of their [address](#address). The first byte is then replaced with `ACCOUNTS_SUBTREE_ID`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Effectively, this means that the hash used in the subtree returns 31 bytes.

Copy link
Member Author

@adlerjohn adlerjohn Jun 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not the hash function in general, but specifically how the key is calculated, yes.

Copy link
Member

@liamsi liamsi Jun 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we say it's the hash function, this is easier to formalize. e.g. we can say the key for an account in the set Accs tree is computed as:
key := subtree_id || hash_st(address),
where hash_st could be any cryptographic hash function with 31 byte/248 bit output: hash_st: Accs -> {0,1}^248

Copy link
Member Author

@adlerjohn adlerjohn Jun 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. I'm not really a fan because that would mean we need 1) a different hashing function for each subtree (not great, not terrible) and 2) to slice off the first byte of every single hashing operation. That cost will add up quickly, especially if doing proofs in smart contracts.

I'd much prefer just changing how the keys are calculated, which is a one-time calculation per leaf.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO It's just another way to writing down the same thing though (of course in 2) the slicing off could be done more efficiently by replacing the 1st byte instead; but this is just an implementation detail).

Delegation objects represent a delegation. They have two statuses:
1. `Bonded`: This delegation is enabled for a `Queued` _or_ `Bonded` validator. Delegations to a `Queued` validator can be withdrawn immediately, while delegations for a `Bonded` validator must be unbonded first.
1. `Unbonding`: This delegation is unbonding. It will remain in this status for at least `UNBONDING_DURATION` blocks, and while unbonding may still be slashed. Once the unbonding duration has expired, the delegation can be withdrawn.
Since the [validator set](#validator) is stored in a Sparse Merkle Tree, there is no compact way of proving that the number of active validators exceeds `MAX_VALIDATORS` without keeping track of the number of active validators. There is only a single leaf in the active validator count subtree, which is keyed with zero (i.e. `0x0000000000000000000000000000000000000000000000000000000000000000`), and the first byte replaced with `VALIDATOR_COUNT_SUBTREE_ID`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo this should be part of validator subtree:
f1cba75#r439773622

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unlike moving the inactive validator set into the accounts subtree, I might be okay with moving the validator count into the active validators subtree because it's guaranteed to be in a fixed location.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it semantically belongs to the validator subtree and also a single value tree seems weird.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, will migrate the count into the active validators subtree.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 4d87996.

| `lastBlockID` | [BlockID](#blockid) | Previous block's ID. |
| `lastCommitRoot` | [HashDigest](#hashdigest) | Previous block's Tendermint commit root. |
| `consensusRoot` | [HashDigest](#hashdigest) | Merkle root of [consensus parameters](#consensus-parameters) for this block. |
| `stateCommitment` | [HashDigest](#hashdigest) | The [state root](#state) after this block's transactions are applied. |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is clear that we will stick with deferred execution for a while (probably after launch even), should the first version of the spec adhere to this too? (then this would be state after the last block's transactions are applied).

Whenever (if) we decide to drop deferred execution, we can update the spec to another version.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need immediate execution for fee burning (which IMO is 100% necessary, for testnet at the latest). Minting new coins is a source, and there are two ways to have sinks:

  1. state rent
  2. fee burning

We can't really do the former for obvious reasons, so we're left with the latter. Sinks create intrinsic demand for the currency, a property that cannot be accomplished any way else (which is why taxes---a sink---are still a thing even if the government can print money---a source). And fee burning requires immediate execution.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was under the impression, we will have a first test-net without switching to immediate execution. If we will use deferred execution only in dev-nets, it's probably OK that the spec and the implementation diverge for a while I guess.

We might have to re-discuss this when we are closer to a first dev-net iteration.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By testnet above I guess I meant "testnet with mainnet configuration." We could have more public testnets (e.g. PoA ones) before that, along with non-feature-complete devnets, that don't need immediate execution because there are no incentives and we can just assume the single operator isn't making invalid blocks.

@adlerjohn adlerjohn requested a review from liamsi June 15, 2020 21:41
| -------------------------------- | ---------------- | ------ |
| `ACCOUNTS_SUBTREE_ID` | `StateSubtreeID` | `0x01` |
| `ACTIVE_VALIDATORS_SUBTREE_ID` | `StateSubtreeID` | `0x02` |
| `INACTIVE_VALIDATORS_SUBTREE_ID` | `StateSubtreeID` | `0x03` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these in relation to the current block or a commitment to the next block (i.e. the next block will be signed by the vals in ACTIVE_VALIDATORS_SUBTREE?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These represent the most current state after each and every transaction. But the state root committed to in the block header is after all transactions in the block have been applied (under immediate execution).

Copy link
Member

@liamsi liamsi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing work @adlerjohn 💪

@adlerjohn adlerjohn merged commit 99732e7 into master Jun 16, 2020
@adlerjohn adlerjohn deleted the adlerjohn-state_representation branch June 16, 2020 20:56
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
documentation Improvements or additions to documentation enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Question: Single state tree or several trees?
3 participants