Investigating Tree Hashing in Lodestar #355
An interesting talk about batch hashing with sha256 for merkle trees: https://www.youtube.com/watch?v=NfK4np15E64
There are 2 implementations for now:
There are 2 ways we can do the batch hash:
type HashComputation = {
  src0: Node;
  src1: Node;
  dest: Node;
};

getHashComputation(level: number, hashCompsByLevel: Map<number, HashComputation[]>): void {
  if (this.h0 === null) {
    let hashComputations = hashCompsByLevel.get(level);
    if (hashComputations === undefined) {
      hashComputations = [];
      hashCompsByLevel.set(level, hashComputations);
    }
    hashComputations.push({src0: this.left, src1: this.right, dest: this});
    if (!this.left.isLeaf()) {
      (this.left as BranchNode).getHashComputation(level + 1, hashCompsByLevel);
    }
    if (!this.right.isLeaf()) {
      (this.right as BranchNode).getHashComputation(level + 1, hashCompsByLevel);
    }
    return;
  }
  // else stop the recursion, LeafNode should have h0
}

This traversal takes <10% of our current hash time.
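Once the computations are grouped by level, they can be executed from the deepest level upward, so that both sources of every computation already have a root when their parent is hashed. The following is a minimal sketch under assumptions, not the ssz implementation: `SketchNode`, `SketchHashComputation`, `hash64`, `collect` and `executeHashComputations` are hypothetical names, roots are kept as plain `Uint8Array`s rather than `h0..h7` fields, and Node's built-in `crypto` stands in for the real hasher.

```typescript
import {createHash} from "node:crypto";

// Hypothetical minimal node for illustration: root is null until it has been hashed.
type SketchNode = {
  root: Uint8Array | null;
  left?: SketchNode;
  right?: SketchNode;
};

type SketchHashComputation = {src0: SketchNode; src1: SketchNode; dest: SketchNode};

// sha256 over the 64-byte concatenation of two 32-byte roots.
function hash64(a: Uint8Array, b: Uint8Array): Uint8Array {
  return new Uint8Array(createHash("sha256").update(a).update(b).digest());
}

// Collect pending computations grouped by depth, mirroring getHashComputation above.
function collect(node: SketchNode, level: number, byLevel: Map<number, SketchHashComputation[]>): void {
  // Stop at nodes that already have a root, and at leaves.
  if (node.root !== null || node.left === undefined || node.right === undefined) return;
  let comps = byLevel.get(level);
  if (comps === undefined) {
    comps = [];
    byLevel.set(level, comps);
  }
  comps.push({src0: node.left, src1: node.right, dest: node});
  collect(node.left, level + 1, byLevel);
  collect(node.right, level + 1, byLevel);
}

// Execute the deepest level first. Each per-level array is one independent batch,
// which is where a batch (e.g. SIMD) hasher could be plugged in instead of this loop.
function executeHashComputations(byLevel: Map<number, SketchHashComputation[]>): void {
  const levels = Array.from(byLevel.keys()).sort((a, b) => b - a);
  for (const level of levels) {
    for (const {src0, src1, dest} of byLevel.get(level) ?? []) {
      if (src0.root === null || src1.root === null) throw new Error("child root missing");
      dest.root = hash64(src0.root, src1.root);
    }
  }
}
```

Computations within a level never depend on each other, which is what makes the per-level arrays suitable as batches.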
In-progress POC to implement batch hash in ssz/lodestar: https://hackmd.io/zj9N5RIqQfCqYz8Y1Xc_hA?view
With this branch, lodestar consumes ~1GB more heap memory. Update: resolved by no longer extracting uint32 values from a Uint32Array, which caused the issue.
Note to improve In order to execute this data structure To do that there are 2 ways:
We'll close this when batch hashing has landed.
Current Hashing Approach
Lodestar's architecture relies heavily on maintaining a full merkle tree of the beacon state. We represent the tree as a linked data structure, where each node (1) is immutable and (2) lazily computes and caches the hash of its children.
This allows us to minimize the number of hashes we need to perform during state transition. Since all hashes of a prestate are maintained, only the paths through the tree to the "diff" need to be re-hashed. This reduces the computational cost of hashing.
It also allows us to perform structural sharing, sharing the memory for beacon states with shared subtrees. For example, between epochs in a sync period, where sync committees remain constant, if we maintain a reference to two beacon states, both states will share the same underlying subtrees for the sync committees. This reduces the memory cost of maintaining several related states.
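To illustrate the two properties together, here is a minimal sketch, assuming a simplified binary tree rather than the actual persistent-merkle-tree classes. `TreeNode`, `Leaf`, `Branch`, `setLeaf` and `hashCount` are hypothetical names introduced only for this example, and Node's built-in `crypto` stands in for as-sha256.

```typescript
import {createHash} from "node:crypto";

let hashCount = 0; // counts sha256 calls so the effect of caching is visible

function hashPair(a: Uint8Array, b: Uint8Array): Uint8Array {
  hashCount++;
  return new Uint8Array(createHash("sha256").update(a).update(b).digest());
}

interface TreeNode {
  readonly root: Uint8Array;
}

class Leaf implements TreeNode {
  readonly root: Uint8Array;
  constructor(root: Uint8Array) {
    this.root = root;
  }
}

// Children never change after construction; only the lazily filled cache does.
class Branch implements TreeNode {
  readonly left: TreeNode;
  readonly right: TreeNode;
  private cached: Uint8Array | null = null;
  constructor(left: TreeNode, right: TreeNode) {
    this.left = left;
    this.right = right;
  }
  get root(): Uint8Array {
    if (this.cached === null) {
      this.cached = hashPair(this.left.root, this.right.root);
    }
    return this.cached;
  }
}

// Return a new tree with the leaf at `index` replaced.
// Only the path to the leaf is rebuilt; untouched subtrees are reused by reference.
function setLeaf(node: TreeNode, depth: number, index: number, leaf: Leaf): TreeNode {
  if (depth === 0) return leaf;
  const branch = node as Branch;
  const bit = (index >> (depth - 1)) & 1;
  return bit === 0
    ? new Branch(setLeaf(branch.left, depth - 1, index, leaf), branch.right)
    : new Branch(branch.left, setLeaf(branch.right, depth - 1, index, leaf));
}
```

In a 4-leaf tree whose root has already been computed, replacing one leaf and reading the new root costs 2 hashes (one per rebuilt branch) instead of 3, and the untouched half of the tree is the same object in both versions.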
Hashing Function
We use as-sha256 with several optimizations:
- we rely on hash inputs always being 64 bytes - this allows us to precompute part of the sha256 internals for a decent gain
- we avoid allocations inside the library, only using fixed input/output buffers
Related hashing perf analysis: Lodestar tree hashing performance lodestar#2206
Hash Cache Representation
The memory usage of each type of object in JavaScript is not very efficient compared with what systems-language intuition would suggest. We do NOT store hash objects as 32-byte Uint8Arrays: a 32-byte Uint8Array takes 223 total bytes! There's a bunch of pointers and additional bookkeeping stored behind the scenes.
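To make the alternative concrete, here is a minimal sketch of moving a 32-byte hash into and out of a plain object with eight uint32 fields. `HashObjectSketch`, `bytesToHashObject` and `hashObjectToBytes` are hypothetical names for this example, and the little-endian byte order is an assumption, not necessarily what as-sha256 uses.

```typescript
// Hypothetical shape: eight uint32 fields instead of one 32-byte Uint8Array.
type HashObjectSketch = {
  h0: number; h1: number; h2: number; h3: number;
  h4: number; h5: number; h6: number; h7: number;
};

// Read eight uint32 values out of 32 bytes (little-endian is an assumption).
function bytesToHashObject(bytes: Uint8Array): HashObjectSketch {
  if (bytes.length !== 32) throw new Error("expected 32 bytes");
  const view = new DataView(bytes.buffer, bytes.byteOffset, 32);
  return {
    h0: view.getUint32(0, true),
    h1: view.getUint32(4, true),
    h2: view.getUint32(8, true),
    h3: view.getUint32(12, true),
    h4: view.getUint32(16, true),
    h5: view.getUint32(20, true),
    h6: view.getUint32(24, true),
    h7: view.getUint32(28, true),
  };
}

// Write the eight fields back into a caller-provided (or fresh) 32-byte buffer.
function hashObjectToBytes(obj: HashObjectSketch, out: Uint8Array = new Uint8Array(32)): Uint8Array {
  if (out.length !== 32) throw new Error("expected 32-byte output buffer");
  const view = new DataView(out.buffer, out.byteOffset, 32);
  view.setUint32(0, obj.h0, true);
  view.setUint32(4, obj.h1, true);
  view.setUint32(8, obj.h2, true);
  view.setUint32(12, obj.h3, true);
  view.setUint32(16, obj.h4, true);
  view.setUint32(20, obj.h5, true);
  view.setUint32(24, obj.h6, true);
  view.setUint32(28, obj.h7, true);
  return out;
}
```

Passing a reusable `out` buffer keeps the byte form transient, so only the eight-number object needs to stay alive in the tree.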
We store hash objects as objects with 8 uint32 numbers, e.g. {h0: 0, h1: 0, ..., h7: 0}. This takes somewhere between 88 bytes and 216 bytes, depending on the sizes of the individual component numbers. Smaller numbers are represented as Smi (small integer) immediate values, while larger numbers are stored on the heap. In practice, this happens TODO.
How to improve?
The hashing speed in lodestar is quite low compared to other implementations.
Our hashes account for a large share of the memory of a running lodestar beacon node, and our memory usage, measured per hash, is still very large compared to systems languages.
Results
Some experiments were run; results are below.
cayman/hash-object
hashtree does exceptionally well operating on large Uint8Arrays, not as well on rust HashObjects (see hashtree uint8array row). (hashtree code has since been pulled into a repo here: @chainsafe/hashtree-js)
rust row), also using a rust port of as-sha256 was slower (see rust object rs-sha256)
node -r ts-node/register node_modules/.bin/benchmark packages/ssz/test/perf/hash.test.ts
Unfortunately, a rust HashObject is more expensive, memory-wise, than the status quo.
node -r ts-node/register --expose-gc packages/ssz/test/memory/hash.test.ts
cayman/hash-cache
node -r ts-node/register node_modules/.bin/benchmark packages/ssz/test/perf/eth2/hashTreeRoot.test.ts
hash cache
master
node -r ts-node/register --expose-gc packages/ssz/test/memory/eth2Objects.test.ts
hash cache
master
cayman/napi-merkle-node
Node
to javascript. This design allows most of the tree to exist in rust, with napi pointers into the tree as navigation demands.