Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accelerate Merkle tree hashing #205

Open
mratsim opened this issue Sep 18, 2022 · 0 comments
Open

Accelerate Merkle tree hashing #205

mratsim opened this issue Sep 18, 2022 · 0 comments

Comments

@mratsim
Copy link
Owner

mratsim commented Sep 18, 2022

Merkle tree hashing is a significant bottleneck in blockchains.

In most (all?) blockchains and in particular Ethereum, efficient storage/retrieval/update as well as integrity guarantees of data is implemented through a Merkle-Tree or other hash trees (Merkle Patricia Trie for Ethereum).

Those require computing hashes from the leaves to the root of the tree. In particular the latest way to accelerate syncing a whole blockchain, 800GB+ of data for Ethereum execution layer (accounts/transactions) involve only downloading the leaves and reconstructing the Merkle tree locally. (https://github.com/ethereum/devp2p/blob/40ab248/caps/snap.md)

Besides, the consensus layer is also extremely hashing intensive with everything being serialized in a "merkleizable" wire format, SimpleSerialize (SSZ): https://github.com/ethereum/consensus-specs/blob/9a2fcc0/ssz/simple-serialize.md#merkleization.

Note that the Execution Layer uses hexary Merkle tree (16 leaves per node) while the Consensus Layer is binary (2 leaves per node).

Writeups:

Note: Bitcoin has similar needs

Benchmarks:

Mammon being optimized for tree hashing

Nimbus Prysm Lighthouse (single thread) Mammon
NUC i5-8259U @ 2.30GHz 1112ms (AVX2) 1125ms (AVX2) 860ms (AVX2) 563 (AVX2) 596(AVX) 1042 (SSE)
Ryzen 5 @3.60 GHz 251ms (SHA) 292ms (SHA) 760ms (SHA) 161ms (SHA) 760ms (AVX2) 813ms (AVX) 1361 ms (SSE)
XPS 13 i5-7200U @ 2.50GHz 953 ms (AVX2) 1085ms (AVX2) 998ms (AVX2) 654 ms (AVX2) 681ms(AVX) 964ms (SSE)

Beyond blockchains, while there are many usecases for Merkle trees

Hash trees are also used in the IPFS, Btrfs and ZFS file systems[4] (to counter data degradation[5]); Dat protocol; Apache Wave protocol;[6] Git and Mercurial distributed revision control systems; the Tahoe-LAFS backup system; Zeronet; the Bitcoin and Ethereum peer-to-peer networks;[7] the Certificate Transparency framework; the Nix package manager and descendants like GNU Guix;[8] and a number of NoSQL systems such as Apache Cassandra, Riak, and Dynamo

mratsim added a commit that referenced this issue Sep 18, 2022
…nt specific use-cases like #205; also implement SSSE3 acceleration (2006, Intel Core 2 Duo)
mratsim added a commit that referenced this issue Sep 18, 2022
…nt specific use-cases like #205; also implement SSSE3 acceleration (2006, Intel Core 2 Duo)
mratsim added a commit that referenced this issue Sep 18, 2022
…nt specific use-cases like #205; also implement SSSE3 acceleration (2006, Intel Core 2 Duo)
mratsim added a commit that referenced this issue Sep 19, 2022
* sha256: separate message scheduling and state updates to help implement specific use-cases like #205; also implement SSSE3 acceleration (2006, Intel Core 2 Duo)

* sha256: simplify update flow, store less metadata in context

* sha256: Fix reworked update function

* Implement x86 hardware SHA acceleration

* typo
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant