Encoding/decoding spec (rfc) #1

Open

henridf opened this issue May 22, 2022 · 7 comments

Comments

@henridf
Owner

henridf commented May 22, 2022

Outline for "EIP-4444 for pow blocks" (https://notes.ethereum.org/@ralexstokes/BJWd8saB9)

In summary, the proposal is to derive from the Beacon Chain's ExecutionPayload encoding to store blocks, and extend it with uncle headers and a new ReceiptsPayload encoding (defined below) to store receipts. A double-batched accumulator is used to provide one root per block archive file.

Block archive file

A block archive file consists of a header and a body that are individually SSZ-encoded and concatenated into a single file.

MAX_PAYLOADS_PER_ARCHIVE = 1000000 # tbd

class ArchiveHeader(Container):
    version: uint64
    head_block_number: uint64
    block_count: uint64

class ArchiveBody(Container):
    blocks: List[Block, MAX_PAYLOADS_PER_ARCHIVE]

Note: initially, both the header fields and the blocks were in the same container. Splitting them out has two advantages:

  • Facilitates decoding the header without having to read in the entire file (see the sketch below)
  • Facilitates computing the hash tree root over the blocks only.
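
To illustrate the first point, a minimal sketch (my assumption, not part of the spec): because ArchiveHeader contains only fixed-size uint64 fields, its SSZ encoding is exactly 3 * 8 = 24 little-endian bytes at the front of the file, so the header can be read without touching the body.

```python
import struct

ARCHIVE_HEADER_SIZE = 3 * 8  # three uint64 fields, 8 bytes each, SSZ little-endian

def read_archive_header(path):
    """Read only the ArchiveHeader from the front of an archive file."""
    with open(path, "rb") as f:
        raw = f.read(ARCHIVE_HEADER_SIZE)
    version, head_block_number, block_count = struct.unpack("<QQQ", raw)
    return {"version": version,
            "head_block_number": head_block_number,
            "block_count": block_count}
```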

File encoding and decoding

The encoding tool will take a stream of exported blocks/receipts (RLP-encoded), encode each into the SSZ representations described below, and periodically "flush" a batch into a new ssz-encoded archive file. A simple approach would be to flush every N blocks, but it may be more practical to flush based on the size of the accumulated data, in order to end up with a smaller number of files. In particular, if we set a constant number of blocks per file, we would have to choose that constant based on recent blocks, which are "large" (in order to reach a target file size). That would result in lots of smaller files in the earlier part of the chain history, where blocks are much smaller. I will experiment with this and propose something along with concrete numbers.
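
As a rough sketch of size-based flushing (the names encode_block_ssz and flush_archive are placeholders I'm assuming, not functions of any existing tool):

```python
TARGET_FILE_SIZE = 8 * 1024**3  # e.g. 8 GB; the actual limit is still to be decided

def write_archives(block_stream, encode_block_ssz, flush_archive):
    """Accumulate SSZ-encoded blocks and flush a new archive file whenever
    the pending batch crosses the target size."""
    pending, pending_size = [], 0
    for block in block_stream:
        encoded = encode_block_ssz(block)
        pending.append(encoded)
        pending_size += len(encoded)
        if pending_size >= TARGET_FILE_SIZE:
            flush_archive(pending)       # emit one archive file
            pending, pending_size = [], 0
    if pending:
        flush_archive(pending)           # final, possibly smaller, batch
```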

Double-batched accumulator

The encoding tool is also responsible for computing and outputting the hash tree roots along the way. There is one root per archive file; it can be used to prove an entire file, but also to build granular proofs reaching all the way down to individual block values.

It does this via a double-batched accumulator similar to the beacon chain's BeaconState.historical_roots. The one difference is that each successive historical root does not necessarily accumulate the same (constant) number of block roots.
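
A minimal sketch of that idea, assuming a generic hash_tree_root helper (illustrative only, not the actual accumulator code):

```python
historical_roots = []  # one entry per archive file, in order

def accumulate_archive(archive_body, hash_tree_root):
    """Append the hash tree root of this archive's block list to the running
    list of per-file roots. Unlike BeaconState.historical_roots, successive
    entries may cover different numbers of blocks."""
    file_root = hash_tree_root(archive_body.blocks)  # one root per archive file
    historical_roots.append(file_root)
    return file_root
```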

Block

(This section is derived from the beacon chain's ExecutionPayload, though the resulting structs are different. Notably, uncles and receipts are present here.)

Custom types
| Name | SSZ equivalent | Description |
| - | - | - |
| `Transaction` | `ByteList[MAX_BYTES_PER_TRANSACTION]` | either a typed transaction envelope or a legacy transaction |
| `ExecutionAddress` | `Bytes20` | Address of account on the execution layer |
Constants
| Name | Value |
| - | - |
| `MAX_BYTES_PER_TRANSACTION` | `uint64(2**30)` (= 1,073,741,824) |
| `MAX_TRANSACTIONS_PER_PAYLOAD` | `uint64(2**20)` (= 1,048,576) |
| `MAX_UNCLES_PER_PAYLOAD` | `10` (??) |
| `BYTES_PER_LOGS_BLOOM` | `uint64(2**8)` (= 256) |
| `MAX_EXTRA_DATA_BYTES` | `2**5` (= 32) |

class Block(Container):
    header: Header
    transactions: List[Transaction, MAX_TRANSACTIONS_PER_PAYLOAD]
    uncles: List[Header, MAX_UNCLES_PER_PAYLOAD]
    receipts: List[Receipt, MAX_TRANSACTIONS_PER_PAYLOAD]

class Header(Container):
    parent_hash: Hash32
    uncle_hash: Hash32   # not in beacon chain's ExecutionPayload
    fee_recipient: ExecutionAddress  # 'beneficiary' in the yellow paper
    state_root: Bytes32
    tx_hash: Hash32 # not in beacon chain's ExecutionPayload
    receipts_root: Bytes32
    logs_bloom: ByteVector[BYTES_PER_LOGS_BLOOM]
    prev_randao: Bytes32  # 'difficulty' in the yellow paper
    block_number: uint64  # 'number' in the yellow paper
    gas_limit: uint64
    gas_used: uint64
    timestamp: uint64
    extra_data: ByteList[MAX_EXTRA_DATA_BYTES]
    base_fee_per_gas: uint256
    mix_digest: Bytes32 # not in beacon chain's ExecutionPayload
    nonce: uint64 # not in beacon chain's ExecutionPayload
    # block_hash: Hash32   -- not sure we need this? # Hash of execution block

Receipts payload

Constants

| Name | Value |
| - | - |
| `MAX_TOPICS_PER_LOG` | `uint8(2**2)` (= 4) |
| `MAX_LOGS_PER_PAYLOAD` | `uint64(2**20)` (= 1,048,576) |
| `BYTES_PER_LOGS_BLOOM` | `uint64(2**8)` (= 256) |
| `MAX_LOG_DATA_BYTES` | `uint64(2**22)` (= 4,194,304) |

class Receipt(Container):
    post_state: Bytes32
    status: uint64
    cumulative_gas_used: uint64
    logs: List[LogPayload, MAX_LOGS_PER_PAYLOAD]

class LogPayload(Container):
    address: ExecutionAddress
    topics: List[Hash32, MAX_TOPICS_PER_LOG]
    data: ByteList[MAX_LOG_DATA_BYTES]

@henridf henridf changed the title Next step outline Encoding/decoding v2 May 23, 2022
@lightclient

Looks good overall! Just a few thoughts:

  • Naming -- we can probably spend some time bike shedding this. "Archive" is a word I would try to avoid as it will create a bit of collision with "archive node".
  • N blocks vs uniform size -- I struggle to think of persuasive arguments either way. With that said, I do think I lean towards the uniform size approach. Curious to see what comes of your experiments here.

@henridf
Owner Author

henridf commented Jun 5, 2022

> N blocks vs uniform size -- I struggle to think of persuasive arguments either way. With that said, I do think I lean towards the uniform size approach. Curious to see what comes of your experiments here.

@lightclient a few comments on data / file size now that I've done some measurements.

  • The overall size of blocks+receipts from origin to current head is roughly 700GB uncompressed, 300GB compressed.
  • We need an upper limit S on individual file sizes. Let's say something like 4GB or maybe 8GB. The constraint driving this limit is that these files must be entirely read into memory to be ssz-decoded (update: see the correction in a later comment below).
  • Given this upper limit, there are two possible approaches:
    1. Variable block counts, fixed file sizes: Read as many blocks as fit within the S size limit, then emit a file. So we would have ~700/S files.
    2. Uniform block counts, variable file sizes: Read N blocks, then emit a file. What is N? Its value would have to be based on recent block sizes, which are larger. Around the 14 millionth block, 10k blocks take on the order of 1.4GB of storage. So to store the first 15m blocks, we would need roughly 1500 * 1.4/S files. In other words, for S=8GB, we would be talking about roughly 90 files with variable block counts vs roughly 260 files with uniform block counts.

Overall I would mildly lean toward variable block counts, as it leads to fewer and uniformly-sized files, but I'm ok going with either approach if there is a preference for uniform.
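
For concreteness, the file-count arithmetic above as a quick back-of-the-envelope script (the inputs are the rough measurements quoted above, not exact figures):

```python
TOTAL_HISTORY_GB = 700            # uncompressed blocks + receipts, origin to head
GB_PER_10K_RECENT_BLOCKS = 1.4    # measured around block ~14,000,000
TOTAL_BLOCKS = 15_000_000

for S in (4, 8):                  # candidate per-file size limits, in GB
    variable_count_files = TOTAL_HISTORY_GB / S
    chunks_of_10k = TOTAL_BLOCKS / 10_000                          # = 1500
    uniform_count_files = chunks_of_10k * GB_PER_10K_RECENT_BLOCKS / S
    print(f"S={S}GB: ~{variable_count_files:.0f} files (variable block counts), "
          f"~{uniform_count_files:.0f} files (uniform block counts)")
```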

@lightclient

@henridf it might be worth posting in the eth r&d discord for some opinions. Maybe even asking on ACD. It's one of those simple decisions that could have a relatively big impact, so we may as well get as many opinions on it as possible. I still don't think I have a strong opinion on it.

> the overall size of blocks+receipts from origin to current head is roughly 700GB uncompressed, 300GB compressed

Do you know what format geth stores them in on disk? Last fall I did a geth inspect and there was only ~250 GB of history, so I'm a bit surprised by the 700 GB number!

> We need an upper limit S on individual file sizes.

Again, an important decision, but I don't think I lean strongly in either direction. I think I empathize the most with minimizing the size so that the constraints we impose on memory aren't too large. But memory gets cheaper by the day and blocks bigger by the year, so it may be preferable to be more future-proof and err on the larger side. Would also be a good discord / ACD question.

@henridf
Owner Author

henridf commented Jun 5, 2022

I agree this is an important decision deserving broader exposure. I'd like to make a little progress on the accumulator and verification to see if that informs anything. (I'll be starting on this next, now that encoding is mostly implemented at https://github.com/henridf/eip44s-proto/tree/wip-v2.)

> Do you know what format geth stores them in on disk? Last fall I did a geth inspect and there was only ~250 GB of history, so I'm a bit surprised by the 700 GB number!

Yep, the vast majority is in the freezer, which uses snappy compression for these tables (https://github.com/ethereum/go-ethereum/blob/997f1c4f0abcd78f645e6e7ced6db4b42ad59c9d/core/rawdb/schema.go#L133).

From compressing a few history files with gzip, I saw a roughly 2.5x compression gain, while geth inspect recently showed 335GB. So the numbers line up pretty well.

| Ancient store   | Headers            | 6.60 GiB   |   14807920 |
| Ancient store   | Bodies             | 215.84 GiB |   14807920 |
| Ancient store   | Receipt lists      | 104.95 GiB |   14807920 |

@lightclient

Makes sense! Thanks

@henridf
Owner Author

henridf commented Jun 8, 2022

Another argument (from @karalabe, relayed by @lightclient) for variable block counts, fixed file sizes: it avoids situations where a chain has many, many empty blocks, leading to needless additional files.

@henridf henridf changed the title Encoding/decoding v2 Encoding/decoding spec (rfc) Jun 8, 2022
@henridf
Owner Author

henridf commented Jun 9, 2022

> The constraint driving this limit is that these files must be entirely read into memory to be ssz-decoded. (Which also means that this limit is for uncompressed file sizes.)

An update / correction on this: with the revamped layout, it would be feasible to write a custom ssz streaming decoder for the List[Block, MAX_PAYLOADS_PER_ARCHIVE] that would read in all the ssz offsets, then stream in one block at a time. These could be fed to an RLP writer (when converting to RLP) or to a custom hash-tree-root implementation.

Of course, this is some one-off work compared to e.g. using the fastssz-generated marshalling, but once the format is settled it would be worthwhile. Most importantly, we probably don't need to worry so much about erring on the side of small files.
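
To make the idea concrete, here is a hedged sketch of such a streaming reader. It relies only on the SSZ layout of a List of variable-size elements (a prefix of 4-byte little-endian offsets, one per element, followed by the element data); the function name and signature are illustrative, not part of any existing library.

```python
import struct

def stream_ssz_list_elements(f, list_start, list_end):
    """Yield the raw SSZ bytes of each element of a List of variable-size
    containers occupying f[list_start:list_end). f is a seekable binary file."""
    if list_end == list_start:
        return                                      # empty list
    f.seek(list_start)
    first_offset = struct.unpack("<I", f.read(4))[0]
    count = first_offset // 4                       # the offsets fill the prefix
    offsets = [first_offset]
    for _ in range(count - 1):
        offsets.append(struct.unpack("<I", f.read(4))[0])
    offsets.append(list_end - list_start)           # sentinel: end of list data
    for i in range(count):
        f.seek(list_start + offsets[i])
        yield f.read(offsets[i + 1] - offsets[i])   # one Block's SSZ bytes
```

Each yielded chunk could then be decoded as a single Block, converted to RLP, or hashed as one leaf of a hash-tree-root computation.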
