core, eth, les, light: avoid storing computable receipt metadata #19345

Matthalp-zz · 2019-03-27T16:15:30Z

These changes omit storing the transaction hash (TxHash), gas
cost (GasCost), and contract address (ContractAddress) fields
within its corresponding the transaction receipt. The big wins come
from not storing the transaction hash and contract address since it is
not amenable to the snappy compression used by LevelDB, as
@karalabe and @rjl493456442 have pointed out.

The storage savings comes at the expense of additional processing
time to load receipts from storage. Specifically this comes from
additional reads needed to load the receipts' corresponding block
body to populate some fields as well as the chain configuration
needed to compute the contract address for contract creation
transactions.

However, this performance overhead should not affect the majority
of go-ethereum users. The two main places this read is needed is
for (1) subscribers watching block data during reorganizations and
(2) users querying JSON-RPC for transaction receipts. Substantial
block reorganizations do not happen often enough for (1) to be
significantly affected and while there are some use cases that
heavily use (2), the overheads should not be too horrible even when
traversing the entire historical chain.

Note that there here are some situations where all of the receipt
metadata is not populated, such as in light clients. The new code
supports this behavior by not failing when all of the metadata
cannot be constructed.

Matthalp-zz · 2019-03-27T16:40:31Z

@karalabe @holiman I did the bare minimum to exclude these fields. It appears to be able to sync just fine through block 1M, so figured we could run it on the benchmarking systems and see what happens. If things look favorable, I'll move on to the complete integration (and making all the tests that now fail actually pass).

fjl · 2019-03-28T15:47:17Z

I think this won't do much for storage size because the TxHash will still be stored. The diff so far just makes it all-zero. Would be better to delete those fields from receiptStorageRLP.

Matthalp-zz · 2019-03-28T15:56:00Z

@fjl This actually still reduces the storage size because a singe byte (the RLP empty string representation) is stored instead of the 32 additional bytes for the actual hash (and variable number of bytes for the gas cost). This was done as a "quick an dirty" test to see what the savings will be. If the savings is seen to be considerable then I'll put forth the effort to do this the right way.

holiman · 2019-03-28T20:33:44Z

@fjl nope, @matthalp is correct! It's not quite finished, but 138G vs 149G.

Matthalp-zz · 2019-03-29T00:56:58Z

I want to compare eth_getTransactionReceipt for every transaction on the canonical chain against an unmodified client before giving the go ahead to review.

Matthalp-zz · 2019-03-29T04:02:17Z

@zsfelfoldi It looks like the light protocol complicates this change a bit. I plan to debug this last test case tomorrow, but would appreciate any feedback you have.

core/types/receipt.go

karalabe · 2019-03-29T07:39:31Z

FYI

Matthalp-zz · 2019-03-29T13:45:13Z

@karable Interpolating the y-axis of the graph looks like a savings of about 12 GB. Is that right?

karalabe · 2019-03-29T14:37:26Z

Ah, sorry, didn't leave the markers in. The saving it ~14GB.

core/rawdb/accessors_chain.go

karalabe · 2019-04-15T09:47:27Z

Hey all, so I've pushed a followup PR (figured it was easier than to go back and forth for a number of days for nitpicks cleanups). In essence, I've made the following fixes:

Originally in this PR, reading the receipts via rawdb and deriving the computed fields also entailed accessing the canonical hash of the genesis block and the chain configs from leveldb. These were pretty pointless data to access, since almost all our code is aware of the chain config, so I modified the code to pass this as an argument, not pull it from disk.
Deriving the computed fields (originally SetReceiptsData, renamed to Receipts.DeriveFields) was moved from rawdb to types at Gary's suggestion, since really it's not a data access functionality, rather tightly coupled to the receipts. I've also changed the signature to use only types.Transactions, not a full types.Body + I attached it to the types.Receipts instead of a dangling method.
There was an accidental bug introduced in the PR: when an empty block's receipts were read, the original code on master returned an empty slice, the new code returned nil. This is wrong, because nil represents the concept of missing receipts, whereas the empty one obviously represents receipts being available, but no transactions in the block. This cause les to fail some test. Fixed.
The fakepeer in downloader used the old ReadReceipts to deliver full receipts, but AFAIK we should only deliver consensus receipts there. Fixed to ReadRawReceipts, but we should test that it works correctly.

…gasCost

les/handler.go

karalabe

This PR seems kind of good to me now. Would be nice to have some more eyes on it before merging.

rjl493456442

LGTM

Matthalp-zz · 2019-04-15T15:45:42Z

Thanks @karalabe and @rjl493456442 ! The follow up commits make sense to me.

I just noticed DeriveFields was not setting receipt bloom filters, so I added a follow up commit on that.

There was an accidental bug introduced in the PR: when an empty block's receipts were read, the original code on master returned an empty slice, the new code returned nil. This is wrong, because nil represents the concept of missing receipts, whereas the empty one obviously represents receipts being available, but no transactions in the block.

@karalabe Thank you for catching this. As an aside, it's more idiomatic go to use nil slices instead of empty slices, but I understand the signaling problem here. I feel like this will not be the last time this kind of issue/bug comes up on the code base, so I would recommend modifying these methods to return two return values: ReadFoo() (<Foo>, bool) and then doing a check on the bool to know if this exists. I'm more than happy to do this.

karalabe · 2019-04-15T16:03:03Z

core/types/receipt.go

@@ -310,6 +310,8 @@ func (r Receipts) DeriveFields(config *params.ChainConfig, hash common.Hash, num
 		r[i].BlockNumber = new(big.Int).SetUint64(number)
 		r[i].TransactionIndex = uint(i)

+		r[i].Bloom = CreateBloom(Receipts{r[i]})


Aren't the blooms already handled during decoding?

As far as I see it those are already covered.

@karalabe You are right -- my bad. I hadn't had my morning ☕️ yet.

This does raise the question: where should we be computing the bloom? It's kind of weird to be computing these fields in different places, but at the same time the bloom filter is immediately computable when decoding and the overhead is negligible. What would you prefer?

karalabe · 2019-04-15T18:56:02Z

Bloom seems fit in decoding because it's consensus. Safer not to screw up.

…

On Mon, Apr 15, 2019, 19:16 Matthew Halpern ***@***.***> wrote: ***@***.**** commented on this pull request. ------------------------------ In core/types/receipt.go <#19345 (comment)> : > @@ -310,6 +310,8 @@ func (r Receipts) DeriveFields(config *params.ChainConfig, hash common.Hash, num r[i].BlockNumber = new(big.Int).SetUint64(number) r[i].TransactionIndex = uint(i) + r[i].Bloom = CreateBloom(Receipts{r[i]}) @karalabe <https://github.com/karalabe> You are right -- my bad. I hadn't had my morning ☕️ yet. This does raise the question: where should we be computing the bloom? It's kind of weird to be computing these fields in different places, but at the same time the bloom filter is immediately computable when decoding and the overhead is negligible. What would you prefer? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#19345 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AAH6GeBkozRg9P2daXF5300c-tsa3K8Kks5vhKX0gaJpZM4cONof> .

Matthalp-zz · 2019-04-15T19:11:05Z

@karalabe Just pushed a commit to revert the bloom population in Receipt.DeriveFields and its associated test logic.

karalabe · 2019-04-16T06:51:56Z

🎉

Matthalp-zz requested review from holiman and karalabe as code owners March 27, 2019 16:15

Matthalp-zz force-pushed the optimize-receipt-storage branch from e65b8f2 to b1a232c Compare March 29, 2019 00:37

Matthalp-zz changed the title ~~[BENCHMARK] Store transaction receipts without txHash and gasCost~~ core/*: omit txHash and gasCost when storing receipts Mar 29, 2019

Matthalp-zz force-pushed the optimize-receipt-storage branch from b1a232c to f87136c Compare March 29, 2019 00:54

Matthalp-zz force-pushed the optimize-receipt-storage branch 2 times, most recently from bac14a6 to d774042 Compare March 29, 2019 02:53

Matthalp-zz requested a review from zsfelfoldi as a code owner March 29, 2019 02:53

Matthalp-zz force-pushed the optimize-receipt-storage branch from d774042 to 0930195 Compare March 29, 2019 03:00

karalabe reviewed Mar 29, 2019

View reviewed changes

core/types/receipt.go Outdated Show resolved Hide resolved

Matthalp-zz force-pushed the optimize-receipt-storage branch 11 times, most recently from 5efcc1c to efcce69 Compare April 1, 2019 03:14

karalabe reviewed Apr 15, 2019

View reviewed changes

core/rawdb/accessors_chain.go Outdated Show resolved Hide resolved

karalabe force-pushed the optimize-receipt-storage branch from f3ccca9 to 6235665 Compare April 15, 2019 09:37

Matthalp-zz requested a review from gballet as a code owner April 15, 2019 09:37

core, eth, les, light: store transaction receipts without txHash and …

6b0ddd1

…gasCost

karalabe force-pushed the optimize-receipt-storage branch from 6235665 to cd96c8e Compare April 15, 2019 10:16

karalabe changed the title ~~core/*: avoid storing computable receipt metadata~~ core, eth, les, light: avoid storing computable receipt metadata Apr 15, 2019

karalabe removed the status:triage label Apr 15, 2019

karalabe added this to the 1.9.0 milestone Apr 15, 2019

karalabe force-pushed the optimize-receipt-storage branch from cd96c8e to b2b131d Compare April 15, 2019 10:25

karalabe reviewed Apr 15, 2019

View reviewed changes

les/handler.go Outdated Show resolved Hide resolved

core, eth, les, light: scope receipt functionality a bit cleaner

7221cb1

karalabe force-pushed the optimize-receipt-storage branch from b2b131d to 2a937ce Compare April 15, 2019 10:42

karalabe requested review from holiman, zsfelfoldi and rjl493456442 April 15, 2019 10:49

karalabe approved these changes Apr 15, 2019

View reviewed changes

karalabe requested a review from fjl April 15, 2019 10:50

rjl493456442 approved these changes Apr 15, 2019

View reviewed changes

karalabe reviewed Apr 15, 2019

View reviewed changes

core/types: fix cummulative gas bug and legacy decoding tests

ce9a289

karalabe force-pushed the optimize-receipt-storage branch from ed464e7 to ce9a289 Compare April 16, 2019 06:50

karalabe merged commit 78d90c4 into ethereum:master Apr 16, 2019

non-fungible-nelson mentioned this pull request Aug 17, 2023

Receipt Compression Strategy hyperledger/besu#5767

Closed

7 tasks

gfukushima mentioned this pull request Dec 11, 2023

Remove redundant field from receipts storage hyperledger/besu#6271

Closed

JukLee0ira mentioned this pull request Jul 4, 2024

reduce the receipt RLP size in chain database XinFinOrg/XDPoSChain#570

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

core, eth, les, light: avoid storing computable receipt metadata #19345

core, eth, les, light: avoid storing computable receipt metadata #19345

Matthalp-zz commented Mar 27, 2019 •

edited

Loading

Matthalp-zz commented Mar 27, 2019

fjl commented Mar 28, 2019

Matthalp-zz commented Mar 28, 2019

holiman commented Mar 28, 2019

Matthalp-zz commented Mar 29, 2019

Matthalp-zz commented Mar 29, 2019

karalabe commented Mar 29, 2019

Matthalp-zz commented Mar 29, 2019

karalabe commented Mar 29, 2019

karalabe commented Apr 15, 2019 •

edited

Loading

karalabe left a comment

rjl493456442 left a comment

Matthalp-zz commented Apr 15, 2019

karalabe Apr 15, 2019

karalabe Apr 15, 2019

Matthalp-zz Apr 15, 2019

karalabe commented Apr 15, 2019 via email

Matthalp-zz commented Apr 15, 2019

karalabe commented Apr 16, 2019

core, eth, les, light: avoid storing computable receipt metadata #19345

core, eth, les, light: avoid storing computable receipt metadata #19345

Conversation

Matthalp-zz commented Mar 27, 2019 • edited Loading

Matthalp-zz commented Mar 27, 2019

fjl commented Mar 28, 2019

Matthalp-zz commented Mar 28, 2019

holiman commented Mar 28, 2019

Matthalp-zz commented Mar 29, 2019

Matthalp-zz commented Mar 29, 2019

karalabe commented Mar 29, 2019

Matthalp-zz commented Mar 29, 2019

karalabe commented Mar 29, 2019

karalabe commented Apr 15, 2019 • edited Loading

karalabe left a comment

Choose a reason for hiding this comment

rjl493456442 left a comment

Choose a reason for hiding this comment

Matthalp-zz commented Apr 15, 2019

karalabe Apr 15, 2019

Choose a reason for hiding this comment

karalabe Apr 15, 2019

Choose a reason for hiding this comment

Matthalp-zz Apr 15, 2019

Choose a reason for hiding this comment

karalabe commented Apr 15, 2019 via email

Matthalp-zz commented Apr 15, 2019

karalabe commented Apr 16, 2019

Matthalp-zz commented Mar 27, 2019 •

edited

Loading

karalabe commented Apr 15, 2019 •

edited

Loading