Archival nodes should GC ColPartialChunks #6242

mina86 · 2022-02-03T19:42:48Z

ColPartialChunks is the largest column in archival nodes, yet all the information stored in it is also available in the ColChunks column. It would be a big win if we could GC the column on archival nodes.

Extract two more methods from `process_partial_encoded_chunk_request`, corresponding to the `if` bodies that used to be in it. This makes each function shorter and thus easier to read, especially since more branches will be added to the method in the future. Furthermore, move sending of the message into the method, which turns `maybe_send_partial_encoded_chunk_response` into a method that only prepares the response. This is a pure refactoring with no changes to the behaviour. Issue: #6242

Replace the BaseNode.get_all_heights method with BaseNode.get_all_blocks, which returns hashes alongside heights of all the blocks known to a node. This feature will be used in a future commit. Issue: #6242

…#6370) Extend sanity/block_sync_archival.py test documentation describing in more detail what it’s doing and mentioning that both nodes are archival. Furthermore, avoid starting the observer node at the beginning just to kill it immediately; it’s now started only once the validator generates the blocks. Finally, add explicit comparison of all the blocks in the validator and observer nodes. Issue: #6242

The ShardManager keeps a cache of encoded chunks going back 1024 heights, which means that nodes requesting partial chunks for a recent block are served from the cache. Since only a hundred blocks are generated in the block_sync_archival.py test, the code path where data is read from storage is never executed. Extend the test to generate 1500 blocks so that both code paths are executed. Issue: #6242
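
A minimal Python sketch of why a 100-block run never reaches the storage path. The names `CACHE_HORIZON`, `lookup_source` and `sources_hit` are hypothetical, and the exact boundary condition is an assumption; this only models the 1024-height window described above and is not nearcore's implementation.

```python
# Assumption: a chunk is served from memory when it is fewer than
# CACHE_HORIZON heights behind the head; otherwise storage is read.
CACHE_HORIZON = 1024


def lookup_source(head_height: int, requested_height: int) -> str:
    """Return which code path would serve a chunk at requested_height."""
    if head_height - requested_height < CACHE_HORIZON:
        return "cache"
    return "storage"


def sources_hit(total_blocks: int) -> set:
    """Code paths exercised when a peer syncs heights 1..total_blocks
    from a node whose head is at total_blocks."""
    return {lookup_source(total_blocks, h) for h in range(1, total_blocks + 1)}
```

Under these assumptions, `sources_hit(100)` contains only the cache path, while `sources_hit(1500)` contains both.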

Add the ability to respond to a PartialEncodedChunkRequest from ShardChunk objects in addition to PartialEncodedChunk. In practice this is currently dead code, since there is no scenario in which the former is in storage while the latter isn’t. However, the plan is to start garbage collecting the ColPartialChunks column, at which point requests will have to be served from data in the ColChunks column. Issue: #6242
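
The fallback can be sketched roughly as follows. Everything here is hypothetical: the two dictionaries stand in for the ColPartialChunks and ColChunks columns, and `split_into_parts` is a plain byte-slicing placeholder for re-deriving parts from a full chunk, not the encoding nearcore actually uses.

```python
from typing import Dict, List, Optional


def split_into_parts(data: bytes, num_parts: int) -> List[bytes]:
    """Placeholder for re-deriving parts from a full chunk (assumption)."""
    size = -(-len(data) // num_parts)  # ceiling division
    return [data[i * size:(i + 1) * size] for i in range(num_parts)]


def prepare_response(chunk_hash: str,
                     part_ords: List[int],
                     partial_chunks: Dict[str, Dict[int, bytes]],
                     chunks: Dict[str, bytes],
                     num_parts: int) -> Optional[dict]:
    """Prefer the stored partial chunk; fall back to the full chunk."""
    parts = partial_chunks.get(chunk_hash)
    source = "partial_chunk"
    if parts is None:
        chunk = chunks.get(chunk_hash)
        if chunk is None:
            return None  # neither column knows this chunk
        parts = dict(enumerate(split_into_parts(chunk, num_parts)))
        source = "chunk"
    if any(part_ord not in parts for part_ord in part_ords):
        return None  # a requested part is unavailable
    return {"source": source,
            "parts": {part_ord: parts[part_ord] for part_ord in part_ords}}
```

Once the partial chunks column has been garbage collected, only the second branch can produce a response, which is why it has to exist before GC is enabled.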

Add the near_partial_encoded_chunk_request_processing_time metric, which reports how much time processing partial encoded chunk requests took. The metric is split by the method used to create the response and by whether a response was prepared in the end. Issue: near#6242
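
A rough sketch of a timing metric split along those two dimensions. The class, the `(method, success)` key layout and the method names are assumptions made for illustration; the actual label names of the nearcore metric are not stated in this issue.

```python
import time
from collections import defaultdict


class ProcessingTimeMetric:
    """Toy stand-in for a histogram keyed by (method, success)."""

    def __init__(self):
        self.samples = defaultdict(list)

    def observe(self, method: str, success: bool, seconds: float) -> None:
        self.samples[(method, success)].append(seconds)


def timed_request(metric: ProcessingTimeMetric, handler):
    """Run handler, which returns (method, response_or_None), and record
    how long it took under the matching (method, success) key."""
    start = time.monotonic()
    method, response = handler()
    metric.observe(method, response is not None, time.monotonic() - start)
    return response
```

A test can then assert that a given key received samples, which shows that the corresponding code path was actually executed.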

Specify max_block_production_delay in addition to the minimum delay in the node’s configuration in the block_sync_archival.py test. This speeds up block generation by the node and shortens the test’s run time. Issue: #6242

The archive_gc_partial_chunks setting was added to the test by mistake in the previous commit that changed it. The option is meant for future commits and currently causes test failures. Fix that. Issue: near#6242

Add --clean-partial-chunks and --clear-trie-changes options to clear out the two respective columns. Data in ColPartialChunks can be recomputed, and data in ColTrieChanges is only used by non-archival nodes, so it can be deleted when running an archival node. Issue: near#6119 Issue: near#6242 Issue: near#6250

Start garbage collecting ColPartialChunks and ColInvalidChunks on archival nodes. The former is quite a sizeable column and its data can be recovered from ColChunks. The latter is only needed when operating at head. Note that this is likely insufficient for the garbage collection to finish in reasonable time, since with the current default options only two heights are garbage collected at a time. It’s best to clean out the two columns. Issue: #6242
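
A back-of-the-envelope estimate of how slowly such a GC catches up. This assumes, purely for illustration, that one GC step runs per newly produced block and that the GC target advances by one height per block; neither is stated in this issue, and `blocks_until_caught_up` is a hypothetical helper.

```python
import math


def blocks_until_caught_up(backlog_heights: int, gc_per_block: int = 2) -> int:
    """Rough number of new blocks needed before the GC tail reaches its
    target, assuming the target moves forward one height per block."""
    net_progress = gc_per_block - 1  # heights gained on the target per block
    if net_progress <= 0:
        raise ValueError("GC never catches up at this rate")
    return math.ceil(backlog_heights / net_progress)
```

With two heights collected per step, the net progress is one height per block, so under these assumptions a backlog of N heights takes about N further blocks to clear. This is why wiping the columns directly is the more practical route for an existing archival node.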

When recompressing the database of an archival node, skip the ColPartialChunks, ColInvalidChunks and ColTrieChanges columns, which can be safely deleted. Data in the first can be reconstructed from ColChunks, the second is only needed at head, and the last is never read by archival nodes. Mostly for testing, in case someone wants to keep those columns, offer --keep-partial-chunks, --keep-invalid-chunks and --keep-trie-changes switches. They are always on when dealing with a non-archival node. Issue: #6119 Issue: #6242 Issue: #6250
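
The selection rule described above can be sketched as below. The function and its parameters are hypothetical and merely mirror the three switches named in the commit message; this is not the neard implementation.

```python
REMOVABLE_COLUMNS = ("ColPartialChunks", "ColInvalidChunks", "ColTrieChanges")


def columns_to_skip(is_archival: bool,
                    keep_partial_chunks: bool = False,
                    keep_invalid_chunks: bool = False,
                    keep_trie_changes: bool = False) -> set:
    """Columns that recompression would leave out of the new database."""
    if not is_archival:
        # The keep switches are always on for non-archival nodes.
        return set()
    keep = {
        "ColPartialChunks": keep_partial_chunks,
        "ColInvalidChunks": keep_invalid_chunks,
        "ColTrieChanges": keep_trie_changes,
    }
    return {column for column in REMOVABLE_COLUMNS if not keep[column]}
```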

…6356) Extract two more methods from `process_partial_encoded_chunk_request` which correspond to `if` bodies that used to be in it. This makes each function shorter and thus easier to read especially as in the future more branches will be added to the method. Furthermore, move sending of the message to the method changing `maybe_send_partial_encoded_chunk_response` into method which prepares the response only. This is a pure refactoring with no changes to the behaviour. This is commit 62aa75a upstream. Issue: near#6242

…unk (near#6377) This is commit 09041ec upstream. Add ability to respond to PartialEncodedChunkRequest from ShardChunk objects in addition to PartialEncodedChunk. In practice this is currently dead code since there is no scenario in which the former is in the storage while the latter isn’t but the plan is to start garbage collecting ColPartialChunks column at which point we’ll have to serve requests from data in ColChunks column. Issue: near#6242

…ear#6431) This is commit e92e894 upstream. Add near_partial_encoded_chunk_request_processing_time metric which returns how much time processing partial encoded chunk requests took. The metric is split by the method used to create a response and also whether in the end the response has been prepared or not. Issue: near#6242

This is commit 6be2e0e upstream. Start garbage collecting ColPartialChunks and ColInvalidChunks on archival nodes. The former is quite sizeable column and its data can be recovered from ColChunks. The latter is only needed when operating at head. Note that this is likely insufficient for the garbage collection to happen in reasonable time (since with current default options we’re garbage collecting only two heights at a time). It’s best to clean out the two columns. Issue: near#6242

This is commit da7a465 upstream. When recompressing database of an archival node, skip ColPartialChunks, ColInvalidChunks and ColTrieChanges columns which can be safely deleted. Data in the first one can be reconstructed from ColChunks, ColInvalidChunks is only needed at head and the last is never read by archival nodes. Mostly for testing, if someone wants to keep those columns, offer --keep-partial-chunks, --keep-invalid-chunks and --keep-trie-changes switches. They are always on when dealing with non-archival node. Issue: near#6119 Issue: near#6242 Issue: near#6250

Since commit 6be2e0e: ‘gc partial chunks on archival nodes (near#6439)’, archival nodes set chunk_tail without setting tail. However, store validation expects both to be either set or unset. Change the code to allow an unset tail on archival nodes. Issue: near#6242
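
The relaxed check can be pictured as the following predicate. This is a sketch only: `tails_consistent` and its arguments are hypothetical, with `None` standing for an unset value, and it does not reproduce the actual store validator.

```python
from typing import Optional


def tails_consistent(tail: Optional[int],
                     chunk_tail: Optional[int],
                     is_archival: bool) -> bool:
    """Both values must be set or both unset, except that an archival
    node may have chunk_tail set while tail is still unset."""
    if tail is None and chunk_tail is not None:
        return is_archival
    return (tail is None) == (chunk_tail is None)
```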

…ation (#6615) Partial encoded chunks can be calculated on the fly when requested, and we are now garbage collecting them on archival nodes, so there’s no point in populating the column during the 9→10 database version migration. Issue: #6242

bowenwang1996 added the A-storage (Area: storage and databases) and T-node (Team: issues relevant to the node experience team) labels Feb 4, 2022

mina86 self-assigned this Feb 8, 2022

mina86 added a commit to mina86/nearcore that referenced this issue Feb 28, 2022

core: preallocate known vector capacity in encode_transaction_receipts

8fb22ba

Issue: near#6242

mina86 mentioned this issue Feb 28, 2022

core: preallocate known vector capacity in encode_transaction_receipts #6355

Merged

mina86 mentioned this issue Feb 28, 2022

chain: refactor process_partial_encoded_chunk_request method #6356

Merged

near-bulldozer bot pushed a commit that referenced this issue Mar 1, 2022

core: preallocate known vector capacity in encode_transaction_receipts (#6355)

5f0a0bb

Issue: #6242

This was referenced Mar 2, 2022

pytest: extend get_all_heights to return hashes as well #6369

Closed

pytest: explicitly check synced blocks in block_sync_archival.py test #6370

Merged

mina86 mentioned this issue Mar 2, 2022

pytest: check reading chunks from storage in block_sync_archival.py #6376

Merged

mina86 mentioned this issue Mar 2, 2022

chain: ShardManager: add ability to serve partial chunks from ShardChunk #6377

Merged

mina86 mentioned this issue Mar 14, 2022

chain: ShardManager: add metrics for partial encoded chunk response #6431

Merged

This was referenced Mar 15, 2022

chain: gc partial chunks on archival nodes #6439

Merged

pytest: reduce block production delay in block_sync_archival.py #6440

Merged

mina86 mentioned this issue Mar 17, 2022

pytest: fix block_sync_archival.py #6449

Merged

mina86 added a commit that referenced this issue Mar 17, 2022

pytest: observe metrics in block_sync_archival.py test

74cba3d

Add code for observing the partial chunks request processing time metrics to make sure that the expected code paths are executed when handling the request. Issue: #6242

mina86 mentioned this issue Mar 17, 2022

pytest: observe metrics in block_sync_archival.py test #6452

Merged

mina86 mentioned this issue Mar 23, 2022

neard: recompress_storage: clean out unnecessary columns #6477

Merged

mina86 mentioned this issue Apr 8, 2022

chain: allow chain_tail to be set w/o tail being set on archival nodes #6563

Merged

mina86 mentioned this issue Apr 14, 2022

db: don’t populate ColPartialChunks during 9→10 database version migration #6615

Merged

mina86 closed this as completed Apr 20, 2022

gmilescu added the Node (Node team) label Oct 19, 2023
