Conversation
Please note the additional template specializations for fc::datastream in the PR Description.
If there is documentation on the file format changes introduced by this split, please link it as well. If not, please talk to @nksanthosh about a task to document the blocks.log format in light of these (and any other) changes.
libraries/chain/block_log.cpp
Outdated
auto [itr, _] = collection.emplace(log.first_block_num(),
                                   mapped_type{ log.last_block_num(), path_without_extension });
this->active_index = collection.index_of(itr);
This is only safe if the files are opened in-order, right? Otherwise the index of this entry may change as new entries are added.
Furthermore, it seems like the ordering from for_each_file_in_dir_matches is based on boost::filesystem::directory_iterator, which does not guarantee an order. From https://www.boost.org/doc/libs/1_70_0/libs/filesystem/doc/reference.html#Class-directory_iterator:
"The order of directory entries obtained by dereferencing successive increments of a directory_iterator is unspecified."
Even if it did so lexicographically, the filenames do not contain leading zeros, so you would expect blocks-101-110.log to come before blocks-21-30.log in the test cases below.
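For illustration, a minimal standalone sketch (a hypothetical helper, not the PR's code) of ordering catalog entries by the numeric first-block value parsed from a blocks-<first>-<last>.log name instead of relying on directory_iterator or lexicographic filename order:

#include <algorithm>
#include <cstdint>
#include <regex>
#include <string>
#include <vector>

struct log_span { uint32_t first_block; uint32_t last_block; std::string stem; };

// Hypothetical helper: parse "blocks-<first>-<last>.log" names and sort by first block number.
std::vector<log_span> order_by_first_block(const std::vector<std::string>& filenames) {
   static const std::regex pattern{ R"(blocks-(\d+)-(\d+)\.log)" };
   std::vector<log_span> spans;
   for (const auto& name : filenames) {
      std::smatch m;
      if (std::regex_match(name, m, pattern))
         spans.push_back({ static_cast<uint32_t>(std::stoul(m[1])),
                           static_cast<uint32_t>(std::stoul(m[2])),
                           name.substr(0, name.size() - 4) });
   }
   std::sort(spans.begin(), spans.end(),
             [](const log_span& a, const log_span& b) { return a.first_block < b.first_block; });
   return spans; // "blocks-21-30.log" now precedes "blocks-101-110.log" regardless of directory order
}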
If I'm not off base about this, we should probably extend the test cases to cover situations where lexicographically ordered filenames are not properly ordered by block number.
test case modified to cover it
The test cases include the new 3-digit names, but they do not close and re-open the block catalog and therefore wouldn't trigger the behavior I'm concerned with. Can we add a test that writes out the segmented blocks log (2- and 3-digit spans), then closes that controller and re-opens a new tester with the same block log directory, and ensures that the expected blocks can be read AND have the proper block number after they are read?
test case added in test_split_log_replay
@b1bart We cannot ensure we test this optimization directly, since we don't know the order the file system will decide to give it to us. The iterator will be invalidated by subsequent file additions if they are earlier, but then active_index will be reset and we will drop the invalid iterator without using it.
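For reference, a standalone illustration (not the block_log code itself) of how a position captured via boost::container::flat_map::index_of shifts when a key that sorts earlier is inserted afterwards; this is the situation the reset described above guards against:

#include <boost/container/flat_map.hpp>
#include <cassert>
#include <cstdint>
#include <string>

int main() {
   boost::container::flat_map<uint32_t, std::string> collection;

   auto [itr, inserted] = collection.emplace(101u, "blocks-101-110");
   auto active_index    = collection.index_of(itr); // 0: only entry so far
   assert(inserted && active_index == 0);

   collection.emplace(21u, "blocks-21-30"); // sorts before key 101

   // The entry for key 101 now lives at position 1, so the cached index is stale.
   assert(collection.index_of(collection.find(101u)) == 1);
   assert(active_index != collection.index_of(collection.find(101u)));
}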
1. fix bug for command line options type inconsistency
2. rename block-split-factor to block-log-strides
3. modify test to prove that the lexicographical order of block log filenames is irrelevant
@huangminghuang I think the documentation should indicate that retained block log files should not be manipulated by the user and that all files in the archive directory are completely in the user's control.
The fc::datastream was copied from PR 9104, which has been merged to develop.
There is no block file format change in this PR, only additional options to control how to split the log files.
I would say that this is an extension to the block log format, in that instead of a single file representing a block log, we now have daisy-chained files, with an understanding of the ordering based on the file name. This does need to be added to the documentation, which will include things like: whether the file name must match the range of blocks in the file, and that a more recent range of blocks obfuscates the same blocks from a lower range. These are assumptions about functionality; I have not looked at the code yet.
libraries/chain/block_log.cpp
Outdated
auto [itr, _] = collection.emplace(log.first_block_num(),
                                   mapped_type{ log.last_block_num(), path_without_extension });
this->active_index = collection.index_of(itr);
Will this work when starting with a block log version == 1 (starting at block 1) and then appending to it? I think it will, but in looking through the code I see that we only care about the version if we are calling reset. I am wondering if we need a check for starting at block num 1 if version == 1 and stride is set. It would also be nice to have a test for an older version, but obviously we are not set up for doing that.
if (!fc::is_directory(data_dir))
   fc::create_directories(data_dir);
else
   catalog.open(data_dir);
Are we retroactively adding multi-file support for all post-version-1 block log formats? (I haven't thought about whether that is possible; I just know that it definitely couldn't work if the format requires the first block to be block num 1.) Even if we are, wouldn't we need a check here for version > 1, or will that happen later on in the logic?
Possibly unpopular thought: discontinue v1 block log format support in nodeos. Instead provide a utility to convert v1 format to v2 format which can be run manually prior to upgrading to this version of nodeos.
Otherwise we need to continue to maintain backwards support (forever)?
Test case added to split the log from v1 to the latest version; i.e. if the system starts with a v1 block log, it will continue using the v1 log until it reaches the block number to split. After splitting, the newer block log uses the latest block log version regardless of the original block log version.
# Conflicts:
#   libraries/chain/block_log.cpp
unittests/restart_chain_tests.cpp
Outdated
}

BOOST_AUTO_TEST_CASE(test_trim_blocklog_front_v1) {
   block_log::set_version(1);
Don't need this anymore.
#9184 (comment)
I also mentioned 2 tests I would like to add in a text.
      this->active_index = this->active_index == npos ? npos : this->active_index - items_to_erase;
   }
   this->collection.emplace(start_block_num, mapped_type{end_block_num, filename_base});
}
Documentation indicates that inserting here could invalidate the active_index (unless it is a stable_vector; not sure how we know). Also, do we know that the flat_map storage is fixed size per iterator, since the filename_base is not guaranteed to be the same size? I think the safest thing here is to just set active_index = npos after an add. (Another corner case on top of this: we could have an archived file with a higher start_block_num than this start_block_num, and if nothing has changed active_index, it could still be pointing at that iterator, which would be invalid at this point.)
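A minimal self-contained sketch of that suggestion (illustrative names, not the PR's implementation), where any cached position is dropped after an insertion rather than assumed to still refer to the same element:

#include <boost/container/flat_map.hpp>
#include <cstddef>
#include <cstdint>
#include <string>

struct catalog_sketch {
   static constexpr std::size_t npos = std::size_t(-1);
   boost::container::flat_map<uint32_t, std::string> collection;
   std::size_t active_index = npos;

   void add(uint32_t start_block_num, std::string filename_base) {
      collection.emplace(start_block_num, std::move(filename_base));
      active_index = npos; // any cached position may now point at a different entry
   }
};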
libraries/chain/block_log.cpp
Outdated
if (!index_matches_data(index_path, log))
   block_log::construct_index(log_path, index_path);

auto [itr, _] = collection.emplace(log.first_block_num(),
Need to check whether the element was actually added; if it was not, log a warning and set active_index to npos, since the cached information isn't really in the collection.
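A standalone sketch of the requested check (illustrative, not the PR's code): flat_map::emplace reports whether a new element was inserted, so a duplicate first-block key can be detected and the cached index invalidated instead of being computed from an iterator to an entry that was never added.

#include <boost/container/flat_map.hpp>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

int main() {
   constexpr std::size_t npos = std::size_t(-1);
   boost::container::flat_map<uint32_t, std::string> collection;
   std::size_t active_index = npos;

   collection.emplace(21u, "blocks-21-30");

   // A second log claiming the same first block: emplace refuses and 'inserted' is false.
   auto [itr, inserted] = collection.emplace(21u, "blocks-21-40");
   if (inserted)
      active_index = collection.index_of(itr);
   else
      active_index = npos; // the real code would also log a warning here

   assert(!inserted && active_index == npos);
}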
libraries/chain/block_log.cpp
Outdated
block_log::~block_log() {}

bool detail::block_log_impl::recover_from_incomplete_block_head(block_log_data& log_data, block_log_index& index) {
@b1bart we need to discuss this; it is a change to how nodeos functions with an invalid block log.
"split the block log file when the head block number is the multiple of the split factor") | ||
("max-retained-block-files", bpo::value<uint16_t>()->default_value(config::default_max_retained_block_files), | ||
"the maximum number of blocks files to retain so that the blocks in those files can be queried.\n" | ||
"When the number is reached, the oldest block file would be move to archive dir or deleted if the archive dir is empty." ) |
Make this match the documentation in the PR.
"When the number is reached, the oldest block file would be move to archive dir or deleted if the archive dir is empty." ) | ||
("blocks-archive-dir", bpo::value<bfs::path>()->default_value(config::default_blocks_archive_dir_name), | ||
"the location of the blocks archive directory (absolute path or relative to blocks dir).\n" | ||
"If the value is empty, blocks files beyond the retained limit will be deleted.") |
Make this match the documentation in the PR.
@@ -228,6 +227,14 @@ void chain_plugin::set_program_options(options_description& cli, options_descrip
cfg.add_options()
("blocks-dir", bpo::value<bfs::path>()->default_value("blocks"),
   "the location of the blocks directory (absolute path or relative to application data dir)")
("blocks-log-stride", bpo::value<uint32_t>()->default_value(config::default_blocks_log_stride),
I think we should add here (and also in PR API documentation) something like "When the stride is breached, the current block log and index will be renamed 'blocks-<start num>-<end num>.log/index' and a new current block log and index will be created with the most recent block. All files following this format will be used to construct an extended block log."
1. add allow-block-log-auto-fix option
2. reimplement recover_from_incomplete_block_head() to work with v3 log
3. improve some chain plugin argument descriptions
libraries/chain/block_log.cpp
Outdated
@@ -185,6 +185,12 @@ namespace eosio { namespace chain {

using log_entry = std::variant<log_entry_v4, signed_block_v0>;

const block_header& get_block_header(const log_entry& entry) {
   return std::visit(overloaded{ [](const signed_block_v0& v) -> const block_header& { return v; },
                                 [](const log_entry_v4& v) -> const block_header& { return v.block; } },
Indentation here is a little weird.
@@ -8,7 +8,7 @@ namespace eosio { namespace chain {
namespace detail { class block_log_impl; }

/* The block log is an external append only log of the blocks with a header. Blocks should only
 * be written to the log after they irreverisble as the log is append only. The log is a doubly
 * be written to the log after they irreversible as the log is append only. The log is a doubly
"after they are irreversible"
("max-retained-block-files", bpo::value<uint16_t>()->default_value(config::default_max_retained_block_files), | ||
"the maximum number of blocks files to retain so that the blocks in those files can be queried.\n" | ||
"When the number is reached, the oldest block file would be move to archive dir or deleted if the archive dir is empty." ) | ||
"When the number is reached, the oldest block file would be move to archive dir or deleted if the archive dir is empty.\n" |
"would be moved to archive dir"
unittests/restart_chain_tests.cpp
Outdated
@@ -402,6 +404,7 @@ BOOST_FIXTURE_TEST_CASE(restart_from_block_log_with_incomplete_head,restart_from
   logfile.open("ab");
   const char random_data[] = "12345678901231876983271649837";
   logfile.write(random_data, sizeof(random_data));
   allow_block_log_auto_fix = true;
I'm confused here, why is this being set at the very end of the test?
I would think that we should have a test for all paths: truncate the index without setting the flag and verify the block log and index are at the block that the block log pointed to; the same setup but with the flag set, verifying the block log points to the block that the index pointed to; and with the flag set, have the block index point to an invalid position (possibly both too far and the wrong place in the file) and verify that it then matches the block log's block (and the index is fixed).
@@ -228,13 +228,20 @@ void chain_plugin::set_program_options(options_description& cli, options_descrip
("blocks-dir", bpo::value<bfs::path>()->default_value("blocks"),
   "the location of the blocks directory (absolute path or relative to application data dir)")
("blocks-log-stride", bpo::value<uint32_t>()->default_value(config::default_blocks_log_stride),
   "split the block log file when the head block number is the multiple of the split factor")
   "split the block log file when the head block number is the multiple of the split factor\n"
   "When the stride is reached, the current block log and index will be renamed 'blocks-num_begin-num_end.log/index'\n"
Can we use 'block--.log/index' to make it clearer? (here and in the PR documentation)
sorry, markup removed my text
block-<start num>-<end num>.log/index
"If the value is empty, blocks files beyond the retained limit will be deleted.\n" | ||
"All files in the archive directory are completely under user's control, i.e. they won't be accessed by nodeos anymore.") | ||
("allow-block-log-auto-fix", bpo::value<bool>()->default_value("false"), | ||
"When the existing block log is inconsistent with the index, allows fixing the block log and index files") |
I think this should add that it will take the highest indexed block if it is valid; otherwise it will repair the block log and reconstruct the index.
also, add the same to the PR description.
done
Change Description
This PR supports splitting block logs automatically based on block numbers. It also allows users to specify how many split block log files to retain. Once the limit is reached, the older block files can be moved to another directory or deleted.
This PR also supports automatically recovering when the head block is not completely written to disk, which eliminates most use cases for using eosio-blocklog to manually trim the end of the block log file.
Change Type
Select ONE
Consensus Changes
API Changes
Documentation Additions
Some new options for the chain plugin are added: