[fix](partial update) duplicate key occur when BE restart after conflict concurrent partial update #35739
Thank you for your contribution to Apache Doris. Since 2024-03-18, the Document has been moved to doris-website.
run buildall
clang-tidy made some suggestions
@@ -1114,7 +1114,7 @@ void BaseTablet::_remove_sentinel_mark_from_delete_bitmap(DeleteBitmapPtr delete
 }
 }

-Status BaseTablet::update_delete_bitmap(const BaseTabletSPtr& self, const TabletTxnInfo* txn_info,
+Status BaseTablet::update_delete_bitmap(const BaseTabletSPtr& self, TabletTxnInfo* txn_info,
warning: function 'update_delete_bitmap' has cognitive complexity of 58 (threshold 50) [readability-function-cognitive-complexity]
Status BaseTablet::update_delete_bitmap(const BaseTabletSPtr& self, TabletTxnInfo* txn_info,
^
Additional context
be/src/olap/base_tablet.cpp:1127: +1, including nesting penalty of 0, nesting level increased to 1
if (txn_info->partial_update_info && txn_info->partial_update_info->is_partial_update) {
^
be/src/olap/base_tablet.cpp:1127: +1
if (txn_info->partial_update_info && txn_info->partial_update_info->is_partial_update) {
^
be/src/olap/base_tablet.cpp:1128: nesting level increased to 2
transient_rs_writer = DORIS_TRY(self->create_transient_rowset_writer(
^
be/src/common/status.h:694: expanded from macro 'DORIS_TRY'
({ \
^
be/src/olap/base_tablet.cpp:1128: +3, including nesting penalty of 2, nesting level increased to 3
transient_rs_writer = DORIS_TRY(self->create_transient_rowset_writer(
^
be/src/common/status.h:697: expanded from macro 'DORIS_TRY'
if (!res.has_value()) [[unlikely]] { \
^
be/src/olap/base_tablet.cpp:1140: +1, including nesting penalty of 0, nesting level increased to 1
RETURN_IF_ERROR(std::dynamic_pointer_cast<BetaRowset>(rowset)->load_segments(&segments));
^
be/src/common/status.h:612: expanded from macro 'RETURN_IF_ERROR'
do { \
^
be/src/olap/base_tablet.cpp:1140: +2, including nesting penalty of 1, nesting level increased to 2
RETURN_IF_ERROR(std::dynamic_pointer_cast<BetaRowset>(rowset)->load_segments(&segments));
^
be/src/common/status.h:614: expanded from macro 'RETURN_IF_ERROR'
if (UNLIKELY(!_status_.ok())) { \
^
be/src/olap/base_tablet.cpp:1146: +1, including nesting penalty of 0, nesting level increased to 1
if (self->tablet_state() == TABLET_NOTREADY) {
^
be/src/olap/base_tablet.cpp:1151: +1, including nesting penalty of 0, nesting level increased to 1
RETURN_IF_ERROR(self->get_all_rs_id_unlocked(cur_version - 1, &cur_rowset_ids));
^
be/src/common/status.h:612: expanded from macro 'RETURN_IF_ERROR'
do { \
^
be/src/olap/base_tablet.cpp:1151: +2, including nesting penalty of 1, nesting level increased to 2
RETURN_IF_ERROR(self->get_all_rs_id_unlocked(cur_version - 1, &cur_rowset_ids));
^
be/src/common/status.h:614: expanded from macro 'RETURN_IF_ERROR'
if (UNLIKELY(!_status_.ok())) { \
^
be/src/olap/base_tablet.cpp:1170: +1, including nesting penalty of 0, nesting level increased to 1
if (segments.size() <= 1) {
^
be/src/olap/base_tablet.cpp:1171: +2, including nesting penalty of 1, nesting level increased to 2
RETURN_IF_ERROR(calc_delete_bitmap(self, rowset, segments, specified_rowsets, delete_bitmap,
^
be/src/common/status.h:612: expanded from macro 'RETURN_IF_ERROR'
do { \
^
be/src/olap/base_tablet.cpp:1171: +3, including nesting penalty of 2, nesting level increased to 3
RETURN_IF_ERROR(calc_delete_bitmap(self, rowset, segments, specified_rowsets, delete_bitmap,
^
be/src/common/status.h:614: expanded from macro 'RETURN_IF_ERROR'
if (UNLIKELY(!_status_.ok())) { \
^
be/src/olap/base_tablet.cpp:1174: +1, nesting level increased to 1
} else {
^
be/src/olap/base_tablet.cpp:1176: +2, including nesting penalty of 1, nesting level increased to 2
RETURN_IF_ERROR(calc_delete_bitmap(self, rowset, segments, specified_rowsets, delete_bitmap,
^
be/src/common/status.h:612: expanded from macro 'RETURN_IF_ERROR'
do { \
^
be/src/olap/base_tablet.cpp:1176: +3, including nesting penalty of 2, nesting level increased to 3
RETURN_IF_ERROR(calc_delete_bitmap(self, rowset, segments, specified_rowsets, delete_bitmap,
^
be/src/common/status.h:614: expanded from macro 'RETURN_IF_ERROR'
if (UNLIKELY(!_status_.ok())) { \
^
be/src/olap/base_tablet.cpp:1179: +2, including nesting penalty of 1, nesting level increased to 2
RETURN_IF_ERROR(token->wait());
^
be/src/common/status.h:612: expanded from macro 'RETURN_IF_ERROR'
do { \
^
be/src/olap/base_tablet.cpp:1179: +3, including nesting penalty of 2, nesting level increased to 3
RETURN_IF_ERROR(token->wait());
^
be/src/common/status.h:614: expanded from macro 'RETURN_IF_ERROR'
if (UNLIKELY(!_status_.ok())) { \
^
be/src/olap/base_tablet.cpp:1183: +1, including nesting penalty of 0, nesting level increased to 1
if (watch.get_elapse_time_us() < 1 * 1000 * 1000) {
^
be/src/olap/base_tablet.cpp:1185: +1, nesting level increased to 1
} else {
^
be/src/olap/base_tablet.cpp:1193: nesting level increased to 1
[](size_t sum, const segment_v2::SegmentSharedPtr& s) { return sum += s->num_rows(); });
^
be/src/olap/base_tablet.cpp:1200: +1, including nesting penalty of 0, nesting level increased to 1
if (config::enable_merge_on_write_correctness_check && rowset->num_rows() != 0) {
^
be/src/olap/base_tablet.cpp:1200: +1
if (config::enable_merge_on_write_correctness_check && rowset->num_rows() != 0) {
^
be/src/olap/base_tablet.cpp:1205: +2, including nesting penalty of 1, nesting level increased to 2
if (!st.ok()) {
^
be/src/olap/base_tablet.cpp:1210: +1, including nesting penalty of 0, nesting level increased to 1
if (transient_rs_writer) {
^
be/src/olap/base_tablet.cpp:1211: +2, including nesting penalty of 1, nesting level increased to 2
DBUG_EXECUTE_IF("Tablet.update_delete_bitmap.partial_update_write_rowset_fail", {
^
be/src/util/debug_points.h:36: expanded from macro 'DBUG_EXECUTE_IF'
if (UNLIKELY(config::enable_debug_points)) { \
^
be/src/olap/base_tablet.cpp:1211: +3, including nesting penalty of 2, nesting level increased to 3
DBUG_EXECUTE_IF("Tablet.update_delete_bitmap.partial_update_write_rowset_fail", {
^
be/src/util/debug_points.h:38: expanded from macro 'DBUG_EXECUTE_IF'
if (dp) { \
^
be/src/olap/base_tablet.cpp:1212: +4, including nesting penalty of 3, nesting level increased to 4
if (rand() % 100 < (100 * dp->param("percent", 0.5))) {
^
be/src/olap/base_tablet.cpp:1220: +2, including nesting penalty of 1, nesting level increased to 2
RETURN_IF_ERROR(transient_rs_writer->flush());
^
be/src/common/status.h:612: expanded from macro 'RETURN_IF_ERROR'
do { \
^
be/src/olap/base_tablet.cpp:1220: +3, including nesting penalty of 2, nesting level increased to 3
RETURN_IF_ERROR(transient_rs_writer->flush());
^
be/src/common/status.h:614: expanded from macro 'RETURN_IF_ERROR'
if (UNLIKELY(!_status_.ok())) { \
^
be/src/olap/base_tablet.cpp:1222: +2, including nesting penalty of 1, nesting level increased to 2
RETURN_IF_ERROR(transient_rs_writer->build(transient_rowset));
^
be/src/common/status.h:612: expanded from macro 'RETURN_IF_ERROR'
do { \
^
be/src/olap/base_tablet.cpp:1222: +3, including nesting penalty of 2, nesting level increased to 3
RETURN_IF_ERROR(transient_rs_writer->build(transient_rowset));
^
be/src/common/status.h:614: expanded from macro 'RETURN_IF_ERROR'
if (UNLIKELY(!_status_.ok())) { \
^
be/src/olap/base_tablet.cpp:1231: +1, including nesting penalty of 0, nesting level increased to 1
RETURN_IF_ERROR(self->save_delete_bitmap(txn_info, txn_id, delete_bitmap,
^
be/src/common/status.h:612: expanded from macro 'RETURN_IF_ERROR'
do { \
^
be/src/olap/base_tablet.cpp:1231: +2, including nesting penalty of 1, nesting level increased to 2
RETURN_IF_ERROR(self->save_delete_bitmap(txn_info, txn_id, delete_bitmap,
^
be/src/common/status.h:614: expanded from macro 'RETURN_IF_ERROR'
if (UNLIKELY(!_status_.ok())) { \
^
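The notes above score each top-level `RETURN_IF_ERROR` as +1 for the `do` and +2 for the nested `if`, which is why a function made of flat-looking calls still exceeds the threshold. A minimal sketch of that shape follows; `SketchStatus`, `SKETCH_RETURN_IF_ERROR`, `step`, and `run` are hypothetical names for illustration and are not the Doris definitions.

```cpp
#include <cassert>
#include <string>

// Hypothetical minimal status type; not the Doris Status class.
struct SketchStatus {
    bool is_ok;
    std::string msg;
    bool ok() const { return is_ok; }
};

// Hypothetical macro with the same do { if (...) } while (false) shape that the
// clang-tidy notes attribute to RETURN_IF_ERROR (a loop wrapping a branch).
#define SKETCH_RETURN_IF_ERROR(expr)          \
    do {                                      \
        SketchStatus sketch_status_ = (expr); \
        if (!sketch_status_.ok()) {           \
            return sketch_status_;            \
        }                                     \
    } while (false)

// A step that either succeeds or reports a fixed error message.
inline SketchStatus step(bool succeed) {
    return succeed ? SketchStatus{true, ""} : SketchStatus{false, "step failed"};
}

// Reads as two flat statements, but after macro expansion each call is a
// do-loop containing an if, so each one is counted as nested control flow.
inline SketchStatus run(bool first_ok, bool second_ok) {
    SKETCH_RETURN_IF_ERROR(step(first_ok));
    SKETCH_RETURN_IF_ERROR(step(second_ok));
    return SketchStatus{true, ""};
}
```

Calling `run(true, false)` returns the error from the second step. When the same macro is used inside an existing `if`, the notes above show the increments rising to +2 and +3 because of the extra nesting level.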
warning: function 'update_delete_bitmap' exceeds recommended size/complexity thresholds [readability-function-size]
Status BaseTablet::update_delete_bitmap(const BaseTabletSPtr& self, TabletTxnInfo* txn_info,
^
Additional context
be/src/olap/base_tablet.cpp:1116: 117 lines including whitespace and comments (threshold 80)
Status BaseTablet::update_delete_bitmap(const BaseTabletSPtr& self, TabletTxnInfo* txn_info,
^
TPC-H: Total hot run time: 40497 ms
TeamCity be ut coverage result:
TPC-H: Total hot run time: 41270 ms
TPC-DS: Total hot run time: 169926 ms
ClickBench: Total hot run time: 30.62 s
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.
…ict concurrent partial update apache#35739
…ict concurrent partial update (#35739) ## Proposed changes Issue Number: close #xxx 1. In #30366 , in order to avoid that some incomplete delete bitmap left in `txn_info->delete_bitmap` when publish failed, we make a copy of `txn_info->delete_bitmap` before we start to compute the delete bitmap 2. this copy is not updated back to `txn_info->delete_bitmap` after `rowset->rowset_meta()->merge_rowset_meta()` is successful 3. `txnManager::publish_txn()` saves the contents of `txn_info->delete_bitmap` to RocksDB after the call to `update_delete_bitmap()`, due to the issue in step 2, bitmap generated during publish is not saved to RocksDB, so if BE restarts at this point, this part of the incremental delete bitmap will be lost 4. it will result in duplicated keys on querying
…ict concurrent partial update apache#35739 (apache#35765)
Proposed changes
Issue Number: close #xxx
1. In #30366, to avoid leaving an incomplete delete bitmap in `txn_info->delete_bitmap` when publish fails, we make a copy of `txn_info->delete_bitmap` before we start to compute the delete bitmap.
2. This copy is not written back to `txn_info->delete_bitmap` after `rowset->rowset_meta()->merge_rowset_meta()` succeeds.
3. `txnManager::publish_txn()` saves the contents of `txn_info->delete_bitmap` to RocksDB after the call to `update_delete_bitmap()`. Because of the issue in step 2, the bitmap generated during publish is not saved to RocksDB, so if the BE restarts at this point, this part of the incremental delete bitmap is lost.
4. This results in duplicate keys when querying.
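The lost-update pattern described above can be sketched in isolation. This is a simplified model, not the Doris code: `DeleteBitmap`, `TxnInfo`, `update_delete_bitmap_buggy`, `update_delete_bitmap_fixed`, and `persist` are hypothetical stand-ins, and `persist` only imitates "whatever `publish_txn()` would save to RocksDB". The write-back in the fixed variant is an assumption about the fix, consistent with the `const TabletTxnInfo*` to `TabletTxnInfo*` signature change in the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <set>

// Hypothetical stand-ins for the real types; names are illustrative only.
using DeleteBitmap = std::set<uint32_t>;  // row ids marked as deleted

struct TxnInfo {
    std::shared_ptr<DeleteBitmap> delete_bitmap = std::make_shared<DeleteBitmap>();
};

// Pre-fix shape: compute on a private copy and never publish it back.
inline void update_delete_bitmap_buggy(const TxnInfo* txn_info, uint32_t conflicting_row) {
    auto working_copy = std::make_shared<DeleteBitmap>(*txn_info->delete_bitmap);
    working_copy->insert(conflicting_row);  // incremental bitmap computed during publish
    // working_copy is dropped here; txn_info->delete_bitmap is unchanged.
}

// Assumed post-fix shape: write the copy back once the computation succeeded.
// The pointer is non-const so the member can be reassigned.
inline void update_delete_bitmap_fixed(TxnInfo* txn_info, uint32_t conflicting_row) {
    auto working_copy = std::make_shared<DeleteBitmap>(*txn_info->delete_bitmap);
    working_copy->insert(conflicting_row);
    txn_info->delete_bitmap = working_copy;
}

// Stand-in for persisting to RocksDB: returns a snapshot of what would be saved.
inline DeleteBitmap persist(const TxnInfo& txn_info) {
    return *txn_info.delete_bitmap;
}
```

With the buggy variant, the snapshot returned by `persist` lacks the row added during publish, which models the bitmap entries that disappear after a BE restart. With the fixed variant the snapshot contains it. Keeping the private copy still protects `txn_info->delete_bitmap` from partial results when the computation fails midway.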