[enhencement](segcompaction) cancel inflight segcompaction tasks faster when load finish #28901
…er when load finish

[Goal]
When building the rowset writer, avoid waiting for inflight segcompaction, to eliminate long-tail latency for loads.

[Current situation]
1. The segcompaction of a rowset is executed serially. During the build phase, we need to wait for the inflight segcompaction task to complete.
2. If the rowset writer finishes writing and starts building meta, segments that have not been compacted are no longer submitted to the segcompaction worker. We simply skip them to accelerate the build process.
3. But this is not enough. If a segcompaction task has already been submitted to the worker thread pool, we set a cancelled flag on the worker, so that the task does nothing when it executes and completes ASAP.
4. But this is still not enough. Although the method above shortens the latency of the segcompaction task itself, tasks may still be queuing in the thread pool.

[Solution]
We could enlarge the worker thread pool to avoid queuing congestion, but this is not the best solution. Segcompaction should be best-effort work and should not use too much CPU or memory. So we adopted the strategy of unbinding build from segcompaction, specifically:
1. A segcompaction task that is already performing compaction must not be interrupted; otherwise it may cause file corruption.
2. For tasks still queued, we no longer care about their results (these tasks will see that they are cancelled and will not perform any actual operations), so we simply ignore them and continue with the subsequent rowset build process.

Signed-off-by: freemandealer <freeman.zhang1992@gmail.com>
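The cancellation scheme described in the [Solution] can be sketched roughly as follows. This is a minimal illustration, not the Doris implementation: `SketchWorker`, its members, and the counter are hypothetical names, and the real worker's interface may differ. It shows the two cases: a task that was still queued when build cancelled it returns without doing anything, and `cancel()` blocks only while a compaction is actually in progress.

```cpp
#include <atomic>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a segcompaction worker (not the Doris class).
class SketchWorker {
public:
    // Runs on a pool thread. Returns true only if compaction work was done.
    bool compact_segments() {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            if (_cancelled) {
                return false; // task was still queued when build cancelled it
            }
            _running = true;
        }
        _compactions_done.fetch_add(1); // placeholder for the real, uninterruptible work
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _running = false;
        }
        _cv.notify_all();
        return true;
    }

    // Called from the build path. Returns true if it had to wait for a
    // running compaction; queued tasks are never waited for.
    bool cancel() {
        std::unique_lock<std::mutex> lock(_mutex);
        _cancelled = true;
        if (!_running) {
            return false;
        }
        _cv.wait(lock, [this] { return !_running; });
        return true;
    }

    int compactions_done() const { return _compactions_done.load(); }

private:
    std::mutex _mutex;
    std::condition_variable _cv;
    bool _cancelled = false;
    bool _running = false;
    std::atomic<int> _compactions_done{0};
};
```

With this shape, the build path calls `cancel()` and moves on; any task the pool dequeues later sees the flag and returns immediately, so queue length no longer adds to build latency.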
run buildall
clang-tidy made some suggestions
@@ -997,17 +997,22 @@ Status StorageEngine::submit_compaction_task(TabletSharedPtr tablet, CompactionT
     return _submit_compaction_task(tablet, compaction_type, force);
 }

-Status StorageEngine::_handle_seg_compaction(SegcompactionWorker* worker,
-                                             SegCompactionCandidatesSharedPtr segments) {
+Status StorageEngine::_handle_seg_compaction(std::shared_ptr<SegcompactionWorker> worker,
warning: method '_handle_seg_compaction' can be made static [readability-convert-member-functions-to-static]
-Status StorageEngine::_handle_seg_compaction(std::shared_ptr<SegcompactionWorker> worker,
+static Status StorageEngine::_handle_seg_compaction(std::shared_ptr<SegcompactionWorker> worker,
     worker->compact_segments(segments);
     // return OK here. error will be reported via BetaRowsetWriter::_segcompaction_status
     return Status::OK();
 }

-Status StorageEngine::submit_seg_compaction_task(SegcompactionWorker* worker,
+Status StorageEngine::submit_seg_compaction_task(std::shared_ptr<SegcompactionWorker> worker,
warning: method 'submit_seg_compaction_task' can be made static [readability-convert-member-functions-to-static]
-Status StorageEngine::submit_seg_compaction_task(std::shared_ptr<SegcompactionWorker> worker,
+static Status StorageEngine::submit_seg_compaction_task(std::shared_ptr<SegcompactionWorker> worker,
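The move from `SegcompactionWorker*` to `std::shared_ptr<SegcompactionWorker>` in the signatures above is presumably tied to the unbinding of build and segcompaction: if build no longer waits for queued tasks, a task may be dequeued after the writer that created the worker has gone away. That motivation is an inference, not something the PR states. The sketch below uses hypothetical `SketchWorker` and `SketchQueue` types (neither exists in Doris) to show how a queued task that holds a `shared_ptr` keeps the worker alive until the task has run.

```cpp
#include <atomic>
#include <cstddef>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical worker: only a cancel flag and a run counter.
struct SketchWorker {
    std::atomic<bool> cancelled{false};
    std::atomic<int> runs{0};

    void compact_segments() {
        if (cancelled.load()) {
            return; // cancelled while queued: do nothing
        }
        runs.fetch_add(1);
    }
};

// Hypothetical single-threaded stand-in for the worker thread pool.
class SketchQueue {
public:
    // The task stores its own shared_ptr, so the queue co-owns the worker.
    void submit(std::shared_ptr<SketchWorker> worker) {
        _tasks.emplace_back([w = std::move(worker)] { w->compact_segments(); });
    }

    // Runs every queued task, then drops the tasks (and their references).
    void run_all() {
        for (auto& task : _tasks) {
            task();
        }
        _tasks.clear();
    }

    std::size_t pending() const { return _tasks.size(); }

private:
    std::vector<std::function<void()>> _tasks;
};
```

With a raw pointer, the same sequence (submit, cancel, release the owner, then run the queued task) would dereference a destroyed worker; with shared ownership the late task runs against a live object and simply observes the cancelled flag.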
TeamCity be ut coverage result:
(From new machine) TeamCity pipeline, clickbench performance test result:
TPC-H test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
LGTM
PR approved by at least one committer and no changes requested.
PR approved by anyone and no changes requested.