-
Notifications
You must be signed in to change notification settings - Fork 411
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ParallelInputsProcessor
should be able to parallelize additional_input_at_end
#5263
Labels
type/enhancement
The issue or PR belongs to an enhancement.
Comments
12 tasks
SeaRise
pushed a commit
to SeaRise/tiflash
that referenced
this issue
Jul 8, 2022
…tsProcessor` (pingcap#5274) close pingcap#4856, close pingcap#5263
Lloyd-Pottiger
pushed a commit
to Lloyd-Pottiger/tiflash
that referenced
this issue
Jul 12, 2022
…s in README (pingcap#5182) close pingcap#5172, ref pingcap#5178 Enhancement: add a integrated test on DDL module (pingcap#5130) ref pingcap#5129 Revert "Revise default background threads size" (pingcap#5176) close pingcap#5177 chore: remove extra dyn cast (pingcap#5186) close pingcap#5185 Add MPPReceiverSet, which includes ExchangeReceiver and CoprocessorReader (pingcap#5175) ref pingcap#5095 DDL: Use Column Name Instead of Offset to Find the common handle cluster index (pingcap#5166) close pingcap#5154 Add random failpoint in critical paths (pingcap#4876) close pingcap#4807 Segment test framework (pingcap#5150) close pingcap#5151 optimize ps v3 restore (pingcap#5163) ref pingcap#4914 Fix build failed (pingcap#5196) close pingcap#5195 feat: delta tree dispatching (pingcap#5199) close pingcap#5200 feat: introduce specialized API to write fixed length data rapidly (pingcap#5181) close pingcap#5183 Add gtest for Limit, TopN, Projection (pingcap#5187) (pingcap#5188) close pingcap#5187 add `MPPTask::handleError()` (pingcap#5202) ref pingcap#5095 Check result of starting grpc server (pingcap#5257) close pingcap#5255 feat: add optimized routines for aarch64 (pingcap#5231) close pingcap#5240 fix: aarch64-quick-fix (pingcap#5259) close pingcap#5260 Update client-c to support ipv6 (pingcap#5270) close pingcap#5247 upgrade prometheus-cpp to v1.0.1 (pingcap#5279) ref pingcap#2103, close pingcap#5278 Fix README type error (pingcap#5273) ref pingcap#5178 fix(cmake): make sure libc++ is utilized by tiflash-proxy (pingcap#5281) close pingcap#5282 fix the wrong order of execution summary for list based executors (pingcap#5242) close pingcap#5241 Schema: allow loading empty schema diff when the version grows up. (pingcap#5245) close pingcap#5244 Optimize apply speed under heavy write pressure (pingcap#4883) ref pingcap#4728 update proxy to raftstore-proxy-6.2 (pingcap#5287) ref pingcap#4982 Flush segment cache when doing the compaction (pingcap#5284) close pingcap#5179 metrics: Fix incorrect metrics for delta_merge tasks (pingcap#5061) close pingcap#5055 dep: upgrade jemalloc (pingcap#5197) close pingcap#5258 *: TiFlash pagectl/dttool use only-decryption mode (pingcap#5271) close pingcap#5122 suppresion false positive report from tsan (pingcap#5303) close pingcap#5088 Refine test framework code and tests (pingcap#5261) close pingcap#5262 feat: add logical cpu cores and memory into grafana (pingcap#5124) close pingcap#3821 Implement TimeToSec function push down (pingcap#5235) close pingcap#5116 feat: implement shiftRight function push down (pingcap#5156) close pingcap#5100 schema : make update to partition tables when 'set tiflash replica' (pingcap#5267) close pingcap#5266 Replace initializer_list with vector for planner test framework (pingcap#5307) close pingcap#5295 KVStore: decouple flush region and CompactLog with a new FFI fn_try_flush_data (pingcap#5283) ref pingcap#5170 refine error message in mpptask (pingcap#5304) ref pingcap#5095 Implement ReverseUTF8/Reverse function push down (pingcap#5233) close pingcap#5111 Optimize comparision for collation `UTF8_BIN` and `UTF8MB4_BIN` (pingcap#5299) ref pingcap#5294 feat : support set tiflash mode ddl action (pingcap#5256) ref pingcap#5252 Add non-blocking functions for MPMCQueue (pingcap#5311) close pingcap#5310 add random segment test for CI weekly (pingcap#5300) close pingcap#5301 *: tidy FunctionString.cpp (pingcap#5312) close pingcap#5313 ci: fix check-license github action (pingcap#5318) close pingcap#5317 update proxy to raftstore-proxy-6.2 (pingcap#5316) ref pingcap#4982 Change one `additional_input_at_end` to many streams in `ParallelInputsProcessor` (pingcap#5274) close pingcap#4856, close pingcap#5263 support fine grained shuffle for window function (pingcap#5048) close pingcap#5142 feat: pushdown get_format into TiFlash (pingcap#5269) close pingcap#5115 fix: format throw data truncated error (pingcap#5272) close pingcap#4891 Print content of columns for gtest (pingcap#5243) close pingcap#5203 *: also enable O3 for aarch64 (pingcap#5338) close pingcap#5342 Add debug image build target for CentOS7 (pingcap#5344) close pingcap#5343 *: mini refactor (pingcap#5326) close pingcap#4739 Refactor initialize of background pool (pingcap#5190) close pingcap#5189 delete copy/move ctor of MPMCQueue explicitly (pingcap#5328) close pingcap#5329 Introduce proxy_server and new-mock-engine-store (pingcap#5319) ref pingcap#5170 fix: incorrect uptime in grafana panel Signed-off-by: Lloyd-Pottiger <yan1579196623@gmail.com>
Lloyd-Pottiger
pushed a commit
to Lloyd-Pottiger/tiflash
that referenced
this issue
Jul 19, 2022
…tsProcessor` (pingcap#5274) close pingcap#4856, close pingcap#5263
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Enhancement
Consider the TPC-H sql 13:
Execution plan:
The
customer
table is considered as hash table and theorders
table is considered as probe table.After the probe phase is done, due to
left join
, some rows that are not used incustomer
hash table should also be outputted, which is handled byNonJoinedBlockInputStream
. Also, there are severalNonJoinedBlockInputStream
used to speed up this work.The aggregation(
count(o_orderkey)
) is handled byParallelAggregatingBlockInputStream
. It uses aParallelInputsProcessor
to parallelize the aggregation operation.Then we can describe what's the problem.
tiflash/dbms/src/DataStreams/ParallelInputsProcessor.h
Lines 243 to 267 in cbe6ab5
The
additional_input_at_end
is only one input stream and it is processed by the last thread to finish the task. It's typically used forNonJoinedBlockInputStream
because it must be processed after all probe tasks are done.tiflash/dbms/src/Flash/Coprocessor/DAGQueryBlockInterpreter.cpp
Lines 385 to 403 in cbe6ab5
Several input streams(in the above case, is
NonJoinedBlockInputStream
) are combined into only one input stream.So the data from several
NonJoinedBlockInputStream
can only be aggregated in one thread.The larger the amount of data, the slower the speed.
The text was updated successfully, but these errors were encountered: