[opt](hash_join) Merging all build blocks at once will cause performance issue #37471

Merged
mrhhsg merged 1 commit into apache:master from the opt_hash_join branch on Jul 10, 2024

Conversation

mrhhsg
Member

@mrhhsg mrhhsg commented Jul 8, 2024

Proposed changes

For better performance, blocks should be merged as soon as possible in the sink operator, allowing parallel execution with the upstream pipeline.
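
A minimal sketch of the idea, not the actual Doris code: `Block`, `DeferredMergeSink`, and `EagerMergeSink` are hypothetical names, and a block is modeled as a plain vector of rows. The deferred variant does all of its copying in `finish()`, after upstream has stopped producing. The eager variant does the copy inside each `sink()` call, so that work can overlap with the upstream pipeline producing the next block.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a columnar block: just a vector of row values.
using Block = std::vector<int>;

// Deferred strategy: keep every incoming block, merge them all at the end.
class DeferredMergeSink {
public:
    void sink(Block block) { _pending.push_back(std::move(block)); }

    // All copying happens here, in one step, after the last block arrived.
    Block finish() {
        std::size_t total = 0;
        for (const auto& b : _pending) total += b.size();

        Block merged;
        merged.reserve(total);
        for (const auto& b : _pending) {
            merged.insert(merged.end(), b.begin(), b.end());
        }
        assert(merged.size() == total);
        _pending.clear();
        return merged;
    }

private:
    std::vector<Block> _pending;
};

// Eager strategy: merge each block as soon as it is sunk, so the copy cost
// is spread across sink() calls instead of piling up in finish().
class EagerMergeSink {
public:
    void sink(Block block) {
        _merged.insert(_merged.end(), block.begin(), block.end());
    }

    // Nothing left to copy: the merged block is already built.
    Block finish() { return std::move(_merged); }

private:
    Block _merged;
};
```

Both sinks return the same rows in the same order; only the point at which the merge cost is paid differs.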

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@mrhhsg
Member Author

mrhhsg commented Jul 8, 2024

run buildall

Contributor

@github-actions github-actions bot left a comment

clang-tidy made some suggestions

@@ -100,30 +100,33 @@ size_t PartitionedHashJoinSinkLocalState::revocable_mem_size(RuntimeState* state
}

Status PartitionedHashJoinSinkLocalState::_revoke_unpartitioned_block(RuntimeState* state) {

warning: function '_revoke_unpartitioned_block' has cognitive complexity of 52 (threshold 50) [readability-function-cognitive-complexity]

Status PartitionedHashJoinSinkLocalState::_revoke_unpartitioned_block(RuntimeState* state) {
                                          ^
Additional context

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:108: +1, including nesting penalty of 0, nesting level increased to 1

    if (inner_sink_state_) {
    ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:113: +1, including nesting penalty of 0, nesting level increased to 1

    if (build_block.rows() <= 1) {
    ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:119: +1, including nesting penalty of 0, nesting level increased to 1

    if (build_block.columns() > num_slots) {
    ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:123: nesting level increased to 1

    auto spill_func = [build_block = std::move(build_block), state, this]() mutable {
                      ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:130: nesting level increased to 2

                      [](std::vector<uint32_t>& indices) { indices.reserve(reserved_size); });
                      ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:132: nesting level increased to 2

        auto flush_rows = [&state, this](std::unique_ptr<vectorized::MutableBlock>& partition_block,
                          ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:137: +3, including nesting penalty of 2, nesting level increased to 3

            if (!status.ok()) {
            ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:149: +2, including nesting penalty of 1, nesting level increased to 2

        while (offset < total_rows) {
        ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:153: +3, including nesting penalty of 2, nesting level increased to 3

            for (size_t i = 0; i != build_block.columns(); ++i) {
            ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:167: +3, including nesting penalty of 2, nesting level increased to 3

            for (size_t i = 0; i != sub_block.rows(); ++i) {
            ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:171: +3, including nesting penalty of 2, nesting level increased to 3

            for (uint32_t partition_idx = 0; partition_idx != p._partition_count; ++partition_idx) {
            ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:177: +4, including nesting penalty of 3, nesting level increased to 4

                if (UNLIKELY(!partition_block)) {
                ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:185: +4, including nesting penalty of 3, nesting level increased to 4

                    if (!st.ok()) {
                    ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:195: +4, including nesting penalty of 3, nesting level increased to 4

                if (partition_block->rows() >= reserved_size || is_last_block) {
                ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:195: +1

                if (partition_block->rows() >= reserved_size || is_last_block) {
                                                             ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:196: +5, including nesting penalty of 4, nesting level increased to 5

                    if (!flush_rows(partition_block, spilling_stream)) {
                    ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:208: nesting level increased to 1

    auto exception_catch_func = [spill_func, this]() mutable {
                                ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:209: nesting level increased to 2

        auto status = [&]() {
                      ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:210: +3, including nesting penalty of 2, nesting level increased to 3

            RETURN_IF_CATCH_EXCEPTION(spill_func());
            ^

be/src/common/exception.h:89: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

    do {                                                                                         \
    ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:210: +4, including nesting penalty of 3, nesting level increased to 4

            RETURN_IF_CATCH_EXCEPTION(spill_func());
            ^

be/src/common/exception.h:94: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

        } catch (const doris::Exception& e) {                                                    \
          ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:210: +5, including nesting penalty of 4, nesting level increased to 5

            RETURN_IF_CATCH_EXCEPTION(spill_func());
            ^

be/src/common/exception.h:95: expanded from macro 'RETURN_IF_CATCH_EXCEPTION'

            if (e.code() == doris::ErrorCode::MEM_ALLOC_FAILED) {                                \
            ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:214: +2, including nesting penalty of 1, nesting level increased to 2

        if (!status.ok()) {
        ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:228: +1, including nesting penalty of 0, nesting level increased to 1

    DBUG_EXECUTE_IF(
    ^

be/src/util/debug_points.h:36: expanded from macro 'DBUG_EXECUTE_IF'

    if (UNLIKELY(config::enable_debug_points)) {                              \
    ^

be/src/pipeline/exec/partitioned_hash_join_sink_operator.cpp:228: +2, including nesting penalty of 1, nesting level increased to 2

    DBUG_EXECUTE_IF(
    ^

be/src/util/debug_points.h:38: expanded from macro 'DBUG_EXECUTE_IF'

        if (dp) {                                                             \
        ^
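
As a cross-check on the warning, the sketch below sums the `+N` increments from the notes above, in the order they appear, and reproduces the reported score of 52. The array and function names are hypothetical. The per-entry comments only restate the source lines named in the notes.

```cpp
#include <array>
#include <cassert>
#include <numeric>

// Increments copied, in order, from the clang-tidy notes above.
// Notes that only say "nesting level increased" contribute nothing.
constexpr std::array<int, 19> kIncrements = {
    1, 1, 1,  // top-level `if`s at lines 108, 113, 119
    3,        // `if (!status.ok())` at line 137
    2,        // `while (offset < total_rows)` at line 149
    3, 3, 3,  // `for` loops at lines 153, 167, 171
    4, 4, 4,  // `if`s at lines 177, 185, 195
    1,        // the `||` operator at line 195
    5,        // `if (!flush_rows(...))` at line 196
    3, 4, 5,  // RETURN_IF_CATCH_EXCEPTION expansion at line 210
    2,        // `if (!status.ok())` at line 214
    1, 2,     // DBUG_EXECUTE_IF expansion at line 228
};

int total_cognitive_complexity() {
    return std::accumulate(kIncrements.begin(), kIncrements.end(), 0);
}

// The total should match the figure in the warning (threshold is 50).
void check_matches_report() {
    assert(total_cognitive_complexity() == 52);
}
```

Most of the score comes from nesting penalties inside the lambdas and macro expansions rather than from the number of branches alone.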

@doris-robot

TPC-H: Total hot run time: 40248 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 65f88c41bcf4a90b900dec5ad6af4e3491ed5201, data reload: false

------ Round 1 ----------------------------------
q1	17807	4410	4326	4326
q2	2028	192	189	189
q3	10507	1294	1170	1170
q4	10213	850	774	774
q5	7522	2724	2717	2717
q6	227	138	136	136
q7	974	601	601	601
q8	9220	2092	2113	2092
q9	8671	6604	6628	6604
q10	8665	3771	3780	3771
q11	452	241	240	240
q12	399	242	239	239
q13	17767	3013	2975	2975
q14	277	240	241	240
q15	552	495	503	495
q16	505	384	378	378
q17	973	697	687	687
q18	8110	7608	7487	7487
q19	5331	1486	1358	1358
q20	687	316	332	316
q21	4975	3117	3222	3117
q22	397	344	336	336
Total cold run time: 116259 ms
Total hot run time: 40248 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4362	4263	4276	4263
q2	383	271	268	268
q3	3053	2827	2755	2755
q4	1876	1634	1646	1634
q5	5296	5322	5321	5321
q6	226	130	131	130
q7	2131	1702	1744	1702
q8	3210	3397	3309	3309
q9	8430	8375	8312	8312
q10	3918	3713	3675	3675
q11	579	499	470	470
q12	835	603	634	603
q13	17487	2966	2965	2965
q14	299	270	271	270
q15	509	467	481	467
q16	463	441	427	427
q17	1769	1468	1499	1468
q18	7696	7603	7459	7459
q19	1736	1624	1653	1624
q20	1994	1772	1764	1764
q21	4847	4816	4731	4731
q22	620	557	533	533
Total cold run time: 71719 ms
Total hot run time: 54150 ms
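
The table columns are unlabeled. From the arithmetic they appear to be: cold run, first hot run, second hot run, and the smaller of the two hot runs. The sketch below checks that reading against Round 1 of this comment: the sum of the first column gives the cold total, and the sum of the per-query minimum hot time gives the hot total. `QueryTiming` and the function names are hypothetical.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// One row of the Round 1 table: cold run and two hot runs, in milliseconds.
struct QueryTiming {
    int cold_ms;
    int hot1_ms;
    int hot2_ms;
};

// q1..q22 from Round 1 of the TPC-H result on commit 65f88c41.
constexpr std::array<QueryTiming, 22> kRound1 = {{
    {17807, 4410, 4326}, {2028, 192, 189},    {10507, 1294, 1170},
    {10213, 850, 774},   {7522, 2724, 2717},  {227, 138, 136},
    {974, 601, 601},     {9220, 2092, 2113},  {8671, 6604, 6628},
    {8665, 3771, 3780},  {452, 241, 240},     {399, 242, 239},
    {17767, 3013, 2975}, {277, 240, 241},     {552, 495, 503},
    {505, 384, 378},     {973, 697, 687},     {8110, 7608, 7487},
    {5331, 1486, 1358},  {687, 316, 332},     {4975, 3117, 3222},
    {397, 344, 336},
}};

// Sum of the cold-run column.
int total_cold_ms() {
    int total = 0;
    for (const auto& q : kRound1) total += q.cold_ms;
    return total;
}

// Sum of the better (smaller) of the two hot runs for each query.
int total_hot_ms() {
    int total = 0;
    for (const auto& q : kRound1) total += std::min(q.hot1_ms, q.hot2_ms);
    return total;
}

// Both totals should match the "Total ... run time" lines for Round 1.
void check_against_reported_totals() {
    assert(total_cold_ms() == 116259);
    assert(total_hot_ms() == 40248);
}
```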

@doris-robot

TPC-DS: Total hot run time: 174236 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 65f88c41bcf4a90b900dec5ad6af4e3491ed5201, data reload: false

query1	904	379	372	372
query2	6448	2432	2199	2199
query3	6647	204	214	204
query4	27394	17446	17346	17346
query5	4191	489	480	480
query6	258	170	176	170
query7	4606	315	301	301
query8	330	311	303	303
query9	8436	2399	2409	2399
query10	628	325	283	283
query11	10937	10199	10258	10199
query12	135	86	82	82
query13	1657	377	377	377
query14	10256	7007	7508	7007
query15	239	188	192	188
query16	7788	339	314	314
query17	1788	548	549	548
query18	1895	288	285	285
query19	198	153	153	153
query20	90	82	87	82
query21	223	132	127	127
query22	4721	4396	4441	4396
query23	33790	33044	33158	33044
query24	10665	2942	2854	2854
query25	617	391	387	387
query26	707	164	166	164
query27	2217	324	324	324
query28	5670	2086	2091	2086
query29	876	642	630	630
query30	290	152	151	151
query31	982	744	734	734
query32	99	56	59	56
query33	680	307	304	304
query34	869	480	506	480
query35	718	653	634	634
query36	1093	943	940	940
query37	134	75	76	75
query38	2898	2776	2761	2761
query39	865	803	811	803
query40	212	131	128	128
query41	60	54	53	53
query42	126	99	105	99
query43	620	521	559	521
query44	1066	754	739	739
query45	196	169	173	169
query46	1086	723	749	723
query47	1897	1779	1839	1779
query48	388	303	311	303
query49	1055	432	440	432
query50	774	400	405	400
query51	6937	6772	6837	6772
query52	108	93	94	93
query53	365	289	293	289
query54	947	471	460	460
query55	76	75	77	75
query56	318	293	288	288
query57	1159	1087	1105	1087
query58	271	361	265	265
query59	3373	3128	3019	3019
query60	303	268	286	268
query61	96	96	107	96
query62	639	441	437	437
query63	316	286	298	286
query64	9549	2195	7035	2195
query65	3204	3110	3081	3081
query66	831	339	339	339
query67	15722	15150	15049	15049
query68	9093	560	567	560
query69	822	463	366	366
query70	1440	1118	1140	1118
query71	556	288	293	288
query72	9205	5681	5997	5681
query73	1793	328	327	327
query74	6154	5538	5536	5536
query75	6485	2664	2733	2664
query76	5745	938	931	931
query77	759	301	301	301
query78	9832	9035	9070	9035
query79	11485	536	518	518
query80	1231	477	485	477
query81	581	232	223	223
query82	262	102	105	102
query83	348	169	165	165
query84	274	84	86	84
query85	1090	304	299	299
query86	361	276	296	276
query87	3442	3144	3108	3108
query88	4907	2466	2445	2445
query89	517	372	380	372
query90	2094	183	185	183
query91	143	104	103	103
query92	68	48	49	48
query93	5710	513	514	513
query94	1497	216	211	211
query95	407	306	308	306
query96	616	281	268	268
query97	3168	3010	3004	3004
query98	217	197	199	197
query99	1082	840	846	840
Total cold run time: 306017 ms
Total hot run time: 174236 ms

@doris-robot

ClickBench: Total hot run time: 30.31 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 65f88c41bcf4a90b900dec5ad6af4e3491ed5201, data reload: false

query1	0.04	0.03	0.04
query2	0.08	0.03	0.04
query3	0.23	0.05	0.06
query4	1.67	0.09	0.09
query5	0.51	0.49	0.49
query6	1.14	0.73	0.73
query7	0.02	0.01	0.01
query8	0.05	0.04	0.05
query9	0.55	0.47	0.49
query10	0.54	0.54	0.54
query11	0.15	0.11	0.12
query12	0.15	0.12	0.12
query13	0.59	0.59	0.59
query14	0.77	0.77	0.80
query15	0.85	0.81	0.81
query16	0.39	0.36	0.34
query17	1.03	0.98	1.00
query18	0.21	0.25	0.25
query19	1.72	1.71	1.71
query20	0.02	0.01	0.01
query21	15.39	0.77	0.65
query22	3.84	7.91	1.64
query23	18.34	1.41	1.29
query24	2.15	0.22	0.23
query25	0.16	0.09	0.08
query26	0.28	0.21	0.21
query27	0.46	0.23	0.23
query28	13.31	1.03	0.99
query29	12.63	3.28	3.24
query30	0.25	0.06	0.05
query31	2.85	0.39	0.38
query32	3.32	0.47	0.47
query33	2.90	2.89	2.95
query34	17.18	4.35	4.37
query35	4.42	4.42	4.45
query36	0.66	0.49	0.51
query37	0.18	0.15	0.14
query38	0.14	0.13	0.15
query39	0.04	0.03	0.04
query40	0.15	0.13	0.12
query41	0.09	0.05	0.04
query42	0.06	0.04	0.05
query43	0.04	0.04	0.04
Total cold run time: 109.55 s
Total hot run time: 30.31 s

@mrhhsg
Member Author

mrhhsg commented Jul 9, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 39759 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 34af1619476c952e9e29c256726cfb9ca4cffd96, data reload: false

------ Round 1 ----------------------------------
q1	17639	4275	4288	4275
q2	2013	202	201	201
q3	10433	1150	1152	1150
q4	10213	800	841	800
q5	7525	2691	2648	2648
q6	219	141	141	141
q7	959	598	616	598
q8	9219	2076	2075	2075
q9	8707	6514	6575	6514
q10	8808	3767	3737	3737
q11	476	244	228	228
q12	422	218	219	218
q13	17802	2928	2987	2928
q14	281	225	238	225
q15	525	472	498	472
q16	510	379	369	369
q17	981	645	678	645
q18	8029	7401	7325	7325
q19	8607	1401	1413	1401
q20	679	308	313	308
q21	4811	3217	3165	3165
q22	375	339	336	336
Total cold run time: 119233 ms
Total hot run time: 39759 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4442	4213	4199	4199
q2	388	279	265	265
q3	3008	2954	2913	2913
q4	2000	1649	1737	1649
q5	5653	5518	5447	5447
q6	225	139	132	132
q7	2166	1873	1818	1818
q8	3269	3404	3441	3404
q9	8794	8770	8899	8770
q10	4109	3832	3730	3730
q11	592	483	500	483
q12	825	639	643	639
q13	17140	3189	3182	3182
q14	319	286	281	281
q15	548	476	491	476
q16	486	426	436	426
q17	1833	1540	1535	1535
q18	8176	7930	7841	7841
q19	1871	1552	1611	1552
q20	3064	1905	1866	1866
q21	7454	4921	4738	4738
q22	651	559	526	526
Total cold run time: 77013 ms
Total hot run time: 55872 ms

@doris-robot

TPC-DS: Total hot run time: 173918 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 34af1619476c952e9e29c256726cfb9ca4cffd96, data reload: false

query1	914	379	357	357
query2	6455	2397	2345	2345
query3	6631	214	216	214
query4	28150	17525	17261	17261
query5	3709	503	495	495
query6	268	179	168	168
query7	4599	271	283	271
query8	314	299	298	298
query9	8641	2409	2402	2402
query10	573	313	284	284
query11	11073	9972	10019	9972
query12	125	91	80	80
query13	1652	362	363	362
query14	9940	7505	7572	7505
query15	239	184	181	181
query16	7407	337	314	314
query17	1344	561	520	520
query18	1948	267	265	265
query19	201	145	143	143
query20	83	82	77	77
query21	204	136	132	132
query22	4317	4060	3991	3991
query23	33832	33957	33578	33578
query24	11159	2911	2874	2874
query25	594	408	386	386
query26	709	145	146	145
query27	2335	264	307	264
query28	6083	2100	2093	2093
query29	901	632	632	632
query30	256	161	152	152
query31	975	792	762	762
query32	97	56	54	54
query33	643	302	300	300
query34	937	479	509	479
query35	702	603	621	603
query36	1151	984	967	967
query37	151	86	82	82
query38	2852	2741	2798	2741
query39	869	804	797	797
query40	201	119	115	115
query41	53	52	51	51
query42	125	98	99	98
query43	589	537	578	537
query44	1125	736	736	736
query45	199	168	163	163
query46	1070	693	709	693
query47	1822	1755	1767	1755
query48	357	291	285	285
query49	817	406	426	406
query50	770	382	397	382
query51	6915	6804	6685	6685
query52	98	96	92	92
query53	363	287	286	286
query54	850	446	430	430
query55	74	71	74	71
query56	286	268	270	268
query57	1137	1052	1031	1031
query58	238	240	284	240
query59	3490	3279	3207	3207
query60	309	275	290	275
query61	120	117	115	115
query62	791	651	652	651
query63	324	289	291	289
query64	9206	2185	1640	1640
query65	3151	3123	3097	3097
query66	799	326	331	326
query67	15383	14915	14871	14871
query68	4534	522	522	522
query69	589	462	359	359
query70	1100	1153	1125	1125
query71	396	288	276	276
query72	7169	5610	5712	5610
query73	744	316	320	316
query74	5920	5454	5453	5453
query75	3384	2686	2669	2669
query76	2324	957	915	915
query77	636	296	305	296
query78	10393	9524	8976	8976
query79	2270	507	502	502
query80	2069	544	477	477
query81	598	219	216	216
query82	745	133	128	128
query83	285	165	169	165
query84	247	88	85	85
query85	1496	330	311	311
query86	473	319	308	308
query87	3308	3101	3077	3077
query88	4103	2343	2334	2334
query89	472	392	389	389
query90	1764	185	188	185
query91	132	104	103	103
query92	65	49	50	49
query93	2108	492	495	492
query94	1106	210	219	210
query95	408	312	309	309
query96	596	263	266	263
query97	3203	3054	3016	3016
query98	218	201	201	201
query99	1734	1240	1303	1240
Total cold run time: 278284 ms
Total hot run time: 173918 ms

@doris-robot

ClickBench: Total hot run time: 29.82 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 34af1619476c952e9e29c256726cfb9ca4cffd96, data reload: false

query1	0.04	0.03	0.03
query2	0.08	0.04	0.04
query3	0.22	0.06	0.05
query4	1.67	0.08	0.08
query5	0.50	0.49	0.50
query6	1.13	0.74	0.74
query7	0.02	0.01	0.01
query8	0.06	0.04	0.04
query9	0.55	0.50	0.49
query10	0.55	0.55	0.54
query11	0.15	0.12	0.11
query12	0.14	0.12	0.12
query13	0.59	0.58	0.58
query14	0.77	0.77	0.78
query15	0.83	0.80	0.81
query16	0.37	0.37	0.36
query17	0.96	0.94	0.99
query18	0.23	0.21	0.21
query19	1.76	1.73	1.73
query20	0.01	0.01	0.01
query21	15.42	0.76	0.66
query22	4.61	7.77	1.20
query23	18.24	1.34	1.27
query24	2.13	0.21	0.22
query25	0.16	0.09	0.09
query26	0.29	0.21	0.21
query27	0.46	0.23	0.24
query28	13.22	1.02	1.00
query29	12.58	3.30	3.28
query30	0.25	0.06	0.06
query31	2.86	0.39	0.40
query32	3.27	0.48	0.46
query33	2.85	2.90	2.87
query34	16.97	4.34	4.37
query35	4.42	4.44	4.38
query36	0.65	0.45	0.46
query37	0.19	0.15	0.15
query38	0.15	0.15	0.15
query39	0.04	0.03	0.04
query40	0.15	0.13	0.12
query41	0.10	0.05	0.05
query42	0.06	0.04	0.05
query43	0.04	0.04	0.04
Total cold run time: 109.74 s
Total hot run time: 29.82 s

@mrhhsg
Member Author

mrhhsg commented Jul 10, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 39702 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 34af1619476c952e9e29c256726cfb9ca4cffd96, data reload: false

------ Round 1 ----------------------------------
q1	17615	4345	4241	4241
q2	2011	194	187	187
q3	10445	1173	1073	1073
q4	10190	822	847	822
q5	7542	2693	2660	2660
q6	217	136	137	136
q7	949	593	605	593
q8	9215	2054	2066	2054
q9	9215	6517	6514	6514
q10	8735	3753	3747	3747
q11	443	233	235	233
q12	430	231	226	226
q13	19046	2946	2967	2946
q14	277	230	244	230
q15	522	483	495	483
q16	489	403	373	373
q17	972	655	615	615
q18	8192	7598	7344	7344
q19	6565	1443	1516	1443
q20	673	314	325	314
q21	4939	3133	3172	3133
q22	384	346	335	335
Total cold run time: 119066 ms
Total hot run time: 39702 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4362	4225	4224	4224
q2	373	277	266	266
q3	2996	2905	2933	2905
q4	1974	1638	1707	1638
q5	5590	5514	5465	5465
q6	218	132	131	131
q7	2210	1848	1829	1829
q8	3284	3376	3419	3376
q9	8742	8712	8914	8712
q10	4024	3914	3744	3744
q11	586	497	526	497
q12	849	645	624	624
q13	16064	3160	3190	3160
q14	336	276	289	276
q15	527	484	493	484
q16	482	426	454	426
q17	1812	1563	1503	1503
q18	8105	8060	7871	7871
q19	1861	1561	1551	1551
q20	2197	1862	1882	1862
q21	7266	4844	4804	4804
q22	657	555	537	537
Total cold run time: 74515 ms
Total hot run time: 55885 ms

@doris-robot

TPC-DS: Total hot run time: 173814 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 34af1619476c952e9e29c256726cfb9ca4cffd96, data reload: false

query1	907	367	364	364
query2	6940	2403	2526	2403
query3	6651	206	215	206
query4	25985	17494	17287	17287
query5	3672	500	484	484
query6	296	165	177	165
query7	4572	296	285	285
query8	307	285	299	285
query9	8542	2384	2374	2374
query10	438	277	261	261
query11	10676	10218	10010	10010
query12	119	82	83	82
query13	1631	373	373	373
query14	10066	7621	7008	7008
query15	248	183	180	180
query16	7410	326	311	311
query17	1346	543	541	541
query18	1859	265	265	265
query19	201	150	152	150
query20	87	89	82	82
query21	209	134	124	124
query22	4343	4015	4000	4000
query23	34125	33659	33560	33560
query24	11358	2983	2866	2866
query25	601	421	382	382
query26	1066	148	146	146
query27	2341	279	280	279
query28	6736	2121	2138	2121
query29	894	634	642	634
query30	251	152	153	152
query31	1012	751	753	751
query32	97	56	52	52
query33	748	301	285	285
query34	1054	498	505	498
query35	698	636	611	611
query36	1137	1002	977	977
query37	141	95	87	87
query38	2962	2892	2840	2840
query39	898	866	861	861
query40	229	129	126	126
query41	55	53	57	53
query42	124	109	137	109
query43	581	548	552	548
query44	1217	731	731	731
query45	193	165	162	162
query46	1087	744	745	744
query47	1844	1762	1748	1748
query48	370	294	298	294
query49	849	409	411	409
query50	783	401	398	398
query51	6944	6827	6678	6678
query52	107	103	95	95
query53	365	296	302	296
query54	892	454	443	443
query55	76	73	73	73
query56	289	265	268	265
query57	1145	1035	1057	1035
query58	240	256	247	247
query59	3463	3239	3211	3211
query60	335	272	276	272
query61	94	108	96	96
query62	794	641	666	641
query63	322	295	291	291
query64	9171	2211	1686	1686
query65	3176	3099	3139	3099
query66	779	366	335	335
query67	15307	14886	15016	14886
query68	5398	539	544	539
query69	631	463	350	350
query70	1088	1107	1102	1102
query71	438	278	278	278
query72	7017	5389	5749	5389
query73	772	328	326	326
query74	5958	5718	5495	5495
query75	3448	2687	2681	2681
query76	3138	917	947	917
query77	616	296	308	296
query78	9603	8919	10248	8919
query79	1317	515	508	508
query80	1610	481	466	466
query81	601	226	221	221
query82	345	136	135	135
query83	261	182	166	166
query84	257	85	87	85
query85	1061	309	305	305
query86	438	322	325	322
query87	3320	3125	3063	3063
query88	3417	2518	2441	2441
query89	482	389	390	389
query90	1603	188	189	188
query91	131	100	102	100
query92	59	49	48	48
query93	1267	516	505	505
query94	1080	211	211	211
query95	406	309	310	309
query96	581	267	271	267
query97	3242	3029	3035	3029
query98	218	194	191	191
query99	1637	1260	1276	1260
Total cold run time: 274895 ms
Total hot run time: 173814 ms

@doris-robot

ClickBench: Total hot run time: 30.66 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 34af1619476c952e9e29c256726cfb9ca4cffd96, data reload: false

query1	0.04	0.03	0.03
query2	0.08	0.04	0.04
query3	0.22	0.04	0.04
query4	1.69	0.07	0.07
query5	0.48	0.48	0.48
query6	1.14	0.72	0.72
query7	0.02	0.01	0.02
query8	0.05	0.04	0.04
query9	0.56	0.49	0.48
query10	0.55	0.54	0.54
query11	0.16	0.11	0.11
query12	0.14	0.12	0.13
query13	0.61	0.59	0.58
query14	0.76	0.78	0.78
query15	0.84	0.81	0.82
query16	0.37	0.37	0.36
query17	0.99	1.00	0.96
query18	0.23	0.22	0.22
query19	1.77	1.70	1.70
query20	0.02	0.01	0.01
query21	15.39	0.75	0.68
query22	4.86	6.30	1.86
query23	18.29	1.35	1.26
query24	2.09	0.24	0.21
query25	0.15	0.08	0.09
query26	0.30	0.21	0.21
query27	0.45	0.22	0.23
query28	13.27	1.02	1.00
query29	12.59	3.38	3.38
query30	0.26	0.06	0.05
query31	2.87	0.40	0.40
query32	3.25	0.48	0.47
query33	2.87	2.95	2.92
query34	17.06	4.34	4.35
query35	4.41	4.43	4.43
query36	0.64	0.47	0.47
query37	0.18	0.16	0.15
query38	0.15	0.15	0.15
query39	0.04	0.03	0.04
query40	0.16	0.12	0.12
query41	0.10	0.05	0.05
query42	0.05	0.05	0.04
query43	0.04	0.04	0.04
Total cold run time: 110.19 s
Total hot run time: 30.66 s

@mrhhsg
Member Author

mrhhsg commented Jul 10, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 40510 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 34af1619476c952e9e29c256726cfb9ca4cffd96, data reload: false

------ Round 1 ----------------------------------
q1	18157	4517	4441	4441
q2	2615	207	201	201
q3	11920	1321	1077	1077
q4	10411	832	811	811
q5	7653	2723	2726	2723
q6	226	143	142	142
q7	1044	627	625	625
q8	9509	2099	2089	2089
q9	8692	6598	6553	6553
q10	8678	3770	3773	3770
q11	470	239	242	239
q12	408	233	230	230
q13	17752	3002	2987	2987
q14	277	254	250	250
q15	520	487	496	487
q16	505	380	384	380
q17	960	694	665	665
q18	8175	7503	7510	7503
q19	2312	1527	1474	1474
q20	704	328	340	328
q21	5021	3204	3359	3204
q22	399	331	342	331
Total cold run time: 116408 ms
Total hot run time: 40510 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4362	4321	4266	4266
q2	381	280	271	271
q3	3205	2794	2819	2794
q4	1893	1608	1572	1572
q5	5311	5335	5319	5319
q6	225	133	134	133
q7	2128	1706	1744	1706
q8	3201	3353	3337	3337
q9	8415	8373	8361	8361
q10	3873	3748	3708	3708
q11	607	502	505	502
q12	781	643	613	613
q13	16362	2958	2985	2958
q14	296	265	268	265
q15	526	484	475	475
q16	471	427	418	418
q17	1775	1511	1490	1490
q18	7542	7693	7389	7389
q19	1772	1686	1628	1628
q20	1973	1774	1818	1774
q21	4809	4581	4734	4581
q22	653	550	546	546
Total cold run time: 70561 ms
Total hot run time: 54106 ms
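
The three TPC-H runs on commit 34af1619 report Round 1 hot totals of 39759 ms, 39702 ms, and 40510 ms. The sketch below computes the relative spread between the fastest and slowest of them as a rough indication of run-to-run noise on this machine. It is only an illustrative calculation with hypothetical names and does not measure the effect of this change.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Round 1 "Total hot run time" (ms) from the three runs on commit 34af1619.
constexpr std::array<int, 3> kHotTotalsMs = {39759, 39702, 40510};

// Spread between the slowest and fastest run, as a percentage of the fastest.
double relative_spread_percent() {
    const auto mm = std::minmax_element(kHotTotalsMs.begin(), kHotTotalsMs.end());
    const int fastest = *mm.first;
    const int slowest = *mm.second;
    return 100.0 * (slowest - fastest) / fastest;
}

// The spread should land close to two percent for these three totals.
void check_spread_is_about_two_percent() {
    const double spread = relative_spread_percent();
    assert(spread > 2.0 && spread < 2.1);
}
```

A spread of about 2% across runs of the same commit suggests that differences of similar size between commits should be read with caution.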

@doris-robot

TPC-DS: Total hot run time: 174220 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 34af1619476c952e9e29c256726cfb9ca4cffd96, data reload: false

query1	916	371	366	366
query2	6459	2417	2346	2346
query3	6679	207	221	207
query4	28543	17355	17313	17313
query5	4191	539	492	492
query6	270	176	197	176
query7	4588	304	290	290
query8	338	305	293	293
query9	8452	2392	2373	2373
query10	446	296	280	280
query11	12825	9991	9917	9917
query12	138	87	86	86
query13	1638	385	370	370
query14	10268	7622	7714	7622
query15	245	192	188	188
query16	7822	315	311	311
query17	1552	572	531	531
query18	1990	279	276	276
query19	198	149	154	149
query20	91	83	90	83
query21	213	133	132	132
query22	4465	4134	4045	4045
query23	33711	33183	33241	33183
query24	12121	2903	2845	2845
query25	670	370	373	370
query26	1790	154	160	154
query27	2974	274	275	274
query28	7584	2061	2063	2061
query29	1128	637	628	628
query30	298	153	176	153
query31	957	732	743	732
query32	88	56	62	56
query33	773	307	305	305
query34	947	496	494	494
query35	685	590	589	589
query36	1122	954	946	946
query37	211	89	82	82
query38	2964	2751	2736	2736
query39	843	802	807	802
query40	289	124	124	124
query41	53	51	55	51
query42	120	101	104	101
query43	600	559	574	559
query44	1266	761	744	744
query45	202	168	173	168
query46	1102	727	728	727
query47	1819	1742	1798	1742
query48	405	297	301	297
query49	1190	419	424	419
query50	796	416	411	411
query51	6962	6866	6717	6717
query52	106	96	96	96
query53	365	296	301	296
query54	1039	467	459	459
query55	75	77	77	77
query56	296	274	275	274
query57	1127	1046	1047	1046
query58	269	271	266	266
query59	3440	3267	3166	3166
query60	310	319	293	293
query61	99	94	97	94
query62	832	651	640	640
query63	327	298	300	298
query64	10463	2301	1682	1682
query65	3445	3137	3128	3128
query66	1391	356	349	349
query67	15532	14829	14909	14829
query68	4618	553	556	553
query69	480	334	332	332
query70	1182	1069	1156	1069
query71	408	290	289	289
query72	7605	6131	5618	5618
query73	765	332	328	328
query74	5888	5497	5450	5450
query75	3589	2716	2706	2706
query76	2755	969	940	940
query77	493	320	327	320
query78	10133	8953	8909	8909
query79	3930	536	533	533
query80	1845	499	503	499
query81	581	226	228	226
query82	1070	145	136	136
query83	328	175	181	175
query84	281	94	94	94
query85	1948	388	303	303
query86	473	322	316	316
query87	3254	3113	3181	3113
query88	4484	2463	2464	2463
query89	497	389	386	386
query90	1852	194	191	191
query91	133	104	104	104
query92	67	51	53	51
query93	5174	525	525	525
query94	1074	217	215	215
query95	414	320	317	317
query96	621	269	272	269
query97	3204	3026	3073	3026
query98	219	206	196	196
query99	1685	1264	1259	1259
Total cold run time: 296163 ms
Total hot run time: 174220 ms

@doris-robot

ClickBench: Total hot run time: 30.4 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 34af1619476c952e9e29c256726cfb9ca4cffd96, data reload: false

query1	0.04	0.04	0.04
query2	0.08	0.04	0.04
query3	0.23	0.06	0.05
query4	1.68	0.08	0.08
query5	0.49	0.49	0.48
query6	1.14	0.73	0.73
query7	0.02	0.02	0.02
query8	0.06	0.04	0.04
query9	0.56	0.51	0.49
query10	0.56	0.54	0.55
query11	0.15	0.11	0.12
query12	0.14	0.12	0.12
query13	0.60	0.58	0.60
query14	0.79	0.80	0.77
query15	0.87	0.84	0.83
query16	0.34	0.36	0.36
query17	1.01	1.02	1.05
query18	0.24	0.23	0.22
query19	1.92	1.88	1.74
query20	0.01	0.01	0.01
query21	15.40	0.80	0.67
query22	4.18	7.55	1.68
query23	18.32	1.33	1.21
query24	2.10	0.22	0.22
query25	0.14	0.08	0.09
query26	0.32	0.20	0.21
query27	0.46	0.24	0.23
query28	13.29	1.02	1.00
query29	12.62	3.28	3.29
query30	0.25	0.06	0.06
query31	2.87	0.38	0.38
query32	3.30	0.47	0.46
query33	2.86	2.93	2.89
query34	16.82	4.32	4.35
query35	4.41	4.48	4.41
query36	0.65	0.46	0.47
query37	0.19	0.15	0.15
query38	0.16	0.16	0.14
query39	0.04	0.04	0.04
query40	0.16	0.13	0.12
query41	0.10	0.04	0.04
query42	0.05	0.05	0.05
query43	0.05	0.04	0.05
Total cold run time: 109.67 s
Total hot run time: 30.4 s

Contributor

PR approved by at least one committer and no changes requested.

@github-actions github-actions bot added the approved (indicates a PR has been approved by one committer) and reviewed labels Jul 10, 2024
Contributor

PR approved by anyone and no changes requested.

@mrhhsg mrhhsg merged commit 37baf6c into apache:master Jul 10, 2024
26 of 30 checks passed
@mrhhsg mrhhsg deleted the opt_hash_join branch July 10, 2024 14:28
seawinde pushed a commit to seawinde/doris that referenced this pull request Jul 17, 2024
…nce issue (apache#37471)

dataroaring pushed a commit that referenced this pull request Jul 17, 2024
…nce issue (#37471)

Labels
approved (indicates a PR has been approved by one committer), dev/3.0.1-merged, reviewed