Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement](function) optimization for substr with ascii string #29799

Merged
merged 3 commits into from
Jan 11, 2024

Conversation

BiteTheDDDDt
Copy link
Contributor

@BiteTheDDDDt BiteTheDDDDt commented Jan 10, 2024

Proposed changes

optimization for substr with ascii string

select count(substr(l_comment, 1, 10)) from lineitem;

before:
图片

after:
图片

Further comments

If this is a relatively large or complex change, kick off the discussion at dev@doris.apache.org by explaining why you chose the solution you did and what alternatives you considered, etc...

@BiteTheDDDDt
Copy link
Contributor Author

run buildall

Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@BiteTheDDDDt
Copy link
Contributor Author

run buildall

Copy link
Contributor

clang-tidy review says "All clean, LGTM! 👍"

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jan 10, 2024
Copy link
Contributor

PR approved by at least one committer and no changes requested.

Copy link
Contributor

PR approved by anyone and no changes requested.

@doris-robot
Copy link

TPC-H: Total hot run time: 38933 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 1aaebfa423e117854be6701404451afd7ea5f9a3, data reload: false

------ Round 1 ----------------------------------
q1	17941	5012	5006	5006
q2	2035	146	145	145
q3	10666	1112	1110	1110
q4	10272	854	817	817
q5	7834	3244	3136	3136
q6	205	125	126	125
q7	892	521	494	494
q8	9279	2016	2034	2016
q9	6679	6523	6516	6516
q10	8257	3077	3101	3077
q11	418	219	216	216
q12	359	194	199	194
q13	18099	3469	3449	3449
q14	238	214	217	214
q15	558	520	520	520
q16	443	385	401	385
q17	957	504	547	504
q18	7375	6753	6649	6649
q19	1582	1462	1383	1383
q20	542	294	294	294
q21	2821	2381	2414	2381
q22	347	302	307	302
Total cold run time: 107799 ms
Total hot run time: 38933 ms

----- Round 2, with runtime_filter_mode=off -----
q1	4998	4965	5044	4965
q2	305	202	201	201
q3	3335	3308	3295	3295
q4	2246	2268	2262	2262
q5	5829	5810	5803	5803
q6	193	120	118	118
q7	2331	1870	1870	1870
q8	3458	3609	3604	3604
q9	8883	8850	8854	8850
q10	3728	3845	3836	3836
q11	550	433	416	416
q12	792	604	600	600
q13	6313	3234	3167	3167
q14	297	258	273	258
q15	576	513	511	511
q16	513	469	461	461
q17	2104	2021	2009	2009
q18	8788	8364	8284	8284
q19	1624	1646	1629	1629
q20	2197	1944	1917	1917
q21	6101	5709	5829	5709
q22	556	479	459	459
Total cold run time: 65717 ms
Total hot run time: 60224 ms

@doris-robot
Copy link

TPC-DS: Total hot run time: 178536 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 1aaebfa423e117854be6701404451afd7ea5f9a3, data reload: false

query1	937	340	337	337
query2	6716	2060	1824	1824
query3	6705	209	206	206
query4	26249	22234	22215	22215
query5	5546	485	520	485
query6	276	199	183	183
query7	4619	275	273	273
query8	221	204	199	199
query9	9175	2832	2743	2743
query10	609	271	243	243
query11	16226	15444	15376	15376
query12	137	82	77	77
query13	1650	350	329	329
query14	12148	7283	7275	7275
query15	238	192	203	192
query16	6536	271	274	271
query17	1930	489	492	489
query18	1952	286	266	266
query19	291	143	135	135
query20	87	84	77	77
query21	191	92	92	92
query22	5184	4811	4906	4811
query23	31709	31026	30977	30977
query24	12166	2833	2845	2833
query25	577	348	337	337
query26	1757	157	150	150
query27	2853	278	286	278
query28	6826	1935	1925	1925
query29	2088	391	386	386
query30	289	148	145	145
query31	963	771	792	771
query32	95	69	60	60
query33	752	272	284	272
query34	901	448	437	437
query35	928	756	749	749
query36	1263	1237	1306	1237
query37	186	69	74	69
query38	3414	3273	3280	3273
query39	1333	1290	1299	1290
query40	302	98	90	90
query41	39	36	35	35
query42	100	96	92	92
query43	543	506	513	506
query44	1092	709	739	709
query45	211	183	178	178
query46	1045	656	674	656
query47	1717	1583	1615	1583
query48	391	320	316	316
query49	1277	314	316	314
query50	742	328	320	320
query51	5425	5257	5358	5257
query52	100	83	90	83
query53	227	155	155	155
query54	1422	580	584	580
query55	100	92	89	89
query56	221	197	208	197
query57	1020	978	971	971
query58	244	215	216	215
query59	2711	2514	2502	2502
query60	242	233	248	233
query61	83	83	82	82
query62	602	419	418	418
query63	179	163	155	155
query64	5857	1722	1626	1626
query65	3354	3263	3266	3263
query66	1326	342	334	334
query67	15707	15262	15216	15216
query68	11378	533	523	523
query69	577	279	283	279
query70	1685	1560	1438	1438
query71	516	211	238	211
query72	4854	2858	2875	2858
query73	2059	326	321	321
query74	7000	6463	6429	6429
query75	4908	2298	2277	2277
query76	6319	1057	1136	1057
query77	694	281	272	272
query78	9670	8755	8682	8682
query79	1043	520	497	497
query80	562	352	349	349
query81	455	212	211	211
query82	208	88	94	88
query83	164	140	134	134
query84	270	59	64	59
query85	961	299	296	296
query86	385	401	366	366
query87	3544	3457	3458	3457
query88	2779	2225	2246	2225
query89	339	260	253	253
query90	1862	193	206	193
query91	176	128	141	128
query92	62	56	52	52
query93	1013	434	428	428
query94	764	191	193	191
query95	463	426	418	418
query96	615	318	313	313
query97	4277	4153	4210	4153
query98	207	203	182	182
query99	1024	756	734	734
Total cold run time: 292835 ms
Total hot run time: 178536 ms

@doris-robot
Copy link

(From new machine)TeamCity pipeline, clickbench performance test result:
the sum of best hot time: 45.82 seconds
stream load tsv: 566 seconds loaded 74807831229 Bytes, about 126 MB/s
stream load json: 19 seconds loaded 2358488459 Bytes, about 118 MB/s
stream load orc: 66 seconds loaded 1101869774 Bytes, about 15 MB/s
stream load parquet: 32 seconds loaded 861443392 Bytes, about 25 MB/s
insert into select: 21.2 seconds inserted 10000000 Rows, about 471K ops/s
storage size: 17183830963 Bytes

@doris-robot
Copy link

ClickBench: Total hot run time: 30.97 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 1aaebfa423e117854be6701404451afd7ea5f9a3, data reload: false

query1	0.09	0.06	0.06
query2	0.06	0.03	0.03
query3	0.25	0.11	0.12
query4	1.74	0.13	0.12
query5	0.52	0.51	0.51
query6	1.33	0.66	0.64
query7	0.02	0.02	0.01
query8	0.04	0.02	0.02
query9	0.56	0.49	0.50
query10	0.54	0.54	0.53
query11	0.14	0.09	0.09
query12	0.12	0.09	0.09
query13	0.60	0.60	0.60
query14	0.77	0.81	0.79
query15	0.81	0.80	0.78
query16	0.34	0.35	0.35
query17	0.99	0.96	1.01
query18	0.27	0.25	0.25
query19	1.84	1.77	1.78
query20	0.02	0.01	0.01
query21	15.40	0.57	0.55
query22	2.63	2.52	2.27
query23	17.40	0.92	0.77
query24	2.51	0.60	5.59
query25	1.80	0.15	0.15
query26	0.14	0.14	0.12
query27	0.16	0.14	0.14
query28	9.38	0.80	0.82
query29	12.57	3.20	3.28
query30	0.54	0.52	0.52
query31	2.78	0.35	0.37
query32	3.34	0.49	0.48
query33	3.21	3.26	3.21
query34	15.71	4.16	4.14
query35	4.16	4.16	4.16
query36	1.10	1.05	1.04
query37	0.07	0.05	0.05
query38	0.04	0.03	0.03
query39	0.02	0.02	0.02
query40	0.16	0.14	0.12
query41	0.07	0.02	0.01
query42	0.02	0.01	0.02
query43	0.02	0.02	0.02
Total cold run time: 104.28 s
Total hot run time: 30.97 s

@doris-robot
Copy link

TeamCity be ut coverage result:
Function Coverage: 36.69% (8654/23584)
Line Coverage: 28.73% (70472/245297)
Region Coverage: 27.67% (36432/131678)
Branch Coverage: 24.37% (18635/76482)
Coverage Report: http://coverage.selectdb-in.cc/coverage/1aaebfa423e117854be6701404451afd7ea5f9a3_1aaebfa423e117854be6701404451afd7ea5f9a3/report/index.html

@doris-robot
Copy link

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 1aaebfa423e117854be6701404451afd7ea5f9a3 with default session variables
Stream load json:         19 seconds loaded 2358488459 Bytes, about 118 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      33 seconds loaded 861443392 Bytes, about 24 MB/s
Insert into select:       12.4 seconds inserted 10000000 Rows, about 806K ops/s

Copy link
Contributor

@HappenLee HappenLee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@yanyxin
Copy link

yanyxin commented Jul 24, 2024

@BiteTheDDDDt Will the modifications be synchronized to 2.1.5

@BiteTheDDDDt
Copy link
Contributor Author

@BiteTheDDDDt Will the modifications be synchronized to 2.1.5

Already included

@yanyxin
Copy link

yanyxin commented Jul 24, 2024

@BiteTheDDDDt Will the modifications be synchronized to 2.1.5

Already included

The position of line 235 in the function_string. h file in 2.1.5 is inconsistent with the current modification, lacking the judgment of fixed_pos>index.size(), resulting in incorrect results in 2.1.5

@BiteTheDDDDt
Copy link
Contributor Author

@BiteTheDDDDt Will the modifications be synchronized to 2.1.5

Already included

The position of line 235 in the function_string. h file in 2.1.5 is inconsistent with the current modification, lacking the judgment of fixed_pos>index.size(), resulting in incorrect results in 2.1.5

The code 2.1.5 and master in this regard should be the same. Can you provide a specific example to illustrate the problem of wrong results?

@yanyxin
Copy link

yanyxin commented Jul 24, 2024

The code 2.1.5 and master in this regard should be the same. Can you provide a specific example to illustrate the problem of wrong results?

select substr('老人年轻并不等于动的金币第七十二',17); The result of executing 2.0.13 is null,2.1.5 The result is '老人年轻并不等于动的金币第七十二',Due to the presence of fixed_pos>index.size() in line 247 of the function_string. h file in 2.0.13, but not in branch 2.1.5, 2.0.13 is consistent with master

@BiteTheDDDDt
Copy link
Contributor Author

The code 2.1.5 and master in this regard should be the same. Can you provide a specific example to illustrate the problem of wrong results?

select substr('老人年轻并不等于动的金币第七十二',17); The result of executing 2.0.13 is null,2.1.5 The result is '老人年轻并不等于动的金币第七十二',Due to the presence of fixed_pos>index.size() in line 247 of the function_string. h file in 2.0.13, but not in branch 2.1.5, 2.0.13 is consistent with master

this behavior is changed by #28352, so I think 2.1 is same with master, and 2.0 still remained old behavior

@yanyxin
Copy link

yanyxin commented Jul 24, 2024

The code 2.1.5 and master in this regard should be the same. Can you provide a specific example to illustrate the problem of wrong results?

select substr('老人年轻并不等于动的金币第七十二',17); The result of executing 2.0.13 is null,2.1.5 The result is '老人年轻并不等于动的金币第七十二',Due to the presence of fixed_pos>index.size() in line 247 of the function_string. h file in 2.0.13, but not in branch 2.1.5, 2.0.13 is consistent with master

this behavior is changed by #28352, so I think 2.1 is same with master, and 2.0 still remained old behavior

Substring (VARCHAR str, INT pos, [INT len]), so when pos exceeds the length of str, is the correct result null? However, 2.1.5 is currently not available;

@yanyxin
Copy link

yanyxin commented Jul 24, 2024

The code 2.1.5 and master in this regard should be the same. Can you provide a specific example to illustrate the problem of wrong results?

select substr('老人年轻并不等于动的金币第七十二',17); The result of executing 2.0.13 is null,2.1.5 The result is '老人年轻并不等于动的金币第七十二',Due to the presence of fixed_pos>index.size() in line 247 of the function_string. h file in 2.0.13, but not in branch 2.1.5, 2.0.13 is consistent with master

this behavior is changed by #28352, so I think 2.1 is same with master, and 2.0 still remained old behavior

In line 243 of function_string. h in 2.1.5, if we do not check whether fixed_pos is greater than index.size(), will there be index access errors

@BiteTheDDDDt
Copy link
Contributor Author

@koarz hi, could you help me answer the question? I'm not very get familiar with new logic

@koarz
Copy link
Contributor

koarz commented Jul 29, 2024

@BiteTheDDDDt Will the modifications be synchronized to 2.1.5

Already included

The position of line 235 in the function_string. h file in 2.1.5 is inconsistent with the current modification, lacking the judgment of fixed_pos>index.size(), resulting in incorrect results in 2.1.5

sorry, when I changed the logic of this part of the code, ignored this part, the return of the empty result is correct, I did not test the Chinese case led to the error was not found, i'll fix it soon

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by one committer. dev/2.0.4-merged dev/3.0.0-merged p0_b reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants