[fix](csv-reader) fix column split error when there is escape character #34364 #34505

Merged

liaoxin01
Contributor

cherry pick from #34364

@liaoxin01
Contributor Author

run buildall

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

Contributor

github-actions bot commented May 8, 2024

clang-tidy review says "All clean, LGTM! 👍"

@doris-robot

TPC-H: Total hot run time: 49725 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 69f7da67ae67b3c7511d422d418f8e3c9f2fbc5d, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	17908	4402	4306	4306
q2	2024	146	146	146
q3	10448	1900	1909	1900
q4	10293	1233	1346	1233
q5	8741	3875	3925	3875
q6	231	120	122	120
q7	2035	1614	1593	1593
q8	9255	2721	2723	2721
q9	11074	10396	10359	10359
q10	8595	3538	3520	3520
q11	413	245	250	245
q12	467	300	314	300
q13	18370	3972	4023	3972
q14	357	314	338	314
q15	500	459	451	451
q16	665	575	567	567
q17	1128	956	972	956
q18	7952	7000	6843	6843
q19	1679	1640	1539	1539
q20	513	290	307	290
q21	4416	4098	4097	4097
q22	486	378	401	378
Total cold run time: 117550 ms
Total hot run time: 49725 ms
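
The bot does not say how the two totals are derived. The Round 1 figures are consistent with the cold total being the sum of the first timing column and the hot total being the sum of the last column, where the last column is the smaller of the two middle columns. The sketch below checks that reading against the rows above; the column interpretation and the names `ROUND1` and `totals` are assumptions made for illustration, not part of the Doris tooling.

```python
# Round 1 rows copied from the bot comment: query, then four timings in ms.
ROUND1 = """\
q1 17908 4402 4306 4306
q2 2024 146 146 146
q3 10448 1900 1909 1900
q4 10293 1233 1346 1233
q5 8741 3875 3925 3875
q6 231 120 122 120
q7 2035 1614 1593 1593
q8 9255 2721 2723 2721
q9 11074 10396 10359 10359
q10 8595 3538 3520 3520
q11 413 245 250 245
q12 467 300 314 300
q13 18370 3972 4023 3972
q14 357 314 338 314
q15 500 459 451 451
q16 665 575 567 567
q17 1128 956 972 956
q18 7952 7000 6843 6843
q19 1679 1640 1539 1539
q20 513 290 307 290
q21 4416 4098 4097 4097
q22 486 378 401 378
"""


def totals(table):
    """Return (cold_total, hot_total) in ms for a whitespace-separated table."""
    cold = hot = 0
    for line in table.strip().splitlines():
        _, first, second, third, best = line.split()
        first, second, third, best = map(int, (first, second, third, best))
        # Assumption: the last column is the faster of the two repeat runs.
        assert best == min(second, third)
        cold += first
        hot += best
    return cold, hot


cold_ms, hot_ms = totals(ROUND1)
print(cold_ms, hot_ms)  # 117550 49725, matching the reported totals
```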

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
q1	4354	4286	4306	4286
q2	322	221	223	221
q3	4166	4139	4140	4139
q4	2738	2747	2722	2722
q5	7165	7151	7097	7097
q6	236	115	121	115
q7	3239	2828	2814	2814
q8	4349	4477	4479	4477
q9	17343	17128	17060	17060
q10	4218	4288	4244	4244
q11	768	665	701	665
q12	1028	861	870	861
q13	6891	3743	3770	3743
q14	462	436	418	418
q15	496	456	457	456
q16	744	687	684	684
q17	3838	3834	3857	3834
q18	8796	8716	8898	8716
q19	1715	1731	1653	1653
q20	2383	2202	2123	2123
q21	8447	8521	8479	8479
q22	1048	955	966	955
Total cold run time: 84746 ms
Total hot run time: 79762 ms

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 37.81% (8078/21364)
Line Coverage: 29.45% (65936/223882)
Region Coverage: 28.92% (33946/117382)
Branch Coverage: 24.77% (17417/70308)
Coverage Report: http://coverage.selectdb-in.cc/coverage/69f7da67ae67b3c7511d422d418f8e3c9f2fbc5d_69f7da67ae67b3c7511d422d418f8e3c9f2fbc5d/report/index.html

@doris-robot

TPC-DS: Total hot run time: 203178 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 69f7da67ae67b3c7511d422d418f8e3c9f2fbc5d, data reload: false

query	cold (ms)	hot run 1 (ms)	hot run 2 (ms)	best hot (ms)
query1	927	387	381	381
query2	6524	2811	2877	2811
query3	6918	199	199	199
query4	20862	17831	17802	17802
query5	19725	6531	6483	6483
query6	289	222	250	222
query7	4170	300	295	295
query8	260	246	240	240
query9	3077	2662	2594	2594
query10	418	316	308	308
query11	11220	10687	10770	10687
query12	129	78	70	70
query13	5569	679	667	667
query14	17518	13267	13536	13267
query15	359	215	231	215
query16	6456	277	262	262
query17	1762	1441	869	869
query18	2346	418	413	413
query19	212	148	148	148
query20	80	76	78	76
query21	189	98	98	98
query22	6001	5075	4970	4970
query23	32534	31839	31836	31836
query24	6986	6518	6476	6476
query25	512	425	418	418
query26	531	169	159	159
query27	1841	296	292	292
query28	6202	2384	2348	2348
query29	2978	2772	2675	2675
query30	240	171	161	161
query31	905	711	789	711
query32	69	58	60	58
query33	406	260	253	253
query34	853	472	484	472
query35	1125	928	942	928
query36	1322	1152	1351	1152
query37	91	59	58	58
query38	3027	2949	2961	2949
query39	1367	1319	1326	1319
query40	199	100	101	100
query41	41	36	36	36
query42	84	81	82	81
query43	900	615	691	615
query44	1108	722	720	720
query45	244	227	229	227
query46	1250	977	951	951
query47	1841	1788	1635	1635
query48	996	705	682	682
query49	620	361	371	361
query50	848	614	610	610
query51	4727	4597	4672	4597
query52	97	78	81	78
query53	453	324	309	309
query54	2667	2462	2472	2462
query55	80	78	76	76
query56	225	203	194	194
query57	1214	1157	1171	1157
query58	216	197	202	197
query59	4129	4092	4323	4092
query60	206	209	210	209
query61	90	87	90	87
query62	868	462	471	462
query63	463	334	327	327
query64	2440	1570	1371	1371
query65	3654	3593	3588	3588
query66	793	366	380	366
query67	15991	15839	15473	15473
query68	9330	652	671	652
query69	573	351	363	351
query70	1571	1362	1483	1362
query71	425	299	309	299
query72	6455	3450	3427	3427
query73	729	325	313	313
query74	6354	5854	5828	5828
query75	5417	3642	3683	3642
query76	5720	1132	1213	1132
query77	985	248	253	248
query78	12664	12012	11698	11698
query79	9237	642	643	642
query80	1352	409	388	388
query81	482	230	231	230
query82	1285	95	102	95
query83	161	138	138	138
query84	261	69	67	67
query85	864	299	298	298
query86	326	326	294	294
query87	3258	3008	2991	2991
query88	4943	2327	2326	2326
query89	482	316	319	316
query90	1927	205	207	205
query91	168	145	132	132
query92	56	50	50	50
query93	6300	574	570	570
query94	686	206	208	206
query95	1096	1039	1028	1028
query96	654	331	328	328
query97	6543	6435	6439	6435
query98	185	171	173	171
query99	3000	945	911	911
Total cold run time: 315933 ms
Total hot run time: 203178 ms

@doris-robot

ClickBench: Total hot run time: 31.02 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 69f7da67ae67b3c7511d422d418f8e3c9f2fbc5d, data reload: false

query	cold (s)	hot run 1 (s)	hot run 2 (s)
query1	0.03	0.02	0.02
query2	0.07	0.02	0.02
query3	0.25	0.04	0.04
query4	1.80	0.07	0.07
query5	0.52	0.52	0.53
query6	1.24	0.62	0.62
query7	0.01	0.00	0.01
query8	0.03	0.02	0.03
query9	0.52	0.49	0.47
query10	0.54	0.54	0.54
query11	0.13	0.08	0.08
query12	0.11	0.09	0.09
query13	0.62	0.61	0.61
query14	0.78	0.78	0.78
query15	0.79	0.76	0.75
query16	0.36	0.37	0.38
query17	1.02	0.98	1.01
query18	0.23	0.27	0.25
query19	1.93	1.88	1.85
query20	0.02	0.01	0.01
query21	15.48	0.57	0.55
query22	1.97	1.78	2.03
query23	17.38	0.95	0.94
query24	4.64	4.60	1.14
query25	0.40	0.12	0.05
query26	0.84	0.15	0.16
query27	0.04	0.04	0.04
query28	4.84	0.75	0.76
query29	12.72	2.36	2.21
query30	0.55	0.53	0.52
query31	2.81	0.39	0.36
query32	3.40	0.49	0.48
query33	3.11	3.05	3.05
query34	15.27	4.83	4.80
query35	4.88	4.86	4.89
query36	1.08	1.01	1.01
query37	0.06	0.04	0.05
query38	0.04	0.02	0.02
query39	0.02	0.02	0.01
query40	0.16	0.15	0.14
query41	0.07	0.01	0.01
query42	0.02	0.02	0.01
query43	0.02	0.01	0.02
Total cold run time: 100.8 s
Total hot run time: 31.02 s

@doris-robot

Load test result on machine: 'aliyun_ecs.c7a.8xlarge_32C64G'

Load test result on commit 69f7da67ae67b3c7511d422d418f8e3c9f2fbc5d with default session variables
Stream load json:         20 seconds loaded 2358488459 Bytes, about 112 MB/s
Stream load orc:          59 seconds loaded 1101869774 Bytes, about 17 MB/s
Stream load parquet:      32 seconds loaded 861443392 Bytes, about 25 MB/s
Insert into select:       21.7 seconds inserted 10000000 Rows, about 460K ops/s
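
The rounded throughput figures can be reproduced from the raw byte counts and durations. The bot does not state its rounding rule; the sketch below assumes "MB" means 2^20 bytes and that rates are truncated to whole units, which matches all four reported lines. The function names are illustrative only.

```python
def mib_per_second(total_bytes, seconds):
    """Whole MiB/s, truncated (assumed rounding rule, not confirmed by the bot)."""
    return total_bytes // (seconds * 1024 * 1024)


def kilo_rows_per_second(rows, seconds):
    """Whole thousands of rows per second, truncated (same assumption)."""
    return int(rows / seconds / 1000)


print(mib_per_second(2358488459, 20))         # json
print(mib_per_second(1101869774, 59))         # orc
print(mib_per_second(861443392, 32))          # parquet
print(kilo_rows_per_second(10_000_000, 21.7)) # insert into select
```

Using decimal megabytes (10^6 bytes) instead would give about 117 MB/s for the JSON load, which does not match the reported 112 MB/s.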

@dataroaring dataroaring merged commit 0a38945 into apache:branch-2.0 May 8, 2024
24 of 26 checks passed
mongo360 pushed a commit to mongo360/doris that referenced this pull request Aug 16, 2024
3 participants