[fix](decompressor) impl ZstdDecompressor #40315

Merged: 4 commits, Sep 6, 2024


suxiaogang223
Contributor

Proposed changes

Implement ZstdDecompressor and support reading Hive text tables that are compressed with zstd.
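
For readers unfamiliar with the format, here is a minimal sketch of how zstd-compressed input can be recognized by its frame magic number (0xFD2FB528, stored little-endian, per RFC 8878). This is an illustration only: the helper name `looks_like_zstd_frame` is hypothetical, it is not taken from this PR, and it does not claim to reflect how Doris selects a decompressor.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Zstandard frame magic number (RFC 8878). On disk it is stored
// little-endian, i.e. as the byte sequence 28 B5 2F FD.
constexpr std::uint32_t kZstdMagic = 0xFD2FB528u;

// Hypothetical helper (not part of Doris): returns true when the buffer
// starts with the 4-byte zstd frame magic. It does not validate the rest
// of the frame and does not recognize skippable frames.
inline bool looks_like_zstd_frame(const std::uint8_t* data, std::size_t len) {
    if (data == nullptr || len < 4) {
        return false;  // too short to contain the magic number
    }
    // Assemble the first four bytes as a little-endian 32-bit value.
    const std::uint32_t magic =
            static_cast<std::uint32_t>(data[0]) |
            (static_cast<std::uint32_t>(data[1]) << 8) |
            (static_cast<std::uint32_t>(data[2]) << 16) |
            (static_cast<std::uint32_t>(data[3]) << 24);
    return magic == kZstdMagic;
}
```

A check like this only identifies the start of a standard frame; actual decompression would still go through the zstd library's streaming API.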

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@suxiaogang223
Contributor Author

run buildall

Contributor

@github-actions github-actions bot left a comment


clang-tidy made some suggestions

be/src/exec/decompressor.h (resolved)
@suxiaogang223
Contributor Author

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.83% (9393/25502)
Line Coverage: 28.27% (77462/273988)
Region Coverage: 27.67% (39973/144458)
Branch Coverage: 24.32% (20341/83656)
Coverage Report: http://coverage.selectdb-in.cc/coverage/8b08fbc6db96263f30541adcffd311a25129c7fc_8b08fbc6db96263f30541adcffd311a25129c7fc/report/index.html

@morningman
Contributor

run buildall

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.77% (9381/25512)
Line Coverage: 28.19% (77314/274229)
Region Coverage: 27.62% (39936/144601)
Branch Coverage: 24.27% (20330/83764)
Coverage Report: http://coverage.selectdb-in.cc/coverage/b4724a20ccbb99a90216173b24b3cf23233a3122_b4724a20ccbb99a90216173b24b3cf23233a3122/report/index.html

@suxiaogang223
Contributor Author

run buildall

@doris-robot

TPC-H: Total hot run time: 40035 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
TPC-H sf100 test result on commit 223ba7f03fb1755cbd93c5b292856efbe689deff, data reload: false

------ Round 1 ----------------------------------
query	cold_ms	hot1_ms	hot2_ms	best_hot_ms
q1	18081	5093	4788	4788
q2	2023	191	183	183
q3	10510	1282	1144	1144
q4	10164	766	742	742
q5	7763	3061	3024	3024
q6	252	140	139	139
q7	1041	619	617	617
q8	9325	2190	2192	2190
q9	7811	6969	6986	6969
q10	7107	2321	2297	2297
q11	483	253	245	245
q12	439	223	220	220
q13	17773	3089	3046	3046
q14	289	241	238	238
q15	574	471	487	471
q16	526	440	419	419
q17	1021	675	732	675
q18	7624	6946	7042	6946
q19	1403	1136	1231	1136
q20	726	321	336	321
q21	4285	3331	3217	3217
q22	1156	1020	1008	1008
Total cold run time: 110376 ms
Total hot run time: 40035 ms
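
The totals above appear to follow a simple rule: the first numeric column is the cold run, the next two are hot runs, the last is the smaller of the two hot runs, and "Total hot run time" is the sum of that last column. The column meanings are inferred from the arithmetic rather than stated by the bot, and the type and function names below are hypothetical. A minimal sketch of that calculation:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// One benchmark row, in milliseconds. Field meanings are an assumption
// inferred from the reported totals, not documented by the bot output.
struct QueryTiming {
    std::int64_t cold_ms;
    std::int64_t hot1_ms;
    std::int64_t hot2_ms;
};

// Best (smallest) of the two hot runs for a single query.
inline std::int64_t best_hot_ms(const QueryTiming& t) {
    return std::min(t.hot1_ms, t.hot2_ms);
}

// Sum of the cold-run column.
inline std::int64_t total_cold_ms(const std::vector<QueryTiming>& rows) {
    std::int64_t total = 0;
    for (const QueryTiming& t : rows) {
        total += t.cold_ms;
    }
    return total;
}

// Sum of the per-query best hot run.
inline std::int64_t total_hot_ms(const std::vector<QueryTiming>& rows) {
    std::int64_t total = 0;
    for (const QueryTiming& t : rows) {
        total += best_hot_ms(t);
    }
    return total;
}
```

Applied to all 22 rows of Round 1, this rule reproduces the reported 110376 ms cold and 40035 ms hot totals.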

----- Round 2, with runtime_filter_mode=off -----
query	cold_ms	hot1_ms	hot2_ms	best_hot_ms
q1	4807	4755	4740	4740
q2	411	263	273	263
q3	3075	2766	2739	2739
q4	2075	1842	1794	1794
q5	5859	5566	5589	5566
q6	247	130	127	127
q7	2216	1735	1724	1724
q8	3441	3630	3640	3630
q9	8512	8535	8556	8535
q10	3657	3316	3360	3316
q11	614	489	488	488
q12	817	612	584	584
q13	13920	3147	3074	3074
q14	326	274	277	274
q15	544	477	489	477
q16	556	478	469	469
q17	1966	1592	1576	1576
q18	7823	7484	7583	7484
q19	1845	1601	1700	1601
q20	2172	1806	1840	1806
q21	5635	5314	5360	5314
q22	1139	1064	1046	1046
Total cold run time: 71657 ms
Total hot run time: 56627 ms

@doris-robot

TPC-DS: Total hot run time: 187372 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 223ba7f03fb1755cbd93c5b292856efbe689deff, data reload: false

query	cold_ms	hot1_ms	hot2_ms	best_hot_ms
query1	902	373	374	373
query2	6483	1853	1813	1813
query3	6651	210	217	210
query4	29385	23264	23268	23264
query5	4182	486	478	478
query6	260	170	166	166
query7	4598	290	288	288
query8	276	215	209	209
query9	8383	2472	2446	2446
query10	436	270	274	270
query11	17762	15012	15132	15012
query12	146	99	98	98
query13	1651	367	358	358
query14	9861	7321	7011	7011
query15	268	169	172	169
query16	8087	450	433	433
query17	1563	557	559	557
query18	2140	283	275	275
query19	319	141	141	141
query20	121	109	108	108
query21	211	109	102	102
query22	4447	4074	4234	4074
query23	34499	34158	33979	33979
query24	11733	2887	2818	2818
query25	657	392	386	386
query26	1721	154	151	151
query27	2812	267	270	267
query28	7563	2007	2003	2003
query29	1013	403	418	403
query30	305	159	176	159
query31	980	725	772	725
query32	96	53	58	53
query33	750	279	276	276
query34	996	471	469	469
query35	839	736	720	720
query36	1103	936	927	927
query37	180	85	94	85
query38	4032	3875	3878	3875
query39	1430	1510	1390	1390
query40	280	111	113	111
query41	48	45	43	43
query42	112	94	92	92
query43	493	457	450	450
query44	1201	775	731	731
query45	195	168	170	168
query46	1087	751	728	728
query47	1923	1792	1803	1792
query48	362	293	288	288
query49	1163	474	428	428
query50	814	396	395	395
query51	7012	6889	6837	6837
query52	99	87	84	84
query53	253	185	184	184
query54	854	441	446	441
query55	79	75	76	75
query56	266	272	250	250
query57	1193	1081	1094	1081
query58	241	225	257	225
query59	2889	2694	2948	2694
query60	302	272	257	257
query61	106	97	103	97
query62	777	636	674	636
query63	217	190	183	183
query64	5276	676	648	648
query65	3217	3199	3128	3128
query66	1201	342	355	342
query67	15593	15241	15089	15089
query68	3238	581	564	564
query69	374	274	281	274
query70	1162	1082	1099	1082
query71	332	274	275	274
query72	6343	4060	3997	3997
query73	738	318	313	313
query74	9066	8892	8745	8745
query75	3404	2741	2681	2681
query76	1956	954	983	954
query77	441	327	318	318
query78	9601	8948	8979	8948
query79	1031	531	525	525
query80	769	504	492	492
query81	458	245	233	233
query82	241	137	133	133
query83	178	155	150	150
query84	227	76	75	75
query85	683	287	281	281
query86	320	291	303	291
query87	4380	4261	4287	4261
query88	3317	2260	2286	2260
query89	369	285	288	285
query90	1851	185	187	185
query91	122	97	102	97
query92	58	49	47	47
query93	1081	541	538	538
query94	663	295	291	291
query95	347	257	249	249
query96	581	264	264	264
query97	3214	3153	3110	3110
query98	213	205	202	202
query99	1490	1235	1287	1235
Total cold run time: 284342 ms
Total hot run time: 187372 ms

@doris-robot

ClickBench: Total hot run time: 31.38 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 223ba7f03fb1755cbd93c5b292856efbe689deff, data reload: false

query	cold_s	hot1_s	hot2_s
query1	0.05	0.04	0.04
query2	0.09	0.04	0.03
query3	0.24	0.05	0.05
query4	1.70	0.07	0.07
query5	0.51	0.51	0.50
query6	1.13	0.72	0.73
query7	0.02	0.01	0.02
query8	0.04	0.04	0.04
query9	0.53	0.49	0.48
query10	0.55	0.55	0.54
query11	0.15	0.12	0.11
query12	0.15	0.12	0.12
query13	0.59	0.59	0.59
query14	1.42	1.46	1.39
query15	0.84	0.81	0.82
query16	0.37	0.38	0.38
query17	1.06	1.05	1.07
query18	0.21	0.20	0.20
query19	1.96	1.88	1.71
query20	0.01	0.00	0.01
query21	15.39	0.68	0.67
query22	4.99	6.45	1.98
query23	18.28	1.35	1.30
query24	2.15	0.23	0.21
query25	0.15	0.08	0.08
query26	0.27	0.18	0.19
query27	0.08	0.07	0.08
query28	13.23	1.01	0.99
query29	12.60	3.32	3.31
query30	0.24	0.06	0.06
query31	2.86	0.41	0.39
query32	3.25	0.46	0.48
query33	2.92	3.00	2.99
query34	16.93	4.32	4.43
query35	4.42	4.48	4.44
query36	0.66	0.48	0.48
query37	0.19	0.16	0.16
query38	0.15	0.16	0.15
query39	0.05	0.04	0.04
query40	0.16	0.13	0.12
query41	0.09	0.05	0.05
query42	0.06	0.05	0.05
query43	0.04	0.04	0.04
Total cold run time: 110.78 s
Total hot run time: 31.38 s

@doris-robot

TeamCity be ut coverage result:
Function Coverage: 36.85% (9381/25455)
Line Coverage: 28.24% (77373/274004)
Region Coverage: 27.63% (39949/144563)
Branch Coverage: 24.27% (20335/83784)
Coverage Report: http://coverage.selectdb-in.cc/coverage/223ba7f03fb1755cbd93c5b292856efbe689deff_223ba7f03fb1755cbd93c5b292856efbe689deff/report/index.html

Contributor

github-actions bot commented Sep 6, 2024

PR approved by anyone and no changes requested.

Contributor

@morningman morningman left a comment


LGTM

@github-actions github-actions bot added the "approved" label (indicates a PR has been approved by one committer) Sep 6, 2024
Contributor

github-actions bot commented Sep 6, 2024

PR approved by at least one committer and no changes requested.

@morningman morningman merged commit 9908a18 into apache:master Sep 6, 2024
22 of 28 checks passed
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request Sep 9, 2024
Impl ZstdDecompressor and support read hive text table which is
compressed by zstd.
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request Sep 25, 2024
Impl ZstdDecompressor and support read hive text table which is
compressed by zstd.
morningman pushed a commit that referenced this pull request Sep 27, 2024
## Proposed changes
pick prs:
#38549
#40183
#40315

---------

Co-authored-by: Calvin Kirs <kirs@apache.org>
@suxiaogang223 suxiaogang223 deleted the zstd_decompressor branch October 10, 2024 08:03
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request Oct 10, 2024
Impl ZstdDecompressor and support read hive text table which is
compressed by zstd.
suxiaogang223 added a commit to suxiaogang223/doris that referenced this pull request Oct 10, 2024
Impl ZstdDecompressor and support read hive text table which is
compressed by zstd.
morningman pushed a commit that referenced this pull request Oct 10, 2024
Labels
approved (indicates a PR has been approved by one committer), dev/2.1.7-merged, dev/3.0.3-merged, reviewed

4 participants