[improvement](jdbc catalog) Optimize JdbcCatalog case mapping stability #40891

Merged
morningman merged 1 commit into apache:master from jdbc_lower_mapping
Sep 29, 2024

Conversation

zy-kkk
Member

@zy-kkk zy-kkk commented Sep 17, 2024

This PR makes the following changes to the upper/lower case name mapping of JdbcCatalog:

  1. The identifierMapping is now managed by JdbcExternalCatalog instead of JdbcClient, so that its lifecycle can be controlled more precisely.
  2. The identifierMapping no longer loads remote names on its own; the Catalog controls the loading uniformly.
  3. The identifierMapping is loaded when each FE runs makeSureInitialized(), which ensures that every FE has a mapping.
  4. The mapping is initialized only once, in makeSureInitialized(). As a result, even if you use metaCache, when the source metadata changes while identifierMapping is enabled, you must refresh the catalog before queries work normally again.
  5. The identifierMapping depends only on the Catalog properties and is no longer affected by the FE config, which simplifies the processing logic.
  6. If lower_case_meta_names is false and meta_names_mapping is empty in the catalog properties, the identifierMapping no longer takes effect, which makes the default settings more stable.
  7. The JdbcClient is no longer closed during onRefreshCache. This avoids repeatedly recreating resources, improves reuse, and reduces leakage of some globally shared threads.
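
The lifecycle described in points 1, 3, 4 and 6 can be illustrated with a minimal sketch. This is not the code from this PR: the class names `SketchIdentifierMapping` and `SketchJdbcCatalog`, the `refresh()` behavior, and the `loadCount` counter are hypothetical, and the `meta_names_mapping` value is only checked for emptiness rather than parsed.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for the identifier mapping; not the Doris class. */
class SketchIdentifierMapping {
    private final boolean lowerCaseMetaNames;
    private final Map<String, String> localToRemote = new HashMap<>();

    SketchIdentifierMapping(boolean lowerCaseMetaNames) {
        this.lowerCaseMetaNames = lowerCaseMetaNames;
    }

    /** Rebuilds the local-to-remote table; the catalog decides when this runs. */
    void load(List<String> remoteNames) {
        localToRemote.clear();
        for (String remote : remoteNames) {
            localToRemote.put(toLocal(remote), remote);
        }
    }

    String toLocal(String remote) {
        return lowerCaseMetaNames ? remote.toLowerCase(Locale.ROOT) : remote;
    }

    /** Returns null when the local name was unknown at load time. */
    String toRemote(String local) {
        return localToRemote.get(local);
    }
}

/** Hypothetical catalog sketch: it owns the mapping and loads it once per initialization. */
class SketchJdbcCatalog {
    private final boolean lowerCaseMetaNames;
    private final String metaNamesMapping;   // assumed: raw property value, not parsed here
    private List<String> remoteTables;       // stands in for the remote source's metadata
    private SketchIdentifierMapping mapping; // null means the mapping is disabled
    private boolean initialized;
    int loadCount;                           // illustration only: counts mapping loads

    SketchJdbcCatalog(boolean lowerCaseMetaNames, String metaNamesMapping,
                      List<String> remoteTables) {
        this.lowerCaseMetaNames = lowerCaseMetaNames;
        this.metaNamesMapping = metaNamesMapping;
        this.remoteTables = remoteTables;
    }

    /** Simulates a metadata change on the remote source. */
    void setRemoteTables(List<String> remoteTables) {
        this.remoteTables = remoteTables;
    }

    void makeSureInitialized() {
        if (initialized) {
            return; // the mapping is loaded only once per initialization
        }
        // Point 6: with default properties the mapping does not take effect at all.
        boolean enabled = lowerCaseMetaNames || !metaNamesMapping.isEmpty();
        if (enabled) {
            mapping = new SketchIdentifierMapping(lowerCaseMetaNames);
            mapping.load(remoteTables);
            loadCount++;
        }
        initialized = true;
    }

    /** Assumed refresh semantics: drop the mapping so the next access reloads it. */
    void refresh() {
        initialized = false;
        mapping = null;
    }

    String resolveRemoteTable(String localName) {
        makeSureInitialized();
        return mapping == null ? localName : mapping.toRemote(localName);
    }
}
```

In this sketch, a table added on the remote side after initialization cannot be resolved until `refresh()` is called, which mirrors the refresh requirement in point 4. With `lower_case_meta_names` set to false and an empty `meta_names_mapping`, names pass through unchanged and nothing is loaded.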

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the documentation has been moved to doris-website.
See the Doris documentation.

@zy-kkk
Member Author

zy-kkk commented Sep 17, 2024

run buildall

@doris-robot

TPC-H: Total hot run time: 41326 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 600e5d4f46de694367b70368d7cbedbf8332b6b6, data reload: false

------ Round 1 ----------------------------------
q1	17597	7618	7271	7271
q2	2041	154	159	154
q3	10617	1109	1174	1109
q4	10562	755	756	755
q5	7752	3067	3050	3050
q6	237	160	155	155
q7	1015	624	598	598
q8	9444	2022	2067	2022
q9	6809	6369	6406	6369
q10	7007	2296	2279	2279
q11	441	252	251	251
q12	419	221	224	221
q13	17793	2979	2977	2977
q14	248	221	219	219
q15	565	517	523	517
q16	674	607	607	607
q17	969	786	789	786
q18	7210	6753	6681	6681
q19	1403	989	973	973
q20	585	296	279	279
q21	3961	3230	3070	3070
q22	1088	996	983	983
Total cold run time: 108437 ms
Total hot run time: 41326 ms

----- Round 2, with runtime_filter_mode=off -----
q1	7213	7215	7292	7215
q2	318	226	230	226
q3	2943	2955	3007	2955
q4	1963	1804	1807	1804
q5	5615	5592	5568	5568
q6	232	146	152	146
q7	2253	1822	1806	1806
q8	3336	3414	3409	3409
q9	8733	8947	8709	8709
q10	3520	3403	3474	3403
q11	578	495	467	467
q12	818	613	597	597
q13	10587	3147	3146	3146
q14	290	285	287	285
q15	583	536	527	527
q16	728	668	674	668
q17	1797	1585	1557	1557
q18	8164	7799	7825	7799
q19	1754	1582	1392	1392
q20	2102	1935	1920	1920
q21	5509	5251	5335	5251
q22	1154	1061	1076	1061
Total cold run time: 70190 ms
Total hot run time: 59911 ms

@doris-robot

TPC-DS: Total hot run time: 198860 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 600e5d4f46de694367b70368d7cbedbf8332b6b6, data reload: false

query1	1280	888	876	876
query2	6400	2086	2040	2040
query3	10776	3988	3855	3855
query4	64243	29078	23467	23467
query5	5032	493	470	470
query6	418	169	185	169
query7	5445	309	286	286
query8	316	222	230	222
query9	8626	2656	2623	2623
query10	441	282	284	282
query11	17133	15231	15660	15231
query12	159	106	107	106
query13	1501	450	403	403
query14	10961	7530	7647	7530
query15	206	177	176	176
query16	6873	497	453	453
query17	1230	606	577	577
query18	1245	308	314	308
query19	210	157	153	153
query20	128	116	112	112
query21	220	107	108	107
query22	4916	4449	4526	4449
query23	34640	34548	34231	34231
query24	6009	2917	2891	2891
query25	514	403	404	403
query26	637	156	158	156
query27	1653	279	288	279
query28	4195	2470	2411	2411
query29	659	438	432	432
query30	229	154	155	154
query31	952	743	797	743
query32	77	53	56	53
query33	436	290	296	290
query34	912	482	487	482
query35	803	725	715	715
query36	1065	914	943	914
query37	144	86	87	86
query38	3946	3779	3937	3779
query39	1460	1414	1411	1411
query40	208	96	98	96
query41	50	49	47	47
query42	120	95	97	95
query43	520	489	497	489
query44	1133	813	802	802
query45	197	167	165	165
query46	1117	738	823	738
query47	1904	1796	1847	1796
query48	453	357	361	357
query49	686	396	404	396
query50	830	401	404	401
query51	7026	6882	6914	6882
query52	99	86	94	86
query53	246	173	178	173
query54	561	436	455	436
query55	76	75	76	75
query56	290	261	263	261
query57	1185	1084	1079	1079
query58	230	222	230	222
query59	3046	2935	2944	2935
query60	304	265	266	265
query61	122	101	107	101
query62	764	633	629	629
query63	224	184	181	181
query64	1401	666	671	666
query65	3231	3166	3170	3166
query66	641	299	312	299
query67	16020	15915	15305	15305
query68	1435	551	562	551
query69	490	284	291	284
query70	1224	1121	1079	1079
query71	344	268	270	268
query72	6080	4176	3982	3982
query73	754	328	328	328
query74	9383	8885	8959	8885
query75	3294	2635	2712	2635
query76	1724	838	907	838
query77	437	289	300	289
query78	10284	9495	9605	9495
query79	1163	885	866	866
query80	865	561	566	561
query81	518	254	263	254
query82	1217	233	228	228
query83	227	154	158	154
query84	283	108	99	99
query85	751	423	364	364
query86	328	320	318	318
query87	4387	4389	4264	4264
query88	4231	4009	4004	4004
query89	389	361	362	361
query90	1632	305	304	304
query91	159	168	164	164
query92	77	71	72	71
query93	921	870	871	870
query94	598	372	378	372
query95	443	408	404	404
query96	484	480	479	479
query97	3132	3121	3097	3097
query98	228	220	222	220
query99	1440	1328	1304	1304
Total cold run time: 307696 ms
Total hot run time: 198860 ms

@doris-robot

ClickBench: Total hot run time: 31.77 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 600e5d4f46de694367b70368d7cbedbf8332b6b6, data reload: false

query1	0.05	0.04	0.04
query2	0.06	0.03	0.03
query3	0.23	0.06	0.06
query4	1.65	0.10	0.10
query5	0.53	0.51	0.50
query6	1.14	0.74	0.73
query7	0.02	0.01	0.01
query8	0.04	0.03	0.03
query9	0.57	0.51	0.49
query10	0.55	0.55	0.56
query11	0.14	0.11	0.11
query12	0.14	0.11	0.11
query13	0.61	0.58	0.58
query14	3.00	3.05	3.08
query15	0.90	0.82	0.81
query16	0.39	0.38	0.36
query17	1.03	1.06	1.06
query18	0.19	0.20	0.19
query19	1.86	1.78	2.04
query20	0.01	0.02	0.01
query21	15.36	0.61	0.59
query22	3.13	3.81	1.55
query23	17.47	0.87	0.74
query24	2.39	0.47	0.79
query25	0.20	0.09	0.04
query26	0.45	0.13	0.14
query27	0.03	0.04	0.03
query28	11.65	1.10	1.06
query29	12.65	3.20	3.16
query30	0.25	0.06	0.05
query31	2.87	0.37	0.39
query32	3.28	0.47	0.46
query33	2.94	3.00	3.02
query34	16.92	4.37	4.38
query35	4.44	4.46	4.38
query36	0.69	0.48	0.47
query37	0.08	0.06	0.06
query38	0.04	0.04	0.03
query39	0.03	0.02	0.02
query40	0.15	0.12	0.12
query41	0.08	0.02	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.03
Total cold run time: 108.27 s
Total hot run time: 31.77 s

yiguolei pushed a commit that referenced this pull request Sep 26, 2024
…ping stability (#41330)

pick #40891
Contributor

@morningman morningman left a comment


LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Sep 29, 2024
Contributor

PR approved by at least one committer and no changes requested.

Contributor

PR approved by anyone and no changes requested.

@morningman morningman merged commit d4b9c5e into apache:master Sep 29, 2024
26 of 29 checks passed
hello-stephen pushed a commit that referenced this pull request Sep 29, 2024
@zy-kkk zy-kkk deleted the jdbc_lower_mapping branch September 30, 2024 09:47
zy-kkk added a commit to zy-kkk/doris that referenced this pull request Sep 30, 2024
…ty (apache#40891)

Labels
approved Indicates a PR has been approved by one committer. dev/2.1.7-merged dev/3.0.x p0_b reviewed
5 participants