[fix](date_function) fix str_to_date function return wrong microsecond issue #47129

Merged 2 commits on Jan 20, 2025

@Yulei-Yang (Contributor) commented Jan 17, 2025

What problem does this PR solve?

Issue Number: close #47105

Related PR: #24932

Problem Summary:

Release note

str_to_date always returns the microsecond part of a datetime, even when the user does not specify %f in the date format string. This is wrong.
mysql> select id,str_to_date(dt, '%Y-%m-%d %H:%i:%s') from test1 limit 1;
+------+--------------------------------------+
| id   | str_to_date(dt, '%Y-%m-%d %H:%i:%s') |
+------+--------------------------------------+
|    2 | 2024-12-28 10:11:12.000000           |
+------+--------------------------------------+

The constant-folding scenario is wrong too:
mysql> select cast(str_to_date('2025-01-17 11:59:30', '%Y-%m-%d %H:%i:%s') as string);
+--------------------------------------------------------------------------+
| cast(str_to_date('2025-01-17 11:59:30', '%Y-%m-%d %H:%i:%s') as TEXT) |
+--------------------------------------------------------------------------+
| 2025-01-17 11:59:30.000000 |
+--------------------------------------------------------------------------+
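
As a rough illustration of the behavior the description asks for (a sketch only, not the Doris implementation; `format_has_microsecond` and `render_datetime` are hypothetical names), the fractional part would be kept only when the format string contains `%f`:

```python
def format_has_microsecond(fmt: str) -> bool:
    """Return True if a MySQL-style format string contains a %f specifier."""
    i = 0
    while i < len(fmt):
        if fmt[i] == "%" and i + 1 < len(fmt):
            if fmt[i + 1] == "f":
                return True
            i += 2  # skip the whole specifier; this also skips an escaped "%%"
        else:
            i += 1
    return False


def render_datetime(text_with_micros: str, fmt: str) -> str:
    """Drop the trailing fractional seconds unless the format asked for them.

    `text_with_micros` is assumed to look like 'YYYY-MM-DD HH:MM:SS.ffffff'.
    """
    if format_has_microsecond(fmt):
        return text_with_micros
    return text_with_micros.split(".", 1)[0]


print(render_datetime("2024-12-28 10:11:12.000000", "%Y-%m-%d %H:%i:%s"))
print(render_datetime("2024-12-28 10:11:12.000000", "%Y-%m-%d %H:%i:%s.%f"))
```

An escaped `%%f` is read as a literal percent sign followed by `f`, so it does not count as a microsecond specifier in this sketch.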

Check List (For Author)

  • Test

    • [x] Regression test
    • [ ] Unit Test
    • [ ] Manual test (add detailed scripts or steps below)
    • [ ] No need to test or manual test. Explain why:
      • [ ] This is a refactor/code format and no logic has been changed.
      • [ ] Previous test can cover this change.
      • [ ] No code files have been changed.
      • [ ] Other reason
  • Behavior changed:

    • [x] No.
    • [ ] Yes.
  • Does this need documentation?

    • [x] No.
    • [ ] Yes.

Check List (For Reviewer who merge this PR)

  • [ ] Confirm the release note
  • [ ] Confirm test cases
  • [ ] Confirm document
  • [ ] Add branch pick label

@hello-stephen (Contributor)

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@Yulei-Yang (Contributor Author)

run buildall

@doris-robot

TPC-H: Total hot run time: 32247 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit bd1f81b3d1e8d6a1f7db89001ab9451817213af1, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17585	5588	5356	5356
q2	2055	288	160	160
q3	10421	1309	747	747
q4	10236	994	516	516
q5	7925	2447	2156	2156
q6	192	176	133	133
q7	903	773	627	627
q8	9249	1398	1193	1193
q9	5375	4876	4918	4876
q10	6883	2348	1898	1898
q11	473	275	257	257
q12	338	354	219	219
q13	17750	3663	3075	3075
q14	221	220	201	201
q15	498	470	456	456
q16	634	606	577	577
q17	558	867	322	322
q18	7236	6478	6452	6452
q19	1207	964	521	521
q20	316	329	195	195
q21	2902	2209	2003	2003
q22	369	335	307	307
Total cold run time: 103326 ms
Total hot run time: 32247 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5396	5424	5629	5424
q2	238	333	235	235
q3	2241	2601	2332	2332
q4	1401	1798	1384	1384
q5	4295	4776	4616	4616
q6	170	160	125	125
q7	2042	1969	1833	1833
q8	2656	2794	2758	2758
q9	7393	7092	7313	7092
q10	2975	3291	2743	2743
q11	587	515	503	503
q12	679	742	640	640
q13	3538	3923	3328	3328
q14	277	309	294	294
q15	518	468	450	450
q16	667	692	669	669
q17	1250	1749	1298	1298
q18	7531	7604	7384	7384
q19	788	1161	1024	1024
q20	1990	2060	1861	1861
q21	5675	5289	4987	4987
q22	632	625	563	563
Total cold run time: 52939 ms
Total hot run time: 51543 ms
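
The bot does not label its columns. Reading each row as one cold run, two hot runs, and the faster of the two hot runs is an assumption, but it reproduces both totals reported for Round 1 above. A minimal Python check (the `totals` helper is hypothetical, written only for this illustration):

```python
# TPC-H Round 1 rows copied from the report above.
# Assumed column order: query, cold, hot run 1, hot run 2, best hot (all in ms).
ROUND1 = """\
q1 17585 5588 5356 5356
q2 2055 288 160 160
q3 10421 1309 747 747
q4 10236 994 516 516
q5 7925 2447 2156 2156
q6 192 176 133 133
q7 903 773 627 627
q8 9249 1398 1193 1193
q9 5375 4876 4918 4876
q10 6883 2348 1898 1898
q11 473 275 257 257
q12 338 354 219 219
q13 17750 3663 3075 3075
q14 221 220 201 201
q15 498 470 456 456
q16 634 606 577 577
q17 558 867 322 322
q18 7236 6478 6452 6452
q19 1207 964 521 521
q20 316 329 195 195
q21 2902 2209 2003 2003
q22 369 335 307 307
"""


def totals(report: str) -> tuple[int, int, int]:
    """Return (sum of cold runs, sum of min(hot1, hot2), sum of the last column)."""
    cold_sum = best_of_two_sum = last_column_sum = 0
    for line in report.splitlines():
        _query, cold, hot1, hot2, best = line.split()
        cold_sum += int(cold)
        best_of_two_sum += min(int(hot1), int(hot2))
        last_column_sum += int(best)
    return cold_sum, best_of_two_sum, last_column_sum


print(totals(ROUND1))
```

The first value matches the reported total cold run time (103326 ms), and the other two both match the reported total hot run time (32247 ms), which is what supports the column reading assumed here.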

@doris-robot

TPC-DS: Total hot run time: 188727 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit bd1f81b3d1e8d6a1f7db89001ab9451817213af1, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	924	429	390	390
query2	5846	2185	2087	2087
query3	6729	215	215	215
query4	32812	23213	23185	23185
query5	4074	635	488	488
query6	279	224	187	187
query7	4531	479	297	297
query8	300	247	232	232
query9	9350	2567	2562	2562
query10	473	305	240	240
query11	17699	15272	14888	14888
query12	155	109	111	109
query13	1675	505	389	389
query14	8938	6335	7209	6335
query15	213	200	202	200
query16	7359	579	463	463
query17	1566	713	538	538
query18	1968	402	289	289
query19	208	191	146	146
query20	113	111	108	108
query21	212	123	99	99
query22	4580	4662	4579	4579
query23	34443	33718	33747	33718
query24	6642	2362	2328	2328
query25	492	449	399	399
query26	896	267	153	153
query27	2254	458	360	360
query28	5708	2457	2408	2408
query29	692	565	414	414
query30	234	201	162	162
query31	1001	873	809	809
query32	75	59	56	56
query33	512	355	315	315
query34	753	868	500	500
query35	863	877	769	769
query36	1024	1079	957	957
query37	116	96	75	75
query38	4363	4463	4292	4292
query39	1528	1442	1402	1402
query40	204	121	106	106
query41	57	61	48	48
query42	127	111	105	105
query43	549	564	508	508
query44	1366	836	818	818
query45	179	175	165	165
query46	864	1054	653	653
query47	1921	1884	1899	1884
query48	369	397	328	328
query49	753	494	389	389
query50	666	676	400	400
query51	7010	7006	6851	6851
query52	106	97	91	91
query53	224	253	183	183
query54	472	528	418	418
query55	90	75	76	75
query56	254	265	236	236
query57	1168	1120	1088	1088
query58	239	229	229	229
query59	3223	3102	3095	3095
query60	271	264	261	261
query61	118	114	113	113
query62	821	711	654	654
query63	214	187	179	179
query64	3499	1042	652	652
query65	3271	3155	3123	3123
query66	911	386	311	311
query67	15762	15936	15505	15505
query68	2690	831	559	559
query69	459	285	268	268
query70	1223	1097	1142	1097
query71	367	292	260	260
query72	4986	3781	3956	3781
query73	638	750	354	354
query74	9801	9178	8811	8811
query75	3218	3163	2667	2667
query76	3239	1158	753	753
query77	468	370	278	278
query78	9946	10114	9354	9354
query79	962	875	586	586
query80	1120	523	523	523
query81	526	278	243	243
query82	350	151	123	123
query83	264	171	161	161
query84	235	92	80	80
query85	771	368	308	308
query86	396	311	278	278
query87	4383	4588	4323	4323
query88	3459	2155	2110	2110
query89	389	321	282	282
query90	1675	187	189	187
query91	137	135	107	107
query92	58	59	54	54
query93	959	851	533	533
query94	672	422	296	296
query95	337	267	257	257
query96	487	606	282	282
query97	2815	2883	2723	2723
query98	226	202	199	199
query99	1280	1379	1279	1279
Total cold run time: 274844 ms
Total hot run time: 188727 ms

@doris-robot

ClickBench: Total hot run time: 31.59 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit bd1f81b3d1e8d6a1f7db89001ab9451817213af1, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.03	0.03	0.02
query2	0.08	0.03	0.03
query3	0.24	0.07	0.07
query4	1.61	0.10	0.10
query5	0.44	0.43	0.42
query6	1.18	0.66	0.65
query7	0.02	0.02	0.01
query8	0.04	0.04	0.03
query9	0.58	0.51	0.52
query10	0.56	0.58	0.56
query11	0.14	0.11	0.11
query12	0.14	0.10	0.10
query13	0.60	0.60	0.61
query14	2.71	2.74	2.90
query15	0.90	0.83	0.83
query16	0.39	0.38	0.39
query17	1.01	1.07	1.03
query18	0.22	0.21	0.22
query19	1.98	1.79	1.93
query20	0.01	0.01	0.01
query21	15.36	0.97	0.58
query22	0.77	0.80	0.89
query23	15.05	1.45	0.56
query24	3.34	1.72	2.13
query25	0.16	0.14	0.23
query26	0.19	0.15	0.13
query27	0.07	0.05	0.05
query28	14.85	0.94	0.43
query29	12.58	4.05	3.34
query30	0.25	0.10	0.06
query31	2.81	0.60	0.39
query32	3.23	0.56	0.45
query33	2.98	2.99	3.03
query34	16.75	5.22	4.47
query35	4.55	4.49	4.49
query36	0.66	0.49	0.48
query37	0.09	0.07	0.06
query38	0.04	0.04	0.03
query39	0.04	0.03	0.02
query40	0.18	0.13	0.13
query41	0.08	0.03	0.02
query42	0.03	0.02	0.02
query43	0.03	0.03	0.04
Total cold run time: 106.97 s
Total hot run time: 31.59 s

)
"""
sql """ insert into ${tableName} values (3, 'Carl','2024-12-29 10:11:12') """
def result1 = try_sql """
Collaborator

This should also check the case where all parameters are constants, to cover FE constant folding.

Collaborator

like: select cast(str_to_date('2024-12-29 10:11:12', '%Y-%m-%d %H:%i:%s.%f') as string);

Contributor Author

done

@Yulei-Yang (Contributor Author)

run buildall

@doris-robot

TPC-H: Total hot run time: 32730 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpch-tools
Tpch sf100 test result on commit 99722354ae8c20157efe11f90bac1e48575fbf64, data reload: false

------ Round 1 ----------------------------------
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	17608	5690	5761	5690
q2	2053	306	172	172
q3	10406	1250	731	731
q4	10209	975	544	544
q5	7558	2424	2196	2196
q6	195	164	135	135
q7	923	777	610	610
q8	9271	1411	1154	1154
q9	5265	4961	4889	4889
q10	6852	2343	1885	1885
q11	471	271	263	263
q12	350	371	220	220
q13	17762	3687	3134	3134
q14	223	231	208	208
q15	535	489	474	474
q16	633	605	582	582
q17	578	877	321	321
q18	7441	6640	6394	6394
q19	1627	947	543	543
q20	316	336	194	194
q21	2871	2289	2072	2072
q22	369	343	319	319
Total cold run time: 103516 ms
Total hot run time: 32730 ms

----- Round 2, with runtime_filter_mode=off -----
query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
q1	5621	5526	5527	5526
q2	238	331	247	247
q3	2338	2674	2342	2342
q4	1377	1850	1391	1391
q5	4315	4808	4768	4768
q6	176	159	129	129
q7	2114	2001	1852	1852
q8	2643	2814	2657	2657
q9	7280	7196	7376	7196
q10	2990	3310	2760	2760
q11	593	525	505	505
q12	704	736	618	618
q13	3553	3908	3330	3330
q14	276	313	286	286
q15	509	479	477	477
q16	672	717	658	658
q17	1254	1752	1268	1268
q18	7785	7554	7392	7392
q19	831	1145	1145	1145
q20	2044	2102	1871	1871
q21	5723	5329	4893	4893
q22	597	600	589	589
Total cold run time: 53633 ms
Total hot run time: 51900 ms

@doris-robot

TPC-DS: Total hot run time: 187521 ms
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/tpcds-tools
TPC-DS sf100 test result on commit 99722354ae8c20157efe11f90bac1e48575fbf64, data reload: false

query	cold (ms)	hot 1 (ms)	hot 2 (ms)	best hot (ms)
query1	955	385	367	367
query2	6514	2136	2038	2038
query3	6812	218	213	213
query4	33409	23379	23038	23038
query5	4454	634	467	467
query6	285	199	203	199
query7	4611	479	295	295
query8	284	230	215	215
query9	9298	2570	2577	2570
query10	479	309	241	241
query11	18356	15438	15130	15130
query12	153	109	103	103
query13	1657	532	372	372
query14	10590	6424	7525	6424
query15	249	201	189	189
query16	8046	572	453	453
query17	1611	711	558	558
query18	2115	395	295	295
query19	219	192	156	156
query20	118	110	111	110
query21	207	125	104	104
query22	4145	4236	4279	4236
query23	33988	33198	33020	33020
query24	6489	2353	2265	2265
query25	502	463	424	424
query26	1184	277	158	158
query27	2024	475	338	338
query28	5382	2424	2388	2388
query29	733	559	414	414
query30	241	180	166	166
query31	969	893	786	786
query32	95	70	59	59
query33	500	362	310	310
query34	758	859	518	518
query35	812	816	746	746
query36	1017	1010	949	949
query37	126	105	83	83
query38	4108	4200	4207	4200
query39	1436	1418	1421	1418
query40	205	121	107	107
query41	52	52	50	50
query42	119	107	105	105
query43	532	540	511	511
query44	1331	788	787	787
query45	190	171	163	163
query46	893	1052	667	667
query47	1820	1822	1791	1791
query48	383	398	313	313
query49	793	496	393	393
query50	646	691	400	400
query51	6960	6959	6812	6812
query52	105	101	92	92
query53	227	260	194	194
query54	484	489	414	414
query55	107	81	85	81
query56	269	277	243	243
query57	1180	1147	1094	1094
query58	252	232	245	232
query59	3106	3134	2792	2792
query60	278	275	261	261
query61	117	125	120	120
query62	873	721	648	648
query63	224	205	197	197
query64	4587	1103	753	753
query65	3270	3183	3205	3183
query66	1096	432	331	331
query67	15792	15449	15361	15361
query68	2308	837	554	554
query69	462	328	273	273
query70	1215	1151	1116	1116
query71	343	356	284	284
query72	5789	3877	3809	3809
query73	628	763	351	351
query74	10173	8896	9153	8896
query75	3173	3220	2773	2773
query76	2806	1192	786	786
query77	520	385	294	294
query78	10004	10215	9400	9400
query79	3463	819	581	581
query80	1714	534	452	452
query81	584	283	235	235
query82	367	201	128	128
query83	269	174	156	156
query84	241	94	77	77
query85	783	336	301	301
query86	481	317	286	286
query87	4507	4576	4328	4328
query88	4617	2162	2097	2097
query89	396	336	296	296
query90	1842	196	195	195
query91	137	146	110	110
query92	64	57	51	51
query93	2809	901	529	529
query94	736	408	301	301
query95	350	273	266	266
query96	489	610	282	282
query97	2836	2918	2751	2751
query98	229	202	198	198
query99	1268	1374	1255	1255
Total cold run time: 286425 ms
Total hot run time: 187521 ms

@doris-robot

ClickBench: Total hot run time: 30.77 s
machine: 'aliyun_ecs.c7a.8xlarge_32C64G'
scripts: https://github.com/apache/doris/tree/master/tools/clickbench-tools
ClickBench test result on commit 99722354ae8c20157efe11f90bac1e48575fbf64, data reload: false

query	cold (s)	hot 1 (s)	hot 2 (s)
query1	0.03	0.02	0.03
query2	0.08	0.03	0.04
query3	0.24	0.07	0.07
query4	1.62	0.10	0.10
query5	0.42	0.43	0.41
query6	1.15	0.65	0.65
query7	0.03	0.01	0.01
query8	0.04	0.03	0.03
query9	0.58	0.50	0.50
query10	0.56	0.57	0.55
query11	0.14	0.11	0.11
query12	0.14	0.11	0.11
query13	0.61	0.62	0.61
query14	2.81	2.77	2.77
query15	0.90	0.84	0.83
query16	0.39	0.38	0.38
query17	0.98	1.01	1.03
query18	0.23	0.22	0.21
query19	1.96	1.89	2.02
query20	0.02	0.01	0.02
query21	15.36	0.95	0.62
query22	0.75	0.92	0.72
query23	15.13	1.39	0.58
query24	2.67	0.81	1.01
query25	0.22	0.16	0.12
query26	0.26	0.15	0.14
query27	0.08	0.04	0.05
query28	14.09	1.07	0.43
query29	12.53	4.05	3.30
query30	0.25	0.09	0.07
query31	2.82	0.59	0.39
query32	3.23	0.54	0.46
query33	2.96	3.02	3.01
query34	16.62	5.15	4.47
query35	4.53	4.51	4.53
query36	0.66	0.49	0.49
query37	0.09	0.06	0.06
query38	0.05	0.04	0.04
query39	0.03	0.02	0.02
query40	0.16	0.14	0.12
query41	0.08	0.02	0.02
query42	0.04	0.03	0.02
query43	0.03	0.03	0.03
Total cold run time: 105.57 s
Total hot run time: 30.77 s

Contributor

PR approved by anyone and no changes requested.

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Jan 20, 2025
Contributor

PR approved by at least one committer and no changes requested.

@Yulei-Yang Yulei-Yang merged commit fbac364 into apache:master Jan 20, 2025
29 checks passed
@Yulei-Yang Yulei-Yang deleted the fix_str_to_date branch January 20, 2025 06:10
github-actions bot pushed a commit that referenced this pull request Jan 21, 2025
…d issue (#47129)

Yulei-Yang added a commit that referenced this pull request Jan 22, 2025
…g microsecond issue #47129 (#47261)

Cherry-picked from #47129

Co-authored-by: Yulei-Yang <yulei.yang0699@gmail.com>
dataroaring added a commit that referenced this pull request Jan 24, 2025
github-actions bot pushed a commit that referenced this pull request Jan 24, 2025
…d issue (#47129)

Labels
approved Indicates a PR has been approved by one committer. dev/2.1.9-merged dev/3.0.x reviewed
Development

Successfully merging this pull request may close these issues.

[Bug] str_to_date always return result with microsecond no matter you specfic %f in date format strin
8 participants