-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
QL: Eliminate internal type DATETIME_NANOS #68220
Conversation
Moving towards grouping of data types in the field caps API the internal data type `DATETIME_NANOS` introduced for `date_nanos` supports is eliminated. Relates: elastic#67722 Follows: elastic#67666
Pinging @elastic/es-ql (Team:QL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't the DATE_NANOS type be removed as well, per the PR description?
The main feedback that I have is for this PR to land after adding the transition to fields API.
if (values instanceof String) { | ||
return DateUtils.asDateTimeWithNanos(values.toString(), zoneId()); | ||
try { | ||
return DateUtils.asDateTimeWithNanos(values.toString(), zoneId()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd wait for the fields API from @astefan to land before going for this one.
Hopefully that eliminates the internal handling of data format and thus this try/catch as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess fields API will make this uniform, but just in case, an alternative might be to check the presence of the T
or .
in the string and based on that call the right parser.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was curious to see the perf difference of try/catch and trying to check the string:
Benchmark Mode Cnt Score Error Units
BenchmarkDateParse.testWithContains thrpt 25 1631774,366 ± 6016,080 ops/s
BenchmarkDateParse.testWithException thrpt 25 629767,727 ± 3843,439 ops/s
and it's quite big!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will use the contains("-")
since it's always there for Datetime stings and it's at the beginning of the string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and it's quite big!
I wonder if indexing the string wouldn't actually make a further difference: contains takes each character at a time, but the datetime format is fixed, so -
should always be at .charAt(4)
. Either way, I think it's fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hm, yeah, I rushed here, need to revisit:
- a timestamp can start with
-
(negative millis). .charAt(4)
is also not safe, as there are possible dates like:+14953-02-....
or-134-05-....
No problem to wait until the fields API transition. What do you mean |
@elasticmachine update branch |
I meant |
|
@elasticmachine run elasticsearch-ci/2 |
We don't need to extract date_nanos for |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM in general. I've left one comment about a method that seems to have sneaked in at merge time.
Also, in ExpressionTests
do we still need the testCastToDatetimeNanosNotSupported()
test?
private static boolean hasDocValues(DataType dataType) { | ||
return dataType == KEYWORD || dataType == DATETIME; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not needed.
@astefan Nice catch, removing the test, and indeed I missed this merge mistake with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@elasticmachine run elasticsearch-ci/2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More lines removed than added - that's nice.
Left a comment regarding the history in Github otherwise LGTM.
} | ||
// We ask @timestamp (or the defined alternative field) to be returned as `epoch_millis` | ||
// when matching sequence to avoid parsing into ZonedDateTime objects for performance reasons. | ||
return parseEpochMillisAsString(values.toString()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to this improvement.
|
||
private static NumberType numberType(DataType dataType) { | ||
return NumberType.valueOf(dataType.esType().toUpperCase(Locale.ROOT)); | ||
return values; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't the left side removed due to the use of fields API? Just double checking why the change appears to belong to this PR in Github.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was a merge mistake, it's reverted already.
Moving towards grouping of data types in the field caps API
the internal data type
DATETIME_NANOS
, introduced fordate_nanos
support, is eliminated.
Relates: #67722
Follows: #67666