Bug in JavaDateFormatter when using DOY #4285
Labels
bug, enhancement, Indexing & Search
Is your feature request related to a problem? Please describe.
Elasticsearch Version
7.10.2
Installed Plugins
No response
Java Version
1.8
OS Version
CentOS
Problem Description
A customer is experiencing a regression in Elasticsearch 7.10 in the handling of custom date formats. Elasticsearch does not return correct results when a date range query specifies an inclusive upper bound in a custom date format. This is a regression from Elasticsearch 6.x, where the underlying Joda-Time library handled our format by default (i.e., it was not "custom"). We updated the mappings for new indices in Elasticsearch 7, but Elasticsearch reports the same format that indexes successfully as invalid, and only when an inclusive upper bound is specified. The issue affects only indices created on Elasticsearch 7 clusters; indices migrated from Elasticsearch 6 are not affected. I will attach steps to reproduce to this ticket.
We store and query many of our date-time fields in a specific day-of-year format, e.g. 2022-111T00:00:00.000. Elasticsearch 6 automatically handled this format as a date, and our mappings did not need additional customization. The format is embedded in many tools and procedures, so updating all of them and retraining our teams is not feasible. The built-in formats for ES7 date fields are all slightly off; for example, they require the time zone to be specified, so downstream tools would need to change to include it in their searches too.
Here are more details:
For indices created on ES7 that will store these dates, we specify the following in the mappings: "format": "strict_date_optional_time||yyyy-DDD'T'HH:mm:ss.SSS"
For example, suppose a document exists with the property "timestamp"="2022-199T14:08:30.294" and its mapping set to "format": "strict_date_optional_time||yyyy-DDD'T'HH:mm:ss.SSS":
timestamp:[2022-199T14:08:30.294 TO 2022-200T14:48:05.538] returned some objects, but one expected object was missing.
Changing the query string to be exclusive on the upper end returns the result successfully:
timestamp:[2022-199T14:08:30.294 TO 2022-200T14:48:05.538} (exclusive on the upper end)
Keeping the inclusive upper bound and switching it to month-day format also works:
timestamp:[2022-199T14:08:30.294 TO 2022-07-19T14:48:05.538]
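The two upper bounds above should denote the same instant, since day 200 of 2022 is July 19. As a sanity check outside Elasticsearch, the following minimal sketch parses the day-of-year value with plain java.time, using the same pattern as the custom half of the mapping's "format", and compares it to the month-day value. The class and method names are hypothetical and are not part of Elasticsearch.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class DayOfYearCheck {
    // Same pattern as the custom half of the mapping's "format" setting.
    static final DateTimeFormatter DOY =
        DateTimeFormatter.ofPattern("yyyy-DDD'T'HH:mm:ss.SSS", Locale.ROOT);

    // Hypothetical helper: parse a day-of-year timestamp into a LocalDateTime.
    static LocalDateTime parseDayOfYear(String text) {
        return LocalDateTime.parse(text, DOY);
    }

    public static void main(String[] args) {
        LocalDateTime fromDoy = parseDayOfYear("2022-200T14:48:05.538");
        // LocalDateTime.parse without a formatter uses ISO_LOCAL_DATE_TIME.
        LocalDateTime fromMonthDay = LocalDateTime.parse("2022-07-19T14:48:05.538");
        System.out.println("same instant: " + fromDoy.equals(fromMonthDay));
    }
}
```

If the pattern itself is accepted by java.time, as it is here, the difference in query behavior is more likely to come from how the upper bound is processed than from the pattern.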
When the query with the inclusive, custom-formatted upper bound is translated to a DSL query, an error is returned indicating that the "format" in the mapping is not being respected:
"root_cause" :
[
{
"type" : "parse_exception",
"reason" : "failed to parse date field [2022-199T14:08:30.294] with format [strict_date_optional_time||yyyy-DDD'T'HH:mm:ss.SSS]: [failed to parse date field [2022-199T14:08:30.294] with format [strict_date_optional_time||yyyy-DDD'T'HH:mm:ss.SSS]]"
}
],
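One plausible explanation, which this report does not confirm, is that the parser used for rounding up inclusive upper bounds fills in default values for missing date fields, and that those defaults clash with a parsed day-of-year. The sketch below reproduces that kind of conflict in plain java.time. The month and day-of-month defaults are an assumption chosen for illustration, and the class and method names are hypothetical; this is not Elasticsearch's actual code.

```java
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;
import java.util.Locale;

public class RoundUpDefaultsSketch {
    static final String PATTERN = "yyyy-DDD'T'HH:mm:ss.SSS";

    // Plain formatter, comparable to what is used when indexing.
    static DateTimeFormatter plain() {
        return DateTimeFormatter.ofPattern(PATTERN, Locale.ROOT);
    }

    // Assumed stand-in for a round-up parser: defaults month and day-of-month
    // when the pattern does not supply them.
    static DateTimeFormatter withMonthDayDefaults() {
        return new DateTimeFormatterBuilder()
            .appendPattern(PATTERN)
            .parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
            .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
            .toFormatter(Locale.ROOT);
    }

    // Returns true if the text parses and resolves without error.
    static boolean parses(DateTimeFormatter formatter, String text) {
        try {
            formatter.parse(text);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "2022-199T14:08:30.294";
        System.out.println("plain: " + parses(plain(), text));
        // The defaults resolve to January 1, which conflicts with day-of-year 199.
        System.out.println("with defaults: " + parses(withMonthDayDefaults(), text));
    }
}
```

In this sketch the same text and pattern succeed with the plain formatter and fail once the defaults are added, which would be consistent with a format that indexes correctly but is rejected for an inclusive upper bound.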
Our customer also filed this as an issue on the public Elasticsearch GitHub repository: elastic/elasticsearch#89096