Bug in JavaDateFormatter when using DOY #4285

Closed
minalsha opened this issue Aug 23, 2022 · 2 comments
Labels: bug, enhancement, Indexing & Search

Comments

@minalsha
Contributor

minalsha commented Aug 23, 2022

Is your feature request related to a problem? Please describe.
Elasticsearch Version
7.10.2

Installed Plugins
No response

Java Version
1.8

OS Version
CentOS

Problem Description
A customer is experiencing a regression in ES v7.10 in the handling of custom date formats. Elasticsearch does not return correct results when a date range query is specified with an inclusive upper bound in a custom date format. This is a regression from Elasticsearch 6.x, where the underlying Joda time library handled our format by default (i.e., it was not "custom"). We updated the mappings for new indices in Elasticsearch 7, but Elasticsearch claims that the same format that indexes successfully is not valid, and only when an upper bound is specified. This issue only affects indices created on Elasticsearch 7 clusters; indices migrated from Elasticsearch 6 are not affected. I will attach steps to reproduce to this ticket.

We store and query many of our date-time fields with a specific day-of-year format, e.g. 2022-111T00:00:00.000. Elasticsearch 6 automatically handled this format as a date, and our mappings did not need additional customization. The format is embedded in many tools and procedures, so updating all of them and retraining our teams is not feasible. The built-in formats for ES7 date fields are all slightly off; for example, they require the timezone to be specified, so downstream tools would need to change to include it in their searches too.

Here are more details:

For indices we are creating on ES7, we are specifying this in the mappings for indices that will store these dates: "format": "strict_date_optional_time||yyyy-DDD'T'HH:mm:ss.SSS"

For example, if a document exists with a property "timestamp"="2022-199T14:08:30.294" with its mapping set to "format": "strict_date_optional_time||yyyy-DDD'T'HH:mm:ss.SSS":

timestamp:[2022-199T14:08:30.294 TO 2022-200T14:48:05.538] returned some objects, but not the one in question.

Changing query string to be exclusive on the upper end returns the result successfully:

timestamp:[2022-199T14:08:30.294 TO 2022-200T14:48:05.538} (exclusive on the upper end)

Keeping the inclusive upper bound and switching it to month-day format also works:

timestamp:[2022-199T14:08:30.294 TO 2022-07-19T14:48:05.538]
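To make the equivalence between the two upper bounds explicit, here is a minimal sketch in plain `java.time` (not Elasticsearch code; the class and method names are only illustrative) that converts a year and day-of-year to an ISO month-day date:

```java
import java.time.LocalDate;

public class DayOfYearEquivalence {
    // Hypothetical helper: converts year + day-of-year to an ISO yyyy-MM-dd string.
    static String toIsoDate(int year, int dayOfYear) {
        return LocalDate.ofYearDay(year, dayOfYear).toString();
    }

    public static void main(String[] args) {
        // Day 200 of 2022 is the same calendar day as 2022-07-19,
        // so both upper bounds in the queries above refer to the same instant.
        System.out.println(toIsoDate(2022, 200)); // 2022-07-19
        System.out.println(toIsoDate(2022, 199)); // 2022-07-18
    }
}
```

Since both spellings name the same instant, the difference in results points at how the day-of-year form is parsed for the upper bound rather than at the data.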

When the query with the inclusive upper bound and the custom-formatted date is translated to a DSL query, an error is returned that indicates the "format" in the mapping is not being respected:

"root_cause" :
[
  {
    "type" : "parse_exception",
    "reason" : "failed to parse date field [2022-199T14:08:30.294] with format [strict_date_optional_time||yyyy-DDD'T'HH:mm:ss.SSS]: [failed to parse date field [2022-199T14:08:30.294] with format [strict_date_optional_time||yyyy-DDD'T'HH:mm:ss.SSS]]"
  }
],

*** Our customer also filed this on the public Elasticsearch GitHub as issue elastic/elasticsearch#89096. ***

@minalsha added the bug, enhancement, and untriaged labels on Aug 23, 2022
@Vishalks
Contributor

Vishalks commented Sep 8, 2022

I am currently looking at this.

@Vishalks
Contributor

I missed updating this with my findings because I was focused on getting my local instance to run the unit tests without errors. The problem lies in:

this(format, printer, builder -> ROUND_UP_BASE_FIELDS.forEach(builder::parseDefaulting), parsers);
Java throws a conflict here because we default the day to 1, while the day that Java extracts from the input is 104 (the correct one). In my view, the fix would be to default the day of year to 1 when the input format is yyyy-DDD. I will update this ticket once I am able to set up the unit tests to run on my end.
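The conflict described above can be reproduced outside Elasticsearch with plain `java.time`. The following is a minimal sketch, not the actual `JavaDateFormatter` code: it assumes the round-up defaults include month-of-year and day-of-month set to 1 (as the comment implies), it uses the JDK's default SMART resolver style, and the class and method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.time.temporal.ChronoField;
import java.util.Locale;

public class DayOfYearDefaultingSketch {
    static final String PATTERN = "yyyy-DDD'T'HH:mm:ss.SSS";

    // Assumption: mimics the month/day-of-month part of the round-up defaults.
    // The defaulted 01-01 resolves to day-of-year 1, which conflicts with the parsed DDD.
    static DateTimeFormatter monthDayDefaults() {
        return new DateTimeFormatterBuilder()
            .appendPattern(PATTERN)
            .parseDefaulting(ChronoField.MONTH_OF_YEAR, 1)
            .parseDefaulting(ChronoField.DAY_OF_MONTH, 1)
            .toFormatter(Locale.ROOT);
    }

    // Sketch of the proposed direction: default DAY_OF_YEAR instead.
    // parseDefaulting only applies when the field was not parsed, so there is no conflict.
    static DateTimeFormatter dayOfYearDefault() {
        return new DateTimeFormatterBuilder()
            .appendPattern(PATTERN)
            .parseDefaulting(ChronoField.DAY_OF_YEAR, 1)
            .toFormatter(Locale.ROOT);
    }

    // Returns the parsed value as text, or a string starting with "error: " on failure.
    static String tryParse(DateTimeFormatter formatter, String text) {
        try {
            return LocalDateTime.parse(text, formatter).toString();
        } catch (DateTimeParseException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        String input = "2022-199T14:08:30.294";
        // Fails: the defaulted month/day disagree with the parsed day of year.
        System.out.println(tryParse(monthDayDefaults(), input));
        // Succeeds: day 199 of 2022 resolves to July 18.
        System.out.println(tryParse(dayOfYearDefault(), input));
    }
}
```

Whether the real fix should add a day-of-year default or simply skip the month and day-of-month defaults for day-of-year patterns is a design choice for the maintainers; the sketch only shows that the conflict disappears once those two defaults are not applied.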


4 participants