Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Short circuit date patterns after first match #83764

Merged

Conversation

danhermann
Copy link
Contributor

In other words, don't attempt to parse the date with the rest of the patterns after one matches.

Relates to #73918

@danhermann danhermann added >bug :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP v8.0.1 v8.2.0 v8.1.1 labels Feb 9, 2022
@elasticmachine elasticmachine added the Team:Data Management Meta label for data/management team label Feb 9, 2022
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@elasticsearchmachine
Copy link
Collaborator

Hi @danhermann, I've created a changelog YAML for you.

Comment on lines 102 to 104
for (Function<Map<String, Object>, Function<String, ZonedDateTime>> dateParser : dateParsers) {
int dateParserIndex = 0;
while (dateTime == null && dateParserIndex < dateParsers.size()) {
var dateParser = dateParsers.get(dateParserIndex);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand why we do it this way, why doesn't it work to do:

for (Function<stuff> dateParser : dateParsers) {
    try {
        dateTime = dateParser.apply(ingestDocument.getSourceAndMetadata()).apply(value);
        break;
    catch (Exception e) {
        ...
    }
}

To just short circuit for the first non-error-throwing date parser?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anything that terminates the loop after the parser succeeds will work.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, that seems more readable to me than a counter than increments and the null check, what do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like having the loop termination criteria explicitly specified upfront with a while loop but I don't feel that strongly about it.

@danhermann
Copy link
Contributor Author

@elasticmachine update branch

@elasticmachine
Copy link
Collaborator

expected head sha didn’t match current head ref.

@danhermann
Copy link
Contributor Author

@elasticmachine update branch

Copy link
Member

@dakrone dakrone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@dakrone
Copy link
Member

dakrone commented Feb 10, 2022

Should this go into 7.17.x also? It seems like a really unfortunate bug with a small and non-dangerous fix. What do you think?

@nemphys
Copy link

nemphys commented Apr 17, 2022

@dakrone FYI this was not at all a "non-dangerous fix"; I was banging my head in order to determine why I was not seeing events in my index after the update to 7.17.1, until I realised that the @timestamp field values of all new events were in microseconds (!) instead of milliseconds.

Turns out the formats order I was using in the ingest pipeline processor (which was ISO8601, UNIX, UNIX_MS, since I have 2 sources that send dates in either seconds or milliseconds format) had to be changed to ISO8601, UNIX_MS, UNIX, otherwise the milliseconds (not the seconds, those still worked fine) were converted to microseconds, pointing some thousands year in the future!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>bug :Data Management/Ingest Node Execution or management of Ingest Pipelines including GeoIP Team:Data Management Meta label for data/management team v7.17.1 v8.0.1 v8.1.1 v8.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants