Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding timezone support in ECS #213

Closed
2 of 3 tasks
webmat opened this issue Dec 4, 2018 · 9 comments
Closed
2 of 3 tasks

Adding timezone support in ECS #213

webmat opened this issue Dec 4, 2018 · 9 comments
Assignees
Labels

Comments

@webmat
Copy link
Contributor

webmat commented Dec 4, 2018

ECS is currently missing timezone support.

We used to have fields where the value should be defined as a differential in seconds, which we agreed to remove, as it was not very friendly (device.timezone.offset.sec and host.timezone.offset.sec).

In order to re-introduce the timezone, we need to pick a field name, and what values we support in there.

Fun facts:

  • Beats can either output:
    • a time differential in hh:mm format (e.g. -05:00)
    • or in abbreviated form (e.g. EST).
  • Ingest Node supports at least 3 input formats:
    • time differentials in hh:mm format
    • abbreviated timezones
    • canonical timezones (e.g. Europe/Amsterdam)
  • Logstash's date filter supports 2 formats, according to the docs:
    • time differentials in hh:mm format
    • canonical timezones (e.g. Europe/Amsterdam)

Here are a few questions that could help shape the direction here:

  • Can someone confirm whether Beats emits the correct time differential immediately after DST changes?
  • Should we limit which format goes into the timezone field to only one format, or let people use their preferred format, as long as it's one of the 3?
  • What field name(s) seem more appropriate?

@MikePaquette @ruflin @andrewkroh

@webmat
Copy link
Contributor Author

webmat commented Dec 4, 2018

My thoughts on the questions:

  • Field name: host.timezone and observer.timezone. We may have to define container.timezone as well.
  • Format:
    • I think limiting to one makes for better looking data in general, but I'm worried the various sources will not necessarily support our arbitrary pick out of the box, which may complicate adoption outside of Beats. Although we can always compensate with a small Ingest Node pipeline.
    • My favourite is the abbreviated format, if we are to pick only one.

I'm not the right person to figure out the time differential adjustment on DST change.

@webmat webmat self-assigned this Dec 4, 2018
@ruflin
Copy link
Member

ruflin commented Dec 4, 2018

The reasons we picked seconds in a previous version of ECS is that it makes it possible to calculate things and should be able to cover all the "in the middle" time zones. But I also see the value of the canonical time zone.

It will be the case that not all sources can provide the same format but what is the format we can convert all of these two during ingest time?

@webmat
Copy link
Contributor Author

webmat commented Dec 4, 2018

Well Ingest Node supports the 3 formats, so perhaps we can leave this open ended for now.

Actually, would people care if their event stream has a mix of timezone formats? I think they're useful mostly to adjust timestamps. I don't think people do aggregations on them or visualize them. It's just a low level bit of information meant for the pipeline to interpret the date correctly.

@webmat webmat added the discuss label Dec 5, 2018
@ruflin
Copy link
Member

ruflin commented Dec 6, 2018

If it's mainly for the pipeline, should we even have it in ECS?

@webmat
Copy link
Contributor Author

webmat commented Dec 6, 2018

Well in some pretty big use cases like default syslog messages, you don't get a full timestamp that includes the timezone.

So we've come up with workarounds like combining Beats' add_locale (example) processor and using that field in Ingest Node (example) to properly interpret the timestamp. I'd say it's pretty important that we support it in ECS. I'm sure there's more cases that would benefit from having the timezone.

It doesn't need to be a core field, though. Most modern solutions don't make that mistake and have complete timestamps.

@webmat
Copy link
Contributor Author

webmat commented Dec 6, 2018

What do you think about this, @MikePaquette ?

@ruflin
Copy link
Member

ruflin commented Dec 7, 2018

@webmat I think you made a good point around that most newer systems have the full timestamps and this is mainly for conversion.

Based on the above some thoughts on my end:

  • It's not about observer, container etc. timezone but the timezone of the @timestamp field, meaning it's about the event itself and there is only 1 field needed for conversion of the timestamp. So it should be event.timezone
  • Supported formats: I think we should support at least 2-3 different formats to make ingestions as easy as possible. As ingest supports 3 different one, lets support these.

Side note: Would be great if ingest processor could depend on event.timezone and @timestamp as defaults in the future.

@webmat
Copy link
Contributor Author

webmat commented Dec 7, 2018

Awesome. Will get that going shortly. I agree with event.timestamp

@ruflin
Copy link
Member

ruflin commented Dec 7, 2018

You mean event.timezone?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants