-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Set non-UTC timezone for filebeat modules testing #13874
Conversation
Using UTC timezone for filebeat modules testing hides some issues when modifying code to parse timestamps without timezones. Use a non-UTC timezone to facilitate the detection of this kind of issues.
@@ -140,7 +140,7 @@ def run_on_file(self, module, fileset, test_file, cfgfile): | |||
output.write(" ".join(cmd) + "\n") | |||
|
|||
local_env = os.environ.copy() | |||
local_env["TZ"] = 'Etc/UTC' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only change in code, the rest of changes are the result when regenerating golden files.
Why exactly all timezones are -2 hours and not minus 200? 🙂 Can you develop, please? |
@@ -140,7 +140,7 @@ def run_on_file(self, module, fileset, test_file, cfgfile): | |||
output.write(" ".join(cmd) + "\n") | |||
|
|||
local_env = os.environ.copy() | |||
local_env["TZ"] = 'Etc/UTC' | |||
local_env["TZ"] = 'Etc/GMT+2' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nice to add a comment here explaining why this timezone
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree, comment added.
@@ -18,7 +18,7 @@ | |||
"event.action": "access_granted", | |||
"event.dataset": "elasticsearch.audit", | |||
"event.module": "elasticsearch", | |||
"event.timezone": "+00:00", | |||
"event.timezone": "-02:00", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure this is correct. I think this should remain +00:00
.
If you look at the first sample log entry in test-audit-730.log
, it has the following field:
"timestamp":"2019-06-11T05:21:08,484-0700"
The corresponding field in test-audit-730.log-expected.json
is:
"@timestamp": "2019-06-11T12:21:08.484Z"
So wouldn't we want event.timezone
here to be +00:00
?
@@ -8,7 +8,7 @@ | |||
"event.action": "anonymous_access_denied", | |||
"event.dataset": "elasticsearch.audit", | |||
"event.module": "elasticsearch", | |||
"event.timezone": "+00:00", | |||
"event.timezone": "-02:00", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to my other comment: https://github.com/elastic/beats/pull/13874/files#r330526232.
@@ -8,7 +8,7 @@ | |||
"elasticsearch.node.name": "es1_1", | |||
"event.dataset": "elasticsearch.deprecation", | |||
"event.module": "elasticsearch", | |||
"event.timezone": "+00:00", | |||
"event.timezone": "-02:00", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to my other comment: https://github.com/elastic/beats/pull/13874/files#r330526232.
@@ -6,7 +6,7 @@ | |||
"elasticsearch.node.name": "node-0", | |||
"event.dataset": "elasticsearch.server", | |||
"event.module": "elasticsearch", | |||
"event.timezone": "+00:00", | |||
"event.timezone": "-02:00", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to my other comment: https://github.com/elastic/beats/pull/13874/files#r330526232.
@@ -16,7 +16,7 @@ | |||
"event.dataset": "elasticsearch.slowlog", | |||
"event.duration": 3000000, | |||
"event.module": "elasticsearch", | |||
"event.timezone": "+00:00", | |||
"event.timezone": "-02:00", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to my other comment: https://github.com/elastic/beats/pull/13874/files#r330526232.
@@ -19,7 +19,7 @@ | |||
"event.dataset": "elasticsearch.slowlog", | |||
"event.duration": 6000000, | |||
"event.module": "elasticsearch", | |||
"event.timezone": "+00:00", | |||
"event.timezone": "-02:00", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to my other comment: https://github.com/elastic/beats/pull/13874/files#r330526232.
@@ -16,7 +16,7 @@ | |||
"event.dataset": "elasticsearch.slowlog", | |||
"event.duration": 4000000, | |||
"event.module": "elasticsearch", | |||
"event.timezone": "+00:00", | |||
"event.timezone": "-02:00", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to my other comment: https://github.com/elastic/beats/pull/13874/files#r330526232.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I started reviewing the changes in this PR for stack monitoring modules. I left a few comments but started noticing a pattern in them so I'll leave a more general comment now.
In the cases where we are ingesting JSON logs, the timestamp field in the source log already has time zone information. In that case, it looks like the ingest pipeline is converting the timestamp to UTC before setting the final value in the @timestamp
field. Further, the value that's set in the @timestamp
field has the Z
time zone identifier for UTC. As such, we should either set event.timezone
to +00:00
(as it was before this PR) or omit that field entirely, according to ECS:
This field should be populated when the event’s timestamp does not include timezone information already (e.g. default Syslog timestamps). It’s optional otherwise.
@ycombinator I agree that there are issues to fix, specially in modules. This PR is intended only to change module tests to use non-UTC timezones, to give visibility to problems that we hadn't seen when everything was using UTC. Fixes in modules will come in follow ups (#13877). I would like to merge this change in tests before further fixes, so we can check in follow ups that timestamps actually change to proper values. Regarding the use of The case of the stack modules are a bit special, because they have two log formats, one of them contains the timezone in the format and the other one doesn't, and we set |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Chatted with @jsoriano off-PR. The changes in this PR LGTM (for stack modules). There will be a follow up PR in which we will modify the JSON-processing ingest pipelines for stack (and maybe other?) modules to drop the event.timezone
field set by the add_locale
processor. This will address my concerns in this PR.
BTW, the core issue here is that ECS specifies event.timezeone
to be the time zone of the event, whereas the add_locale
processor sets event.timezone
to be the timezone where the agent is running. We think the ideal fix is to modify the add_locale
processor to set something like agent.timezone
but a) this field is not in ECS at the moment and b) this would be a breaking change for the processor so we must wait to do it only in the next major release.
Using UTC timezone for filebeat modules testing hides some issues when
modifying code to parse timestamps without timezones. Use a non-UTC
timezone to facilitate the detection of this kind of issues.