-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix mysql slowlog template to support mysql 5.7.22 #9647
Conversation
Remove default version qualifier and rename the environment variable to set it from `BEAT_VERSION_QUALIFIER` to `VERSION_QUALIFIER` this will align with other parts of the stack. **Tested with filebeat.** ``` ❯ ./filebeat version [08:39:01] filebeat version 7.0.0 (amd64), libbeat 7.0.0 [0a0c267 built 2018-11-19 13:38:15 +0000 UTC] ``` **Without the patch** ``` ❯ ./filebeat version [08:40:07] filebeat version 7.0.0-alpha1 (amd64), libbeat 7.0.0-alpha1 [b007837 built 2018-11-19 13:39:59 +0000 UTC] ``` Fixes: #8384
@@ -8,4 +8,4 @@ multiline: | |||
pattern: '^# User@Host: ' | |||
negate: true | |||
match: after | |||
exclude_lines: ['^[\/\w\.]+, Version: .* started with:.*'] # Exclude the header | |||
exclude_lines: ['^[\/\w\.]+, Version: .* started with:.*', '^# Time:.*'] # Exclude the header and time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I'm misunderstanding this, but should we be using this new Time:
header for @timestamp
instead of ignoring it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think there are some magic here that I don't understand. I'm imagining this exclude_lines is only ignoring it when first time parsing # User@Host
. After that somehow this Time
still get parsed into @timestamp
since it is captured in mysql/slowlog/test/mysql-5.7.22.log-expected.json
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did not investigate but I'm also a bit confused on why this keeps working. Perhaps the exlcude does not actually work here? Have a look at the pipeline.
Also it would be good to have in the test file below a few more lines to make sure it also works with multiple entries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added the needs_backport
label here.
@@ -8,4 +8,4 @@ multiline: | |||
pattern: '^# User@Host: ' | |||
negate: true | |||
match: after | |||
exclude_lines: ['^[\/\w\.]+, Version: .* started with:.*'] # Exclude the header | |||
exclude_lines: ['^[\/\w\.]+, Version: .* started with:.*', '^# Time:.*'] # Exclude the header and time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did not investigate but I'm also a bit confused on why this keeps working. Perhaps the exlcude does not actually work here? Have a look at the pipeline.
Also it would be good to have in the test file below a few more lines to make sure it also works with multiple entries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Could you rebase on master as #9777 is now merged to get CI green?
@@ -0,0 +1,79 @@ | |||
[ | |||
{ | |||
"@timestamp": "2018-08-07T08:27:47.000Z", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I worried at first that taking the unix timestamp instead of the title one we would loose the timezone diff. But seems like the timestamp is in UTC which is good.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ycombinator I remember you saw something different for ES for example?
@kaiyan-sheng Could you try to run mysql in a non UTC timezone environment and check if the timestamp is still correct. If not, we should do a follow up PR that adds convert_timezone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ruflin I used to start test environment under beats/testing/environments using make start
. Now instead of that, I started Elasticsearch locally on my macbook and ran GENERATE=1 INTEGRATION_TESTS=1 TESTING_FILEBEAT_MODULES=mysql nosetests -v --nocapture tests/system/test_modules.py
again. I think this tested the non UTC timezeon environment
scenario. Is that right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember you saw something different for ES for example?
The ES (and LS) logs contain ISO8601 timestamps without the timezone. In MySQL the value of the timestamp
system variable needs to be a Unix timestamp, which is in UTC already, so we're good here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the details @ycombinator
"log.offset": 254, | ||
"mysql.slowlog.host": "localhost", | ||
"mysql.slowlog.lock_time.sec": "0.000061", | ||
"mysql.slowlog.query": "SELECT count(*) FROM mysql.user WHERE user='root' and password='';\n# Time: 2018-08-07T16:27:47.169604+08:00", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for reviewing @ruflin ! One thing I don't understand is, in mysql.slowlog.query
there is # Time
which doesn't seem to belong here. But this happens in other logs to without this PR hmmm. Does this look right to you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Didn't catch this one. This indeed looks strange and should not be the case.
If this happened already before, I suggest we get this PR in with the current change as it already fixes and issue and introduce a follow up PR with a fix for the above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ruflin I ran the test to generate mysql-debian-5.7.19.log-expected.json
and here is what I got:
[
{
"@timestamp": "2018-04-26T18:50:32.000Z",
"event.dataset": "slowlog",
"event.module": "mysql",
"input.type": "log",
"log.flags": [
"multiline"
],
"log.offset": 0,
"mysql.slowlog.id": "5",
"mysql.slowlog.ip": "172.17.0.11",
"mysql.slowlog.lock_time.sec": "0.000033",
"mysql.slowlog.query": "SELECT intcol1,charcol1 FROM t1;\n# Time: 2018-04-26T18:50:32.437295Z",
"mysql.slowlog.query_time.sec": "0.000100",
"mysql.slowlog.rows_examined": "101",
"mysql.slowlog.rows_sent": "101",
"mysql.slowlog.timestamp": "1524768632",
"mysql.slowlog.user": "root"
}
]
I agree to get this PR in and then I will file a new bug to this separate issue to remove # Time
from query field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mentioned this happens in other logs too. Which ones were you referring to? Couldn't find any other in the postgres module with # Time
inside.
This seems to be related to the new log files we add here so should we also adjust the multiline pattern? I assume that is the problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ruflin I was indicating other log files in mysql module. Without this PR, the # Time
part still goes into mysql.slowlog.query
. I think with adding a or
operation in multiline.pattern will solve this issue. I will push a new commit to fix it in this same PR then.
@ruflin Can you take a look again with all the new small changes before I merge it please :-) Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Left one follow up note, not needed for this PR.
@@ -0,0 +1,79 @@ | |||
[ | |||
{ | |||
"@timestamp": "2018-08-07T08:27:47.000Z", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ycombinator I remember you saw something different for ES for example?
@kaiyan-sheng Could you try to run mysql in a non UTC timezone environment and check if the timestamp is still correct. If not, we should do a follow up PR that adds convert_timezone.
#9647) (#9843) * Fix mysql slowlog template to support mysql 5.7.22 (#9647) * Never default to a qualifier when none of them are set. (#9148) Remove default version qualifier and rename the environment variable to set it from `BEAT_VERSION_QUALIFIER` to `VERSION_QUALIFIER` this will align with other parts of the stack. **Tested with filebeat.** ``` ❯ ./filebeat version [08:39:01] filebeat version 7.0.0 (amd64), libbeat 7.0.0 [0a0c267 built 2018-11-19 13:38:15 +0000 UTC] ``` **Without the patch** ``` ❯ ./filebeat version [08:40:07] filebeat version 7.0.0-alpha1 (amd64), libbeat 7.0.0-alpha1 [b007837 built 2018-11-19 13:39:59 +0000 UTC] ``` Fixes: #8384 * Fix mysql slowlog template to support mysql 5.7.22 * Add changelog * Remove changes introduced from rebasing that are not related to this change * Add more log entries to test log file * Add OR in multiline.pattern to remove # Time from mysql.slowlog.query * Use CHANGELOG.next.asciidoc instead (cherry picked from commit b29ddfc) * Rerun test with GENERATE=1 and 6.x environment
Mysql slowlog only support starting with the header
/usr/local/Cellar/mysql/5.7.10/bin/mysqld, Version: 5.7.10 (Homebrew). started with:
which will be ignored or starting with# User@Host
. But now in mysql 5.7.22, there are logs starting with# Time
which is not supported by current pipeline. Solution I found is to add# Time
intoexclude_lines
so pipeline can still starts from# User@Host
.This PR closes issue #7892