Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix mysql slowlog template to support mysql 5.7.22 #9647

Merged
merged 19 commits into from
Dec 31, 2018
Merged

Fix mysql slowlog template to support mysql 5.7.22 #9647

merged 19 commits into from
Dec 31, 2018

Conversation

kaiyan-sheng
Copy link
Contributor

Mysql slowlog only support starting with the header /usr/local/Cellar/mysql/5.7.10/bin/mysqld, Version: 5.7.10 (Homebrew). started with: which will be ignored or starting with # User@Host. But now in mysql 5.7.22, there are logs starting with # Time which is not supported by current pipeline. Solution I found is to add # Time into exclude_lines so pipeline can still starts from # User@Host.

This PR closes issue #7892

ph and others added 8 commits November 27, 2018 10:51
Remove default version qualifier and rename the environment variable to set it from `BEAT_VERSION_QUALIFIER` to `VERSION_QUALIFIER` this will align with other parts of the stack.

**Tested with filebeat.**
```
 ❯ ./filebeat version                                                                                                                                                                                                                                                                                                                                          [08:39:01]
filebeat version 7.0.0 (amd64), libbeat 7.0.0 [0a0c267 built 2018-11-19 13:38:15 +0000 UTC]
```

**Without the patch**
```
 ❯ ./filebeat version                                                                                                                                                                                                                                                                                                                                          [08:40:07]
filebeat version 7.0.0-alpha1 (amd64), libbeat 7.0.0-alpha1 [b007837 built 2018-11-19 13:39:59 +0000 UTC]
```

Fixes: #8384
@kaiyan-sheng kaiyan-sheng requested a review from a team as a code owner December 18, 2018 22:10
@@ -8,4 +8,4 @@ multiline:
pattern: '^# User@Host: '
negate: true
match: after
exclude_lines: ['^[\/\w\.]+, Version: .* started with:.*'] # Exclude the header
exclude_lines: ['^[\/\w\.]+, Version: .* started with:.*', '^# Time:.*'] # Exclude the header and time
Copy link
Contributor

@andrewvc andrewvc Dec 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I'm misunderstanding this, but should we be using this new Time: header for @timestamp instead of ignoring it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I think there are some magic here that I don't understand. I'm imagining this exclude_lines is only ignoring it when first time parsing # User@Host. After that somehow this Time still get parsed into @timestamp since it is captured in mysql/slowlog/test/mysql-5.7.22.log-expected.json.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not investigate but I'm also a bit confused on why this keeps working. Perhaps the exlcude does not actually work here? Have a look at the pipeline.

Also it would be good to have in the test file below a few more lines to make sure it also works with multiple entries.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andrewvc @ruflin I looked into this more and found out @timestamp field is generated from mysql.slowlog.timestamp with format UNIX. That's why adding # Time into the exclude_lines and that didn't change the output.

@ruflin ruflin added module review Filebeat Filebeat Team:Integrations Label for the Integrations team needs_backport PR is waiting to be backported to other branches. labels Dec 19, 2018
Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added the needs_backport label here.

x-pack/auditbeat/auditbeat.reference.yml Outdated Show resolved Hide resolved
@@ -8,4 +8,4 @@ multiline:
pattern: '^# User@Host: '
negate: true
match: after
exclude_lines: ['^[\/\w\.]+, Version: .* started with:.*'] # Exclude the header
exclude_lines: ['^[\/\w\.]+, Version: .* started with:.*', '^# Time:.*'] # Exclude the header and time
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not investigate but I'm also a bit confused on why this keeps working. Perhaps the exlcude does not actually work here? Have a look at the pipeline.

Also it would be good to have in the test file below a few more lines to make sure it also works with multiple entries.

Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Could you rebase on master as #9777 is now merged to get CI green?

@@ -0,0 +1,79 @@
[
{
"@timestamp": "2018-08-07T08:27:47.000Z",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I worried at first that taking the unix timestamp instead of the title one we would loose the timezone diff. But seems like the timestamp is in UTC which is good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ycombinator I remember you saw something different for ES for example?
@kaiyan-sheng Could you try to run mysql in a non UTC timezone environment and check if the timestamp is still correct. If not, we should do a follow up PR that adds convert_timezone.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin I used to start test environment under beats/testing/environments using make start. Now instead of that, I started Elasticsearch locally on my macbook and ran GENERATE=1 INTEGRATION_TESTS=1 TESTING_FILEBEAT_MODULES=mysql nosetests -v --nocapture tests/system/test_modules.py again. I think this tested the non UTC timezeon environment scenario. Is that right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I remember you saw something different for ES for example?

The ES (and LS) logs contain ISO8601 timestamps without the timezone. In MySQL the value of the timestamp system variable needs to be a Unix timestamp, which is in UTC already, so we're good here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the details @ycombinator

"log.offset": 254,
"mysql.slowlog.host": "localhost",
"mysql.slowlog.lock_time.sec": "0.000061",
"mysql.slowlog.query": "SELECT count(*) FROM mysql.user WHERE user='root' and password='';\n# Time: 2018-08-07T16:27:47.169604+08:00",
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for reviewing @ruflin ! One thing I don't understand is, in mysql.slowlog.query there is # Time which doesn't seem to belong here. But this happens in other logs to without this PR hmmm. Does this look right to you?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't catch this one. This indeed looks strange and should not be the case.

If this happened already before, I suggest we get this PR in with the current change as it already fixes and issue and introduce a follow up PR with a fix for the above.

Copy link
Contributor Author

@kaiyan-sheng kaiyan-sheng Dec 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin I ran the test to generate mysql-debian-5.7.19.log-expected.json and here is what I got:

[
    {
        "@timestamp": "2018-04-26T18:50:32.000Z", 
        "event.dataset": "slowlog", 
        "event.module": "mysql", 
        "input.type": "log", 
        "log.flags": [
            "multiline"
        ], 
        "log.offset": 0, 
        "mysql.slowlog.id": "5", 
        "mysql.slowlog.ip": "172.17.0.11", 
        "mysql.slowlog.lock_time.sec": "0.000033", 
        "mysql.slowlog.query": "SELECT intcol1,charcol1 FROM t1;\n# Time: 2018-04-26T18:50:32.437295Z", 
        "mysql.slowlog.query_time.sec": "0.000100", 
        "mysql.slowlog.rows_examined": "101", 
        "mysql.slowlog.rows_sent": "101", 
        "mysql.slowlog.timestamp": "1524768632", 
        "mysql.slowlog.user": "root"
    }
]

I agree to get this PR in and then I will file a new bug to this separate issue to remove # Time from query field.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mentioned this happens in other logs too. Which ones were you referring to? Couldn't find any other in the postgres module with # Time inside.

This seems to be related to the new log files we add here so should we also adjust the multiline pattern? I assume that is the problem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin I was indicating other log files in mysql module. Without this PR, the # Time part still goes into mysql.slowlog.query. I think with adding a or operation in multiline.pattern will solve this issue. I will push a new commit to fix it in this same PR then.

@kaiyan-sheng
Copy link
Contributor Author

@ruflin Can you take a look again with all the new small changes before I merge it please :-) Thanks!

Copy link
Member

@ruflin ruflin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Left one follow up note, not needed for this PR.

@@ -0,0 +1,79 @@
[
{
"@timestamp": "2018-08-07T08:27:47.000Z",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ycombinator I remember you saw something different for ES for example?
@kaiyan-sheng Could you try to run mysql in a non UTC timezone environment and check if the timestamp is still correct. If not, we should do a follow up PR that adds convert_timezone.

@kaiyan-sheng kaiyan-sheng merged commit b29ddfc into elastic:master Dec 31, 2018
@kaiyan-sheng kaiyan-sheng deleted the mysql_slowlog branch December 31, 2018 20:49
kaiyan-sheng added a commit that referenced this pull request Jan 4, 2019
#9647) (#9843)

* Fix mysql slowlog template to support mysql 5.7.22 (#9647)

* Never default to a qualifier when none of them are set. (#9148)

Remove default version qualifier and rename the environment variable to set it from `BEAT_VERSION_QUALIFIER` to `VERSION_QUALIFIER` this will align with other parts of the stack.

**Tested with filebeat.**
```
 ❯ ./filebeat version                                                                                                                                                                                                                                                                                                                                          [08:39:01]
filebeat version 7.0.0 (amd64), libbeat 7.0.0 [0a0c267 built 2018-11-19 13:38:15 +0000 UTC]
```

**Without the patch**
```
 ❯ ./filebeat version                                                                                                                                                                                                                                                                                                                                          [08:40:07]
filebeat version 7.0.0-alpha1 (amd64), libbeat 7.0.0-alpha1 [b007837 built 2018-11-19 13:39:59 +0000 UTC]
```

Fixes: #8384

* Fix mysql slowlog template to support mysql 5.7.22

* Add changelog

* Remove changes introduced from rebasing that are not related to this change

* Add more log entries to test log file

* Add OR in multiline.pattern to remove # Time from mysql.slowlog.query

* Use CHANGELOG.next.asciidoc instead

(cherry picked from commit b29ddfc)

* Rerun test with GENERATE=1 and 6.x environment
@kaiyan-sheng kaiyan-sheng removed the needs_backport PR is waiting to be backported to other branches. label Jan 31, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Filebeat Filebeat module review Team:Integrations Label for the Integrations team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants