Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multiline messages support #121

Open
wants to merge 13 commits into
base: master
Choose a base branch
from
Open

Conversation

whoawhoawhoa
Copy link
Contributor

@whoawhoawhoa whoawhoawhoa commented Aug 3, 2020

Many services such as Oracle DB provide logs in format with specific separator. Users may want to process messages based on custom regex separator.
Other implementation examples:

Multiline messages support provided for file and stdin tailers.

@whoawhoawhoa whoawhoawhoa changed the title master Multiline messages support Aug 3, 2020
@coveralls
Copy link

coveralls commented Aug 3, 2020

Coverage Status

Coverage increased (+0.05%) to 65.423% when pulling 5021ddf on whoawhoawhoa:master into e898962 on fstab:master.

@fstab
Copy link
Owner

fstab commented Aug 5, 2020

Thanks a lot! This sounds like a valuable contribution. I'll look into the code in detail in the next couple of days.

Copy link
Owner

@fstab fstab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks again for your contribution. Here are a few review comments:

The command

grok_exporter -showconfig -config ./example/config.yml

should not add default values to the output. Please add something like this to the String() function in configV3.go:

if stripped.Input.LineDelimiter == "\\n" || stripped.Input.LineDelimiter == "\n" {
	stripped.Input.LineDelimiter = ""
}

In addition, use omitempty when declaring the YAML value for line delimiter:

LineDelimiter              string `yaml:"line_delimiter,omitempty"`

After doing that, please remove all the line_delimiter: \n in the examples in configV3_test.go, because they will not be needed anymore.

@@ -41,19 +44,32 @@ func (r *lineReader) ReadLine(file io.Reader) (string, bool, error) {
err error
buf = make([]byte, 512)
n = 0
reg = regexp.MustCompile(r.lineDelimiter)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the user has an invalid regular expression in line_delimiter, grok_exporter should fail on startup with a reasonable error message. Please add a check for this to func (c *InputConfig) validate() in configV3.go.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you compile the regular expression when loading the config, you might as well store the result so that you don't need to compile it again here.

Copy link
Owner

@fstab fstab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The line delimiter is a regex evaluated with Golang's regexp package, wich uses re2 syntax. The regular expressions in Grok patterns use the oniguruma regular expression library. I think it might lead to confusion that different regular expression syntaxes are used in different places of the configuration. Could you use Oniguruma for the line delimiter?

)
for {
newlinePos := bytes.IndexByte(r.remainingBytesFromLastRead, '\n')
if newlinePos >= 0 {
newlinesLoc := reg.FindAllIndex(r.remainingBytesFromLastRead, 2)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In almost all cases the line delimiter will be \n or \r\n, and in these cases it's overkill to evaluate regular expressions all the time. Please keep the original bytes.IndexByte() for performance reasons if the line delimiter is \n, and use the regular expression only if a different value for line delimiter is configured.

} else if err != nil {
if err == io.EOF {
return "", true, nil
result := make([]byte, len(r.remainingBytesFromLastRead))
Copy link
Owner

@fstab fstab Aug 11, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The application might be in the middle of writing a log line when we hit eof. If we return what we have read, we will return part of the line. As soon as the application finishes writing the line, we will return the other part of the line. That means, we cut the line in two. This should not happen. please leave

return "", true, nil

so that we wait until the line is complete.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants