Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

promtail: Allow docker_sd_config to handle log lines longer than 64k #6256

Merged
merged 1 commit into from
Jun 3, 2022
Merged

promtail: Allow docker_sd_config to handle log lines longer than 64k #6256

merged 1 commit into from
Jun 3, 2022

Conversation

srstsavage
Copy link
Contributor

@srstsavage srstsavage commented May 27, 2022

What this PR does / why we need it:
In promtail's docker_sd_config, parsing of a container's
logs would fail if any log line was over 64k bytes in length.
This is because bufio.Scanner uses a 64k buffer by default.
Since there doesn't appear to be a way to elegantly catch
bufio.ErrTooLong and carry on, the parsing approach was changed
to bufio.Reader which can handle lines of arbitrary length.

Which issue(s) this PR fixes:
Fixes #6257

Special notes for your reviewer:
Originally I considered increasing the buffer or allowing for a config parameter to do the same, but that would only raise the ceiling of parsable log line lengths. Since there doesn't seem to be a way to catch bufio.ErrTooLong when using Scanner, it seemed better to just go with the Reader approach and handle lines of arbitrary length.

Log lines of this length can be problematic by nature and it may be prudent to add a configuration to truncate parsed log lines to a certain maximum length, but I think that concern is separate enough to avoid conflating it here.

Checklist

  • [-] Documentation added
  • Tests updated
  • [-] Is this an important fix or new feature? Add an entry in the CHANGELOG.md.
  • [-] Changes that require user attention or interaction to upgrade are documented in docs/sources/upgrading/_index.md

In promtail's docker_sd_config, parsing of a container's
logs would fail if any log line was over 64k bytes in length.
This is because bufio.Scanner uses a 64k buffer by default.
Since there doesn't appear to be a way to elegantly catch
`bufio.ErrTooLong` and carry on, the parsing approach was changed
to bufio.Reader which can handle lines of arbitrary length.
@grafanabot
Copy link
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

Copy link
Contributor

@chaudum chaudum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix!

Comment on lines +151 to +165
// https://devmarkpro.com/working-big-files-golang
func readLine(r *bufio.Reader) (string, error) {
var (
isPrefix = true
err error
line, ln []byte
)

for isPrefix && err == nil {
line, isPrefix, err = r.ReadLine()
ln = append(ln, line...)
}

return string(ln), err
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mind putting this function into a utility package, e.g. clients/pkg/promtail/common? I think it may be used in various places.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine either way on that actually, we can also wait until we need it somewhere else to figure out the best package for it to go into, or what the interface should look like, ie. there might be more duplication to reduce if we refactor our the section in process for processing a whole file?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm fine either way, happy to move it if that's the desired approach...

Copy link
Collaborator

@trevorwhitney trevorwhitney left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the improvement, LGTM!

Comment on lines +151 to +165
// https://devmarkpro.com/working-big-files-golang
func readLine(r *bufio.Reader) (string, error) {
var (
isPrefix = true
err error
line, ln []byte
)

for isPrefix && err == nil {
line, isPrefix, err = r.ReadLine()
ln = append(ln, line...)
}

return string(ln), err
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine either way on that actually, we can also wait until we need it somewhere else to figure out the best package for it to go into, or what the interface should look like, ie. there might be more duplication to reduce if we refactor our the section in process for processing a whole file?

@@ -89,7 +89,7 @@ func Test_TargetManager(t *testing.T) {
require.True(t, ta.Ready())

require.Eventually(t, func() bool {
return len(entryHandler.Received()) >= 5
return len(entryHandler.Received()) >= 6
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what caused this value to change? I don't see any changes to the test setup.

Copy link
Contributor Author

@srstsavage srstsavage Jun 1, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I inserted a Very Long Log Line into flog.log at line 6 to test against (over 65k). I had to increase by one here to read in that extra line.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, and that's a binary file which is why I'm not seeing the diff, thanks!

Copy link
Collaborator

@slim-bean slim-bean left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

promtail docker_sd_config: long lines crash promtail parsing
5 participants