
Use a line-buffered logger to deamplify write syscalls #6954

Merged
dannykopping merged 6 commits into grafana:main from dannykopping/buffered-logs on Oct 4, 2022

Conversation

Contributor

@dannykopping dannykopping commented Aug 23, 2022

What this PR does / why we need it:
We initialise a global logger in pkg/util/log/log.go and use it extensively throughout the Loki codebase. Every time we write a log message, a write syscall is invoked. Syscalls are problematic because they transition the process from userspace to kernelspace, which means:

  • a context-switch is incurred, which is inherently expensive (1-2 microseconds)
  • the goroutine executing the code is blocked
  • the underlying OS thread (M in the go scheduler model) is also blocked
  • the goroutine has to be rescheduled once the syscall exits
  • the go scheduler may need to spawn additional OS threads if all are blocked in syscalls - which can also be expensive

This change introduces a line-buffered logger. It has a buffer of 256 entries, and once that buffer fills it flushes to disk. The buffer can also sit partially filled for some time, so there is a periodic flush mechanism as well, configured to flush every 100ms. A preallocated 10MB byte slice is reused between flushes to avoid excessive slice resizing and garbage collection.
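For illustration, here is a rough sketch of the idea (this is not the actual implementation in the go-kit/log fork; names, types and flush-policy details are simplified): log lines accumulate in a preallocated buffer and are written out with a single syscall, either when the entry count hits the threshold or when the periodic flush fires.

```go
package main

import (
	"io"
	"os"
	"sync"
	"time"
)

// lineBufferedWriter batches log lines in a preallocated buffer and writes
// them out with one write syscall, either when the entry count reaches a
// threshold or when the periodic flush timer fires.
type lineBufferedWriter struct {
	mu         sync.Mutex
	out        io.Writer
	buf        []byte // preallocated and reused between flushes
	entries    uint32
	maxEntries uint32
}

func newLineBufferedWriter(out io.Writer, maxEntries, bufSize uint32, period time.Duration) *lineBufferedWriter {
	w := &lineBufferedWriter{out: out, buf: make([]byte, 0, bufSize), maxEntries: maxEntries}
	go func() {
		// periodic flush so a partially filled buffer never lingers for long
		for range time.Tick(period) {
			_ = w.Flush()
		}
	}()
	return w
}

func (w *lineBufferedWriter) Write(p []byte) (int, error) {
	w.mu.Lock()
	defer w.mu.Unlock()
	w.buf = append(w.buf, p...)
	w.entries++
	if w.entries >= w.maxEntries {
		return len(p), w.flushLocked()
	}
	return len(p), nil
}

func (w *lineBufferedWriter) Flush() error {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.flushLocked()
}

func (w *lineBufferedWriter) flushLocked() error {
	if len(w.buf) == 0 {
		return nil
	}
	_, err := w.out.Write(w.buf) // one write(2) for up to maxEntries lines
	w.buf = w.buf[:0]            // keep the capacity, drop the contents
	w.entries = 0
	return err
}

func main() {
	w := newLineBufferedWriter(os.Stderr, 256, 10_000_000, 100*time.Millisecond)
	if _, err := w.Write([]byte("level=info msg=\"hello\"\n")); err != nil {
		panic(err)
	}
	_ = w.Flush()
}
```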

This does mean we could lose up to 256 log messages if the process terminates ungracefully, but the termination would have to land precisely between two 100ms flushes - in other words, the likelihood is low, and generally we shouldn't kill -9 any Loki process.

Which issue(s) this PR fixes:
N/A

Special notes for your reviewer:
This PR uses my private fork of go-kit/log; once we've validated this to work in our environment I will attempt to upstream the change.

Line-buffering is common, and in Linux the stdout stream is line-buffered when attached to a terminal.

We also currently wrap the output in log.NewSyncWriter, which is unnecessary since write syscalls are inherently thread-safe, and with a single buffered writer we no longer need to worry about out-of-order logs. Removing it should also eliminate some contention, and there's a separate config option to control this while we test it out.
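For reference, go-kit/log's NewSyncWriter simply serialises writes behind a mutex. A minimal sketch of the writer selection follows (the flag name and function are illustrative, not the actual config option added here):

```go
package main

import (
	"io"
	"os"

	"github.com/go-kit/log"
)

// newLogger picks the output writer based on a toggle. Wrapping in
// log.NewSyncWriter serialises writes behind a mutex, which is redundant if
// the underlying writer is already safe for concurrent use.
func newLogger(syncWrites bool) log.Logger {
	var w io.Writer = os.Stderr
	if syncWrites {
		w = log.NewSyncWriter(w)
	}
	return log.NewLogfmtLogger(w)
}

func main() {
	logger := newLogger(false)
	_ = logger.Log("msg", "hello")
}
```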

Checklist

  • Documentation added
  • Tests updated
  • Is this an important fix or new feature? Add an entry in the CHANGELOG.md.
  • Changes that require user attention or interaction to upgrade are documented in docs/sources/upgrading/_index.md

@grafanabot
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

Danny Kopping added 3 commits October 1, 2022 10:55
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
@grafanabot
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0%

Refactor usages of InitLogger

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
@pull-request-size pull-request-size bot added size/L and removed size/M labels Oct 3, 2022
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
@grafanabot
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
+ querier/queryrange	0%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0.1%

Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
@grafanabot
Collaborator

./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+           ingester	0%
+        distributor	0%
+            querier	0%
- querier/queryrange	-0.1%
+               iter	0%
+            storage	0%
+           chunkenc	0%
+              logql	0%
+               loki	0.1%

@dannykopping dannykopping marked this pull request as ready for review October 3, 2022 07:56
@dannykopping dannykopping requested a review from a team as a code owner October 3, 2022 07:56

// buffered logger settings
var (
logEntries uint32 = 256 // buffer up to 256 log lines in memory before flushing to a write(2) syscall
Contributor

Would be nice if these could be configured.

Contributor Author

Maybe in a subsequent PR we can make it configurable, but I don't think it's worth it personally.
It's probably already tuned ~optimally, so I don't think a user would need to configure it.

@salvacorts
Contributor

we shouldn't kill -9 any Loki process.

I'm more concerned about OOMed processes. If a process runs out of memory, it'll be killed and we might lose important logs to assess what made the memory of the process increase.

@dannykopping
Contributor Author

we shouldn't kill -9 any Loki process.

I'm more concerned about OOMed processes. If a process runs out of memory, it'll be killed and we might lose important logs to assess what made the memory of the process increase.

That's true, but generally logs are not useful in diagnosing problems of this nature - continuous profiling would be the solution here.

Collaborator

@trevorwhitney trevorwhitney left a comment

This looks great, a welcome addition.

One other thought. Do we want to be able to flush this buffer when the /flush endpoint on an ingester is called (or any other stateful component we do manual scale downs of)?

// buffered logger settings
var (
logEntries uint32 = 256 // buffer up to 256 log lines in memory before flushing to a write(2) syscall
logBufferSize uint32 = 10e6 // 10MB
Collaborator

What happens if 256 entries go over 10MB? Will the buffer also flush on size?

Contributor Author

The preallocated buffer is just to prevent slice resizing, and because it's reused it prevents unnecessary GC.
If 10MB is exceeded then the buffer's underlying slice will grow, so nothing will really happen - it won't flush on size.
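A minimal illustration of that point (sizes and variable names are just for the example): appends reuse the preallocated backing array and only grow it if the capacity is exceeded; there is no size-based flush.

```go
package main

import (
	"fmt"
	"os"
)

func main() {
	// Preallocate ~10MB once; appends reuse the backing array and only grow it
	// if the capacity is exceeded - there is no size-based flush.
	buf := make([]byte, 0, 10_000_000)
	lines := [][]byte{[]byte("first log line\n"), []byte("second log line\n")}
	for _, line := range lines {
		buf = append(buf, line...)
	}
	_, _ = os.Stderr.Write(buf) // one syscall for the whole batch
	buf = buf[:0]               // reset length, keep capacity for reuse
	fmt.Println("remaining capacity:", cap(buf))
}
```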

var writer io.Writer
if buffered {
// TODO: it's technically possible here to lose logs between the 100ms flush and the process being killed
// => call buf.Flush() in a signal handler if this is a concern, but this is unlikely to be a problem
Collaborator

Is there a cost to also calling Flush in a signal handler? Not necessary for this PR, but it feels like a pretty cheap and easy way to hedge against this edge case.

Contributor Author

No real cost, except last time I looked the signal handler is buried inside the weaveworks/common lib and it'll be a bit of a chore to work around that. Maybe I'm wrong though?
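For context, a sketch of the kind of shutdown flush being discussed (illustrative only - the weaveworks/common signal handling isn't shown, and `flusher` is a stand-in for whatever exposes the buffered writer's Flush method):

```go
package main

import (
	"os"
	"os/signal"
	"syscall"
)

// flusher stands in for the line-buffered writer's flush capability.
type flusher interface {
	Flush() error
}

// flushOnShutdown flushes any buffered log lines when the process receives
// SIGINT or SIGTERM, then exits.
func flushOnShutdown(f flusher) {
	sigs := make(chan os.Signal, 1)
	signal.Notify(sigs, syscall.SIGINT, syscall.SIGTERM)
	go func() {
		<-sigs
		_ = f.Flush() // best-effort flush before exit
		os.Exit(1)
	}()
}

func main() {
	// wire flushOnShutdown(bufferedWriter) during startup in real code
}
```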

Contributor Author

@dannykopping dannykopping left a comment

Thanks for the review!


@dannykopping
Contributor Author

One other thought. Do we want to be able to flush this buffer when the /flush endpoint on an ingester is called (or any other stateful component we do manual scale downs of)?

Calling the /flush endpoint will trigger a bunch of actions which will produce logs themselves - ultimately we just need a shutdown handler here to cover all cases I believe.

@dannykopping dannykopping merged commit 6bf5b5d into grafana:main Oct 4, 2022
@dannykopping dannykopping deleted the dannykopping/buffered-logs branch October 4, 2022 11:47
@dannykopping dannykopping mentioned this pull request Oct 12, 2022
dannykopping pushed a commit that referenced this pull request Oct 12, 2022
Promtail is using the line-buffered logger (#6954), and log messages were not being printed on exit
@dannykopping dannykopping mentioned this pull request Jul 28, 2023