Use a line-buffered logger to deamplify write syscalls #6954
Conversation
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+ ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0%
Force-pushed from 76e0dc4 to 10322f0
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+ ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0%
Refactor usages of InitLogger
Signed-off-by: Danny Kopping <danny.kopping@grafana.com>
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+ ingester 0%
+ distributor 0%
+ querier 0%
+ querier/queryrange 0%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0.1%
./tools/diff_coverage.sh ../loki-main/test_results.txt test_results.txt ingester,distributor,querier,querier/queryrange,iter,storage,chunkenc,logql,loki

Change in test coverage per package. Green indicates 0 or positive change, red indicates that test coverage for a package fell.

+ ingester 0%
+ distributor 0%
+ querier 0%
- querier/queryrange -0.1%
+ iter 0%
+ storage 0%
+ chunkenc 0%
+ logql 0%
+ loki 0.1%
```go
// buffered logger settings
var (
	logEntries uint32 = 256 // buffer up to 256 log lines in memory before flushing to a write(2) syscall
```
Would be nice if these could be configured.
Maybe in a subsequent PR we can make it configurable, but I don't think it's worth it personally.
It's probably already tuned ~optimally, so I don't think a user would need to configure it.
I'm more concerned about OOMed processes. If a process runs out of memory, it'll be killed and we might lose important logs to assess what made the memory of the process increase.

That's true, but generally logs are not useful in diagnosing problems of this nature - continuous profiling would be the solution here.
This looks great, a welcome addition.
One other thought. Do we want to be able to flush this buffer when the /flush endpoint on an ingester is called (or any other stateful component we do manual scale downs of)?
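For illustration, here is a minimal sketch of what such a flush hook could look like, assuming the line-buffered writer exposes a `Flush()` method; the `logflush` package and `Handler` function below are hypothetical, not code from this PR:

```go
package logflush

import "net/http"

// flusher is the one behaviour we need from the line-buffered writer;
// any writer with a Flush method satisfies it.
type flusher interface {
	Flush() error
}

// Handler returns an http.HandlerFunc that forces buffered log lines out
// to the underlying writer. It could be chained into the ingester's
// /flush endpoint so a manual scale-down also drains the log buffer.
func Handler(f flusher) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		if err := f.Flush(); err != nil {
			http.Error(w, err.Error(), http.StatusInternalServerError)
			return
		}
		w.WriteHeader(http.StatusNoContent)
	}
}
```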
```go
// buffered logger settings
var (
	logEntries    uint32 = 256  // buffer up to 256 log lines in memory before flushing to a write(2) syscall
	logBufferSize uint32 = 10e6 // 10MB
```
What happens if 256 entries go over 10MB? Will the buffer also flush on size?
The preallocated buffer is just to prevent slice resizing, and because it's reused it prevents unnecessary GC.
If 10MB is exceeded then the buffer's underlying slice will grow, so nothing will really happen - it won't flush on size.
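To make the flush-on-count-plus-timer behaviour discussed here concrete, below is a minimal sketch of the technique; it is not the fork's actual implementation, and the `linebuffer` package and its names are invented for illustration:

```go
package linebuffer

import (
	"bytes"
	"io"
	"sync"
	"time"
)

// Writer accumulates whole log lines in a reusable in-memory buffer and
// flushes them with a single Write to the underlying writer, either when
// the entry count reaches maxEntries or when the periodic timer fires.
type Writer struct {
	mu         sync.Mutex
	out        io.Writer
	buf        bytes.Buffer // preallocated once, reused across flushes
	entries    uint32
	maxEntries uint32
}

func New(out io.Writer, maxEntries uint32, sizeHint int, period time.Duration) *Writer {
	w := &Writer{out: out, maxEntries: maxEntries}
	w.buf.Grow(sizeHint) // e.g. 10MB; avoids resizing, though the buffer may still grow past it
	go func() {
		for range time.NewTicker(period).C { // e.g. 100ms periodic flush
			w.mu.Lock()
			_ = w.flush()
			w.mu.Unlock()
		}
	}()
	return w
}

// Write buffers one log line; go-kit loggers issue one Write per line.
func (w *Writer) Write(p []byte) (int, error) {
	w.mu.Lock()
	defer w.mu.Unlock()
	n, err := w.buf.Write(p)
	if err != nil {
		return n, err
	}
	w.entries++
	if w.entries >= w.maxEntries { // flush on entry count, not on byte size
		err = w.flush()
	}
	return n, err
}

// flush drains the buffer in (typically) one write syscall; callers must hold mu.
func (w *Writer) flush() error {
	if w.entries == 0 {
		return nil
	}
	_, err := w.out.Write(w.buf.Bytes())
	w.buf.Reset() // retains capacity, so the preallocated slice is reused
	w.entries = 0
	return err
}
```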
```go
	var writer io.Writer
	if buffered {
		// TODO: it's technically possible here to lose logs between the 100ms flush and the process being killed
		// => call buf.Flush() in a signal handler if this is a concern, but this is unlikely to be a problem
```
Is there a cost to also calling Flush in a signal handler? Not necessary for this PR, but it feels like a pretty cheap and easy way to hedge against this edge case.
No real cost, except last time I looked the signal handler is buried inside the weaveworks/common lib and it'll be a bit of a chore to work around that. Maybe I'm wrong though?
Thanks for the review!
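As a rough illustration of the hedge discussed above, a termination-signal flush could look something like the sketch below, assuming the buffered writer exposes a `Flush()` method; the `FlushOnExit` helper is hypothetical, and as noted, Loki's real signal handling lives inside the weaveworks/common lib:

```go
package logflush

import (
	"os"
	"os/signal"
	"syscall"
)

// FlushOnExit drains the log buffer before the process dies from
// SIGINT/SIGTERM; flush is assumed to be the buffered writer's Flush method.
func FlushOnExit(flush func() error) {
	c := make(chan os.Signal, 1)
	signal.Notify(c, syscall.SIGINT, syscall.SIGTERM)
	go func() {
		<-c
		_ = flush() // best-effort: push any buffered lines out before exiting
		os.Exit(1)
	}()
}
```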
Promtail is using the line-buffered logger (#6954), and log messages were not being printed on exit
What this PR does / why we need it:
We initialise a global logger in `pkg/util/log/log.go` and use it extensively throughout the Loki codebase. Every time we write a log message, a `write` syscall is invoked. Syscalls are problematic because they transition the process from userspace to kernelspace, which means:

- a context-switch is incurred, which is inherently expensive ([1-2 microseconds](https://eli.thegreenplace.net/2018/measuring-context-switching-and-memory-overheads-for-linux-threads/))
- the goroutine executing the code is **blocked**
- the underlying OS thread (_M_ in the go scheduler model) is **also blocked**
- the goroutine has to be rescheduled once the syscall exits
- the go scheduler may need to spawn additional OS threads if all are blocked in syscalls - which can also be expensive

This change introduces the use of a line-buffered logger. It has a buffer of [256](https://gist.github.com/dannykopping/0704db32c0b08751d1d2494efaa734c2) entries, and once that buffer is filled it will flush to disk. However, a situation will arise in which that buffer remains somewhat empty for a period of time, so there is a periodic flush mechanism, configured to flush every 100ms. There is also a preallocated bytes slice of 10MB which is reused, to avoid excessive slice resizing & garbage collection.

This does mean that we could lose up to 256 log messages in case of an ungraceful termination of the process, but this would need to be precisely timed within the 100ms flushes - in other words, the likelihood is low, and generally we shouldn't `kill -9` any Loki process.

Which issue(s) this PR fixes:
N/A
Special notes for your reviewer:
This PR uses my private fork of `go-kit/log`; once we've validated this to work in our environment I will attempt to upstream the change.

Line-buffering is common, and in Linux the `stdout` stream is line-buffered when attached to a terminal.

We also currently make use of the `log.NewSyncWriter` writer, which is totally unnecessary since `write` syscalls are threadsafe by nature, and we also don't need to worry about out-of-order logs anymore. Removing this should also eliminate some contention, and there's a config option to control this separately while we test this out.

Checklist

- `CHANGELOG.md`
- `docs/sources/upgrading/_index.md`
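To illustrate how the buffered option and the removal of `log.NewSyncWriter` described above might be wired together at initialisation, here is a hedged sketch; the `NewLogger` signature and `newBufferedWriter` parameter are assumptions for illustration, not the PR's actual API:

```go
package logutil

import (
	"io"
	"os"

	"github.com/go-kit/log"
)

// NewLogger builds the process-wide logger. With buffered=true the writer
// batches lines as described above; otherwise every log line costs one
// write syscall. There is no log.NewSyncWriter wrapper: write(2) is atomic
// for whole lines, and the buffered writer serialises access internally.
func NewLogger(buffered bool, newBufferedWriter func(io.Writer) io.Writer) log.Logger {
	var writer io.Writer = os.Stderr
	if buffered {
		writer = newBufferedWriter(writer)
	}
	return log.NewLogfmtLogger(writer)
}
```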