cmd/scollector: add a kill switch for total memory used by scollector #1866
@@ -245,17 +245,27 @@ func main() {
 }
 collect.MaxQueueLen = conf.MaxQueueLen
 }
-maxMemMegaBytes := uint64(500)
+maxMemMB := uint64(500)
I'm not sure I love the default of having the kill switch active. May surprise some people.
It has always been active, but only for runtime memory. This adds a second check for total memory usage.
We were using over 1GB on many systems (around 80GB total on all systems) due to a WMI leak, which could have caused other systems to fail if we hadn't noticed.
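To illustrate the two checks described above, here is a minimal sketch, not the code from this PR. It assumes the limit is compared in bytes against `runtime.MemStats.Alloc` for the runtime check; the helper names (`exceedsLimit`, `runtimeAlloc`) and the 1200 MB total figure are made up for the example.

```go
package main

import (
	"fmt"
	"runtime"
)

const bytesPerMB = 1024 * 1024

// exceedsLimit reports whether usedBytes is strictly above a limit given in megabytes.
// Hypothetical helper, not part of scollector.
func exceedsLimit(usedBytes, maxMB uint64) bool {
	return usedBytes > maxMB*bytesPerMB
}

// runtimeAlloc returns the bytes of heap objects currently allocated by the Go runtime.
// Memory leaked outside the runtime (for example in CGO or WMI) does not show up here.
func runtimeAlloc() uint64 {
	var m runtime.MemStats
	runtime.ReadMemStats(&m)
	return m.Alloc
}

func main() {
	maxMemMB := uint64(500)

	// Check 1: memory the Go runtime knows about.
	fmt.Println("runtime over limit:", exceedsLimit(runtimeAlloc(), maxMemMB))

	// Check 2: total process memory. The value is a made-up stand-in for
	// a figure reported by the operating system.
	totalBytes := uint64(1200 * bytesPerMB)
	fmt.Println("total over limit:", exceedsLimit(totalBytes, maxMemMB))
}
```

With a leak outside the runtime, the first check can stay under the limit while the second one trips, which is the case the second check is meant to catch.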
Very nice. LGTM.
This only works if the process monitoring collectors are enabled, but it matches on the PID returned by os.Getpid, so I don't think you need to be explicitly monitoring scollector. I'm going to manually deploy this to a few systems, including ny-bosun01 (the largest scollector memory usage on Linux in our environment), to make sure there aren't any issues. We can then merge it on Monday.
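The PID matching mentioned above could look roughly like the sketch below. The `procSample` type, its fields, and `selfMemory` are hypothetical stand-ins for whatever the process collectors report; only `os.Getpid` is a real call taken from the comment.

```go
package main

import (
	"fmt"
	"os"
)

// procSample is a hypothetical record of one monitored process,
// standing in for the data gathered by the process collectors.
type procSample struct {
	PID      int
	RSSBytes uint64
}

// selfMemory returns the memory of the sample whose PID matches pid.
// The boolean is false when no sample matches, for example when the
// process collectors are disabled and produce no samples at all.
func selfMemory(samples []procSample, pid int) (uint64, bool) {
	for _, s := range samples {
		if s.PID == pid {
			return s.RSSBytes, true
		}
	}
	return 0, false
}

func main() {
	self := os.Getpid()

	// Made-up samples: some other process and this one.
	samples := []procSample{
		{PID: self + 1, RSSBytes: 4 << 20},
		{PID: self, RSSBytes: 620 << 20},
	}
	if rss, ok := selfMemory(samples, self); ok {
		fmt.Printf("pid %d uses %d MB\n", self, rss>>20)
	}

	// With no samples there is nothing to compare against the limit.
	if _, ok := selfMemory(nil, self); !ok {
		fmt.Println("no process samples; total-memory check cannot run")
	}
}
```

Because the lookup keys on the process's own PID, no configuration entry naming scollector is needed, but the check is silent if nothing collects process data.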
722b493 to ccfad48
It appears we don't log panics on Windows, so I changed the error from a panic to a fatal. I also added #1867 to find a way to log panics. The fatal error will look like this:
ccfad48 to 1764425
… leaks outside the runtime
1764425 to aad1661
This should help prevent memory leaks in CGO or WMI from causing scollector to consume too much memory.