Dynamic GOGC based on ready state #1194

Merged (6 commits) on Jan 14, 2019

Conversation

@Dieterbe (Contributor) commented Dec 29, 2018

This is a cleaned-up version of #1171; see that PR for more information and rationale.

I have tested it like so:

cd docker/docker-dev-custom-cfg-kafka
docker-compose up --force-recreate -V
# wait til ready...
fakemetrics backfill --kafka-mdm-addr localhost:9092 --offset 5h --period 10s --speedup 100 --mpo 20000

After that completes, restart MT with gc-percent-not-ready settings that gradually go down.
Here are the tested settings:

100
50
20
10

I always restarted right at the change of a minute.
If I were to do this again, I would restart a few seconds after the minute change, to give the instance a chance to save all chunks. Currently there's some missing data, but only from after the instance stabilized, so it doesn't really get in the way of the analysis.

snapshots:
https://snapshot.raintank.io/dashboard/snapshot/FoSdCguE9qi4eBXMZv3xxnRVW3EolcaN
https://snapshot.raintank.io/dashboard/snapshot/HsqiUL4vMEFFYf9oJDhLwve3Ibw5X248

I also used this patch:

diff --git a/cluster/node.go b/cluster/node.go
index 43bde4ee..940fded7 100644
--- a/cluster/node.go
+++ b/cluster/node.go
@@ -131,11 +131,14 @@ func (n HTTPNode) IsLocal() bool {
 // readyStateGCHandler adjusts the gcPercent value based on the node ready state
 func (n HTTPNode) readyStateGCHandler() {
        if gcPercent == gcPercentNotReady {
+               log.Infof("HANDLER. prio:%d, state:%q, total ready:%t. keeping stock GOGC", n.Priority, n.State, n.IsReady())
                return
        }
        if n.IsReady() {
+               log.Infof("HANDLER. prio:%d, state:%q, total ready:%t -> setting gc percent %d", n.Priority, n.State, true, gcPercent)
                debug.SetGCPercent(gcPercent)
        } else {
+               log.Infof("HANDLER. prio:%d, state:%q, total ready:%t -> setting gc percent %d", n.Priority, n.State, false, gcPercentNotReady)
                debug.SetGCPercent(gcPercentNotReady)
        }
 }

Here's the output of the last run:

2018-12-29 12:39:19.039 [INFO] HANDLER. prio:0, state:"NodeNotReady", total ready:false -> setting gc percent 10
2018-12-29 12:39:19.040 [INFO] HANDLER. prio:0, state:"NodeReady", total ready:true -> setting gc percent 100
2018-12-29 12:39:29.040 [INFO] HANDLER. prio:4357775, state:"NodeReady", total ready:false -> setting gc percent 10
2018-12-29 12:39:39.040 [INFO] HANDLER. prio:130, state:"NodeReady", total ready:false -> setting gc percent 10
2018-12-29 12:39:49.043 [INFO] HANDLER. prio:121, state:"NodeReady", total ready:false -> setting gc percent 10
2018-12-29 12:39:59.042 [INFO] HANDLER. prio:103, state:"NodeReady", total ready:false -> setting gc percent 10
2018-12-29 12:40:09.040 [INFO] HANDLER. prio:112, state:"NodeReady", total ready:false -> setting gc percent 10
2018-12-29 12:40:19.050 [INFO] HANDLER. prio:85, state:"NodeReady", total ready:false -> setting gc percent 10
2018-12-29 12:40:29.040 [INFO] HANDLER. prio:79, state:"NodeReady", total ready:false -> setting gc percent 10
2018-12-29 12:40:39.040 [INFO] HANDLER. prio:62, state:"NodeReady", total ready:false -> setting gc percent 10
2018-12-29 12:40:49.040 [INFO] HANDLER. prio:47, state:"NodeReady", total ready:false -> setting gc percent 10
2018-12-29 12:40:59.048 [INFO] HANDLER. prio:37, state:"NodeReady", total ready:false -> setting gc percent 10
2018-12-29 12:41:09.040 [INFO] HANDLER. prio:26, state:"NodeReady", total ready:false -> setting gc percent 10
2018-12-29 12:41:19.040 [INFO] HANDLER. prio:16, state:"NodeReady", total ready:false -> setting gc percent 10
2018-12-29 12:41:29.040 [INFO] HANDLER. prio:10, state:"NodeReady", total ready:true -> setting gc percent 100
2018-12-29 12:41:39.040 [INFO] HANDLER. prio:0, state:"NodeReady", total ready:true -> setting gc percent 100
2018-12-29 12:41:49.040 [INFO] HANDLER. prio:1, state:"NodeReady", total ready:true -> setting gc percent 100

Note that before it starts consuming data, the instance briefly marks itself as ready.
This should not happen on read-only nodes (they use a warm-up period), but I suspect the problem may still appear on read nodes if they don't start consuming the Kafka backlog in time; that is something to investigate further/separately.
But even for write nodes we don't want this to happen, as it means our GOGC control is temporarily not honored.

Conclusions as far as this feature goes:

  1. It works.
  2. With each run you can see that CPU usage goes up, GC runs go up, CPU spent in GC goes up, and less memory is needed.
  3. In my most extreme run, you trade a CPU increase from 265% to 440% (+66%) for a memory reduction from 450MB to 280MB (-38%). It also comes with an increase in ingestion time from 81s to 132s (+63%).
    Whether this is worth it depends. I suspect a milder setting, maybe 50, is good to get started.

Note: in real environments I expect the numbers to be different.

docs/config.md (review thread, resolved)
@woodsaj (Member) left a comment

I like this capability, but I don't like the idea of having a "Node" instance manipulating memory management tunables.

I think the code would be more maintainable if memory management was handled somewhere more central, i.e. its own package or github.com/grafana/metrictank/util. Ideally, I would like to see MT have a notification system that components can subscribe and publish to, so that any component can choose to be notified of events like the GC value changing, but that is outside the scope of this change.

We also need to write log messages when the GC value changes, and probably emit a metric of what the value is (probably best to do that in github.com/grafana/metrictank/stats/memory_reporter.go).

@replay (Contributor) commented Dec 31, 2018

Emitting a metric with the current setting is a really good idea; then we can easily compare the GOGC value with all the other metrics on one dashboard and watch what effect the changes have.

@Dieterbe (Contributor, Author) commented Jan 10, 2019

I think the code would be more maintainable if memory management was handled somewhere more central, i.e. its own package or github.com/grafana/metrictank/util. Ideally, I would like to see MT have a notification system that components can subscribe and publish to, so that any component can choose to be notified of events like the GC value changing, but that is outside the scope of this change.

Agreed on all counts, including an event system being out of scope here.
We could move the GC setting to another package, but the reality is that the Node is currently the de facto central place where we are informed about changes in ready state and priority, as well as where we track their current values. So merely moving the GC setting wouldn't buy us much, as it would still be driven by HTTPNode until we do a big refactor. Hence I would keep this as-is for now.

I'll add the log messages and metrics though.

@Dieterbe (Contributor, Author) commented:
There doesn't seem to be a way to query the runtime for the current GOGC value (!).
The closest I've found is https://golang.org/pkg/runtime/#MemStats NextGC (which is the target heap size at which the next GC cycle runs).
So instead of doing it from the memory reporter, we can track it with a gauge or something whenever we make changes (in the HTTPNode). I can put it in the memory.gc graphite namespace though, so it'll sit alongside the other metrics from the memory reporter.
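
A minimal sketch of that idea, assuming the stats package's Gauge32 constructor and Set method behave as they do elsewhere in MT; the helper and the memory.gc.gogc metric name are illustrative, not necessarily what this PR ships:

package cluster

import (
    "runtime/debug"

    "github.com/grafana/metrictank/stats"
)

// gogcGauge tracks the GOGC value we last applied, since the runtime cannot be
// queried for it. The metric name is illustrative; it sits under memory.gc so
// it lands next to the memory reporter's other metrics.
var gogcGauge = stats.NewGauge32("memory.gc.gogc")

// applyGCPercent sets the GC target and records the new value as a gauge.
func applyGCPercent(pct int) {
    debug.SetGCPercent(pct)
    gogcGauge.Set(pct)
}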

@woodsaj (Member) commented Jan 10, 2019

There doesn't seem to be a way to query the runtime for the current GOGC value

If you move the GC management into a separate package, you can just query that package.

E.g.:

package gc

import "runtime/debug"

var currentGC int

// CurrentGC returns the last GC percent that was set via SetGC.
func CurrentGC() int {
    return currentGC
}

// SetGC records and applies the new GC percent.
func SetGC(gc int) {
    currentGC = gc
    debug.SetGCPercent(gc)
}

Then you can have the cluster package call gc.SetGC() and have the stats package call gc.CurrentGC().
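
For illustration, a hypothetical sketch of that wiring on the cluster side, reusing the names from the diff above (the gc import path and the method name are assumptions, not part of this PR):

package cluster

import "github.com/grafana/metrictank/gc" // hypothetical package location

// applyReadyStateGC drives the gc package from the node's ready state, reusing
// gcPercent / gcPercentNotReady from the diff above; the stats package would
// then call gc.CurrentGC() when it reports.
func (n HTTPNode) applyReadyStateGC() {
    if n.IsReady() {
        gc.SetGC(gcPercent)
    } else {
        gc.SetGC(gcPercentNotReady)
    }
}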

@Dieterbe (Contributor, Author) commented:
True, but I'm trying to keep the stats package standalone and reusable for our other software.
But I guess making it depend on (import) a separate gc package isn't such a big problem.

@woodsaj (Member) commented Jan 10, 2019

What if we just set the GOGC env var whenever we change the GC settings? Then just have the stats package report on what the env var is set to. That should make it much more portable.
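
A minimal sketch of that idea, using only stdlib calls; the package and function names are illustrative, not from this PR:

package util

import (
    "os"
    "runtime/debug"
    "strconv"
)

// defaultGOGC is the Go runtime default when GOGC is unset.
const defaultGOGC = 100

// ApplyGCPercent sets the GC target and mirrors it into the GOGC env var,
// so any package (e.g. stats) can report it without MT-specific imports.
func ApplyGCPercent(pct int) {
    debug.SetGCPercent(pct)
    os.Setenv("GOGC", strconv.Itoa(pct))
}

// ReportedGCPercent returns the value currently advertised via GOGC,
// falling back to the Go default if the variable is unset or unparsable.
func ReportedGCPercent() int {
    if v, err := strconv.Atoi(os.Getenv("GOGC")); err == nil {
        return v
    }
    return defaultGOGC
}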

@Dieterbe (Contributor, Author) commented Jan 11, 2019

Okay, I added the metric and log for the GOGC change.
Preview of the dashboard update (check out the new row with the runtime stats including GOGC; there is some room for #1198 also):
https://snapshot.raintank.io/dashboard/snapshot/LY6e0kOtgl6j12IpkrRWTulpmcgm5Gn6?orgId=2

Also rebased on top of master.

@replay @woodsaj PTAL

@Dieterbe requested a review from replay on January 11, 2019 13:46
@replay (Contributor) left a comment

LGTM with two comments, neither of which is a blocker.

@Dieterbe changed the title from "Dynamic gogc based on ready state" to "Dynamic GOGC based on ready state" on Jan 11, 2019
@Dieterbe (Contributor, Author) commented:
Huh.

Merging can be performed automatically once the requested changes are addressed.

Pretty sure I did address them.

@woodsaj maybe if you approve now it'll work.

@Dieterbe deleted the dynamic-gogc branch on March 27, 2019 21:08