-
Notifications
You must be signed in to change notification settings - Fork 813
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Register liveness check in gameservers.Controller #160
Conversation
Build Succeeded 👏 Build Id: 52e05777-3c73-48ae-868e-24efd6c4c5bc The following development artifacts have been built, and will exist for the next 30 days:
|
Build Succeeded 👏 Build Id: 59b401cd-ffa4-4646-8143-8898bbd33d17 The following development artifacts have been built, and will exist for the next 30 days:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! I had some comment, mainly around syntactic locking patterns within functions (I prefer to use defer
). Interested to hear your thoughts.
pkg/gameservers/controller.go
Outdated
@@ -82,7 +82,8 @@ func NewController( | |||
kubeInformerFactory informers.SharedInformerFactory, | |||
extClient extclientset.Interface, | |||
agonesClient versioned.Interface, | |||
agonesInformerFactory externalversions.SharedInformerFactory) *Controller { | |||
agonesInformerFactory externalversions.SharedInformerFactory, | |||
) *Controller { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason, I find this being on the next line kinda weird. I'm assuming you did this on purpose 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haha. I can put it back. It's not related. Just helps me parse the args faster.
pkg/gameservers/controller.go
Outdated
@@ -109,6 +110,12 @@ func NewController( | |||
c.recorder = eventBroadcaster.NewRecorder(scheme.Scheme, corev1.EventSource{Component: "gameserver-controller"}) | |||
|
|||
c.workerqueue = workerqueue.NewWorkerQueue(c.syncGameServer, c.logger, stable.GroupName+".GameServerController") | |||
health.AddLivenessCheck("game-server-worker-queue", healthcheck.Check(func() error { | |||
if !c.workerqueue.Healthy() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thought - would it make sense to have c.workerqueue.Healthy() return an error when it's not healthy - rather than a bool?
Then we could just do: health.AddLivenessCheck("game-server-worker-queue", c.workerqueue.Healthy)
Not wedded to it, but thought it might be worth discussion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a better idea. I'll make the change.
@@ -40,6 +45,10 @@ type WorkerQueue struct { | |||
queue workqueue.RateLimitingInterface | |||
// SyncHandler is exported to make testing easier (hack) | |||
SyncHandler Handler | |||
|
|||
mu sync.Mutex |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a RWLock? I suppose there will only be a single thread hitting the Healthy()
function, so there isn't much benefit to allow for multiple concurrent Healthy()
requests. WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm tempted to start with a plain mutex and leave open the possibility for improving performance of the WorkerQueue
once we understand the bottle necks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM!
} | ||
|
||
func (wq *WorkerQueue) run(stop <-chan struct{}) { | ||
wq.inc() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wq.inc()
defer wq.dec()
wait.Until(wq.runWorker, workFx, stop)
?
pkg/util/workerqueue/workerqueue.go
Outdated
} | ||
|
||
// Healthy reports whether all the worker goroutines are running. | ||
func (wq *WorkerQueue) Healthy() bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wq.mu.Lock()
defer wq.mu.Unlock()
return wq.workers == wq.running
?
In this case, this shows why I prefer a lock/defer unlock pattern, as once you past it, it shows much easier exactly what the logic of the function is. Otherwise you have to mentally unwind what want
and got
actually mean. In this case it's pretty easy, but be good to set a standard. WDYT?
} | ||
|
||
func (wq *WorkerQueue) setWorkerCount(n int) { | ||
wq.mu.Lock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wq.mu.Lock()
defer wq.mu.Unlock()
wq.workers = n
?
Really just being consistent in lock/unlock strategy through the code.
pkg/util/workerqueue/workerqueue.go
Outdated
wq.mu.Unlock() | ||
} | ||
|
||
func (wq *WorkerQueue) inc() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wq.mu.Lock()
defer wq.mu.Unlock()
wq.running++
?
pkg/util/workerqueue/workerqueue.go
Outdated
wq.mu.Unlock() | ||
} | ||
|
||
func (wq *WorkerQueue) dec() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wq.mu.Lock()
defer wq.mu.Unlock()
wq.running--
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy to use the defer
pattern.
Build Succeeded 👏 Build Id: 9e961156-17ee-42ca-bbc2-dda161ba8007 The following development artifacts have been built, and will exist for the next 30 days:
|
Build Succeeded 👏 Build Id: d2d16947-94cf-49cc-8a13-949c6ee2ad9c The following development artifacts have been built, and will exist for the next 30 days:
|
Build Succeeded 👏 Build Id: 47746dc9-6df4-4c24-b80d-11153abc1cd8 The following development artifacts have been built, and will exist for the next 30 days:
|
The liveness check is based on the worker queue having all its worker goroutines running. If one of those goroutines exits, the liveness check reports an unhealthy status. Fixes #116
Build Succeeded 👏 Build Id: a1f9ebfd-df17-495d-87da-87f484a7c0ed The following development artifacts have been built, and will exist for the next 30 days:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
The liveness check is based on the worker queue having all its worker
goroutines running. If one of those goroutines exits, the liveness check
reports an unhealthy status.
Fixes #116