Support multi containers health checks #2054

jie-bao · 2021-04-09T02:22:22Z

Is your feature request related to a problem? Please describe.
I have a GameServer that runs two Unreal servers (two containers) in the same pod. Both are necessary for running the game, and neither can know the other's health status. With the current Agones design, a call to the client SDK health API from either container marks the GameServer healthy, regardless of whether the other one is healthy.

Describe the solution you'd like
Ideally, we want Agones to support health checks across multiple containers.

Describe alternatives you've considered
Currently, I use a sidecar that collects their health status and then reports it to Agones. I did the same thing for the Ready API.
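A minimal sketch of how such a health-collecting sidecar could decide when to report, assuming each container pings the sidecar and the sidecar forwards one ping to Agones per time window only if every required container reported in that window. The type `HealthAggregator`, its methods, the container names, and `forwardHealth` are hypothetical names for illustration; the real forwarding call to the Agones SDK and the timer that drives `Tick` are left out.

```go
package main

import (
	"fmt"
	"sync"
)

// HealthAggregator tracks which required containers have pinged during the
// current window. Hypothetical type, not part of Agones.
type HealthAggregator struct {
	mu       sync.Mutex
	required []string
	seen     map[string]bool
}

// NewHealthAggregator creates an aggregator for the given container names.
func NewHealthAggregator(required ...string) *HealthAggregator {
	return &HealthAggregator{required: required, seen: make(map[string]bool)}
}

// Beat records a health ping from one container. Unknown names are ignored.
func (a *HealthAggregator) Beat(name string) {
	a.mu.Lock()
	defer a.mu.Unlock()
	for _, r := range a.required {
		if r == name {
			a.seen[name] = true
			return
		}
	}
}

// Tick closes the current window. It reports whether every required
// container pinged during the window, then resets for the next one.
func (a *HealthAggregator) Tick() bool {
	a.mu.Lock()
	defer a.mu.Unlock()
	all := true
	for _, r := range a.required {
		if !a.seen[r] {
			all = false
		}
	}
	a.seen = make(map[string]bool)
	return all
}

func main() {
	agg := NewHealthAggregator("game", "ai")
	// forwardHealth stands in for the real health call to the Agones SDK.
	forwardHealth := func() { fmt.Println("forwarding health ping to Agones") }

	// Window 1: only one container pinged, so nothing is forwarded.
	agg.Beat("game")
	if agg.Tick() {
		forwardHealth()
	} else {
		fmt.Println("window incomplete, ping withheld")
	}

	// Window 2: both containers pinged, so one ping is forwarded.
	agg.Beat("game")
	agg.Beat("ai")
	if agg.Tick() {
		forwardHealth()
	}
}
```

In a real sidecar, `Tick` would be called from a timer whose interval is shorter than the GameServer's configured health check period, so that a single missing container eventually causes the health check to fail.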

roberthbailey · 2021-04-09T15:43:49Z

Can you say a bit more about how the two containers are related? I'm curious how it is that they must live in the same pod (be colocated) but that they don't have a strong enough dependency for either one to know if the other is unhealthy.

markmandel · 2021-04-09T17:50:48Z

Ooh! Interesting! I'm always intrigued to hear about how people are deconstructing the dedicated games server monolith!

I am wondering one thing though: Given the SDK is accessible on localhost, is there a reason you couldn't have both containers send health pings, with appropriate timing such that the GameServer moves to unhealthy if one fails?

roberthbailey · 2021-04-09T18:28:16Z

That might work for health checks but if Ready shouldn't be triggered until both are ready, then we can't do that today. And my brain goes straight into how we could design this, but I'll hold off on inserting those comments until it's appropriate. :)

jie-bao · 2021-04-12T02:19:10Z

> Can you say a bit more about how the two containers are related? I'm curious how it is that they must live in the same pod (be colocated) but that they don't have a strong enough dependency for either one to know if the other is unhealthy.

We have two UE servers handling different workloads: one dedicated to AI and one for the rest of the main logic. They serve clients through something like a proxy, and they don't know each other's status. In addition, the "proxy" can't tell whether a server is alive either, because it can't tell whether a server disconnection is a normal operation or the result of a problem.

jie-bao · 2021-04-12T02:21:00Z

> That might work for health checks but if Ready shouldn't be triggered until both are ready, then we can't do that today. And my brain goes straight into how we could design this, but I'll hold off on inserting those comments until it's appropriate. :)

I think Ready is easy. We know that two containers need to be ready, so we can just wait for two Ready calls from different containers.
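The "wait for two Ready calls from different containers" idea can be sketched as a small gate that fires once, when the last required container has reported. This is only an illustration: `ReadyGate`, `MarkReady`, and the container names are hypothetical, and the actual Ready call to Agones is represented by a print statement.

```go
package main

import (
	"fmt"
	"sync"
)

// ReadyGate waits for a Ready report from each required container.
// Hypothetical type, not part of Agones.
type ReadyGate struct {
	mu      sync.Mutex
	pending map[string]bool
	fired   bool
}

// NewReadyGate creates a gate that opens once all named containers report.
func NewReadyGate(required ...string) *ReadyGate {
	pending := make(map[string]bool, len(required))
	for _, name := range required {
		pending[name] = true
	}
	return &ReadyGate{pending: pending}
}

// MarkReady records a Ready report from one container. It returns true
// exactly once: on the call that completes the set. Repeated reports from
// the same container do not count twice.
func (g *ReadyGate) MarkReady(name string) bool {
	g.mu.Lock()
	defer g.mu.Unlock()
	delete(g.pending, name)
	if g.fired || len(g.pending) > 0 {
		return false
	}
	g.fired = true
	return true
}

func main() {
	gate := NewReadyGate("game", "ai")
	for _, name := range []string{"game", "game", "ai"} {
		if gate.MarkReady(name) {
			// This is where a sidecar would call Ready on the Agones SDK.
			fmt.Println("all containers ready after report from", name)
		} else {
			fmt.Println("still waiting after report from", name)
		}
	}
}
```

Tracking containers by name rather than counting calls means a container that reports Ready twice cannot open the gate on its own.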

markmandel · 2021-04-12T23:39:18Z

> That might work for health checks but if Ready shouldn't be triggered until both are ready, then we can't do that today. And my brain goes straight into how we could design this, but I'll hold off on inserting those comments until it's appropriate. :)

Yeah, I wasn't worrying about Ready, since that's not a ticketed issue at this stage 🙂

This is one of those situations where I know we could work out a fix, but I'm also wondering if we should ??? 🤔

Or at least - should we in core Agones? Having some open source sidecars to solve these kinds of generic problems might be a more flexible option, at least until we see more dominant patterns of multi-game-server usage.

jie-bao · 2021-04-14T01:56:44Z

I'm fine with the sidecar solution. But I wonder whether Agones has something like a "plugin" model that can integrate customized functions more seamlessly, so that game developers don't need to call different APIs for related functions.

In my situation, the UE server calls my sidecar's Ready/Health API, while it still calls the WatchGameServer API through the Agones client SDK. That means it needs to know two different service endpoints, which makes the game code a bit more complicated.

markmandel · 2021-04-14T23:33:21Z

> But I wonder whether Agones has something like a "plugin" model that can integrate customized functions more seamlessly, so that game developers don't need to call different APIs for related functions.

Can you expand further on what you mean by this -- what would this look like in your mind?

jie-bao · 2021-04-15T03:19:07Z

Ideally, game developers shouldn't need to know too much about the infrastructure. Using only the Agones SDK is fine, but using the Agones SDK together with a customized sidecar is a bit fussy. In particular, we need to explain why they should call the Ready/Health API provided by the sidecar and must not call the Ready/Health API in the Agones SDK.

So I think it would help if Agones provided a way to redirect or proxy some API calls to another container, or allowed injecting customized logic into those APIs. But I don't have a clear idea of how to implement it.

markmandel · 2021-04-15T16:43:57Z

Sounds like a better solution may be rolling your own SDK for your specific platform, one that quite potentially wraps the Agones SDK and overrides only the pieces you need.

Then you can redirect what you need where you need it to go?

I also can't think of a good plugin mechanism for the SDK 🤔 Although you still have full access to the Agones/Kubernetes API - which may also be an interesting way to go.
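One way the wrapper idea could look in Go, as a sketch under stated assumptions: the interface `GameServerSDK` is a hypothetical subset of the calls a game server makes (it is not the Agones SDK's own type), `WrappedSDK` and `toSidecar` are illustrative names, and `recordingSDK` is a stand-in used only to show where each call ends up. Embedding the interface lets every call pass through unchanged except the ones the wrapper overrides.

```go
package main

import "fmt"

// GameServerSDK is a hypothetical subset of the calls a game server makes.
type GameServerSDK interface {
	Ready() error
	Health() error
	Shutdown() error
}

// WrappedSDK embeds an underlying SDK and overrides only Ready and Health,
// sending them to a custom sidecar instead. Everything else is delegated.
type WrappedSDK struct {
	GameServerSDK
	toSidecar func(call string) error // hypothetical transport to the sidecar
}

// Ready is redirected to the sidecar.
func (w *WrappedSDK) Ready() error { return w.toSidecar("ready") }

// Health is redirected to the sidecar.
func (w *WrappedSDK) Health() error { return w.toSidecar("health") }

// recordingSDK is a stand-in for the real SDK that records what it receives.
type recordingSDK struct{ calls []string }

func (r *recordingSDK) Ready() error    { r.calls = append(r.calls, "Ready"); return nil }
func (r *recordingSDK) Health() error   { r.calls = append(r.calls, "Health"); return nil }
func (r *recordingSDK) Shutdown() error { r.calls = append(r.calls, "Shutdown"); return nil }

func main() {
	direct := &recordingSDK{}
	var redirected []string
	sdk := &WrappedSDK{
		GameServerSDK: direct,
		toSidecar: func(call string) error {
			redirected = append(redirected, call)
			return nil
		},
	}

	// Game code sees a single SDK and does not need to know about the sidecar.
	_ = sdk.Ready()
	_ = sdk.Health()
	_ = sdk.Shutdown()

	fmt.Println("sent to sidecar:", redirected)
	fmt.Println("sent to wrapped SDK:", direct.calls)
}
```

With this shape, game developers call one SDK for everything, and the choice of which calls go to the sidecar lives in a single place.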

markmandel · 2022-07-28T02:31:55Z

Marking as stale, since it sounds like this was solved by a custom code + Agones SDK solution? Please let us know if you have objections to closing in the next few weeks.

markmandel · 2022-09-21T23:41:47Z

Closing, since no objections.

jie-bao added the kind/feature New features for Agones label Apr 9, 2021

markmandel mentioned this issue Apr 12, 2021

Adding AGONES_SDK_GRPC_HOST to NewSDK #1183

Closed

markmandel added the stale Pending closure unless there is a strong objection. label Jul 28, 2022

markmandel added the wontfix Sorry, but we're not going to do that. label Sep 21, 2022

markmandel closed this as completed Sep 21, 2022
