Invalid warnings when using multi-cluster allocation #2498

Closed
tmokmss opened this issue Feb 28, 2022 · 2 comments · Fixed by #2860

@tmokmss
Contributor

tmokmss commented Feb 28, 2022

What happened:
When I use the multi-cluster allocation feature in the configuration below, where the Router cluster only dispatches allocation requests to DGS clusters 1 and 2, I get invalid warnings from the agones-allocator pod on the Router cluster. A warning is logged on every allocation.

Router ┬─> DGS cluster 1
       └─> DGS cluster 2

The logs are below. A warning like gameserver.agones.dev "dgs-fleet-nf97g-pszwg" not found appears on every allocation.


{"error":null,"message":"allocation response is being sent","response":{"gameServerName":"dgs-fleet-nf97g-fn67k","ports":[{"name":"default","port":7605}],"address":"reducted.compute.amazonaws.com","nodeName":"ip-10-0-5-242.ap-northeast-1.compute.internal"},"severity":"info","source":"main","time":"2022-02-28T09:31:05.758738163Z"}
{"message":"allocation request received.","request":{"namespace":"default","multiClusterSetting":{"enabled":true}},"severity":"info","source":"main","time":"2022-02-28T09:31:11.705316761Z"}
{"endpoint":"reducted.elb.ap-northeast-1.amazonaws.com:443","gsaKey":"remote-allocation","message":"forwarding allocation request","request":{"namespace":"default","multiClusterSetting":{"policySelector":{}},"requiredGameServerSelector":{},"metaPatch":{},"metadata":{},"gameServerSelectors":[{}]},"severity":"debug","source":"*gameserverallocations.Allocator","time":"2022-02-28T09:31:11.706135831Z"}
{"error":"gameserver.agones.dev \"dgs-fleet-nf97g-pszwg\" not found","message":"failed to get gameserver:dgs-fleet-nf97g-pszwg namespace:","severity":"warning","source":"*gameserverallocations.Allocator","time":"2022-02-28T09:31:11.771535492Z"}
{"error":null,"message":"allocation response is being sent","response":{"gameServerName":"dgs-fleet-nf97g-pszwg","ports":[{"name":"default","port":7953}],"address":"reducted.compute.amazonaws.com","nodeName":"ip-10-0-5-242.ap-northeast-1.compute.internal"},"severity":"info","source":"main","time":"2022-02-28T09:31:11.7715759Z"}
{"message":"allocation request received.","request":{"namespace":"default","multiClusterSetting":{"enabled":true}},"severity":"info","source":"main","time":"2022-02-28T09:31:17.710855882Z"}
{"endpoint":"reducted.elb.ap-northeast-1.amazonaws.com:443","gsaKey":"remote-allocation","message":"forwarding allocation request","request":{"namespace":"default","multiClusterSetting":{"policySelector":{}},"requiredGameServerSelector":{},"metaPatch":{},"metadata":{},"gameServerSelectors":[{}]},"severity":"debug","source":"*gameserverallocations.Allocator","time":"2022-02-28T09:31:17.711196102Z"}
{"error":"gameserver.agones.dev \"dgs-fleet-wpdmh-jmn27\" not found","message":"failed to get gameserver:dgs-fleet-wpdmh-jmn27 namespace:","severity":"warning","source":"*gameserverallocations.Allocator","time":"2022-02-28T09:31:17.77880276Z"}
{"error":null,"message":"allocation response is being sent","response":{"gameServerName":"dgs-fleet-wpdmh-jmn27","ports":[{"name":"default","port":7056}],"address":"reducted.compute.amazonaws.com","nodeName":"ip-10-0-88-178.ap-northeast-1.compute.internal"},"severity":"info","source":"main","time":"2022-02-28T09:31:17.779472706Z"}
{"message":"allocation request received.","request":{"namespace":"default","multiClusterSetting":{"enabled":true}},"severity":"info","source":"main","time":"2022-02-28T09:31:28.705297459Z"}
{"endpoint":"reducted.elb.ap-northeast-1.amazonaws.com:443","gsaKey":"remote-allocation","message":"forwarding allocation request","request":{"namespace":"default","multiClusterSetting":{"policySelector":{}},"requiredGameServerSelector":{},"metaPatch":{},"metadata":{},"gameServerSelectors":[{}]},"severity":"debug","source":"*gameserverallocations.Allocator","time":"2022-02-28T09:31:28.705618768Z"}
{"error":"gameserver.agones.dev \"dgs-fleet-wpdmh-7f6z8\" not found","message":"failed to get gameserver:dgs-fleet-wpdmh-7f6z8 namespace:","severity":"warning","source":"*gameserverallocations.Allocator","time":"2022-02-28T09:31:28.797657118Z"}

It seems the Router cluster is trying to look up game servers that exist only on the DGS clusters, which results in the not found errors.

What you expected to happen:
No warnings shown.

How to reproduce it (as minimally and precisely as possible):
I don't know specifically how to reproduce this. I'm just using multi-cluster allocation as described in the docs, and the warnings are always printed.

Anything else we need to know?:

Environment:

  • Agones version: 1.21.0
  • Kubernetes version (use kubectl version): 1.21.0
  • Cloud provider or hardware configuration: aws
  • Install method (yaml/helm): helm
  • Troubleshooting guide log(s):
  • Others:
@tmokmss tmokmss added the kind/bug These are bugs. label Feb 28, 2022
@roberthbailey
Member

As an aside, your solution looks similar to the example @pooneh-m just sent a PR for (#2499), except that you are using an Agones cluster w/ MCA for routing the requests instead of using Cloud Run.

@roberthbailey
Member

That warning is coming from this line. It looks like after a successful allocation, the allocator service is trying to look up the game server in the local cluster and failing to find it (because it doesn't exist in the local cluster).

It feels to me like this lookup should be skipped if we know that the allocation was from a remote cluster. The function applyMultiClusterAllocation doesn't currently return whether the allocation was done locally or not so the code site which calls setResponse for the metrics can't hint at whether the game server should exist in the local cluster.

The good news is that this warning is benign. But it will take a bit of refactoring in the code to be able to skip the lookup when it shouldn't be done.
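
The refactoring described above could look roughly like the following Go sketch. It is not the actual Agones code: allocationSource, allocationResult, localStore and recordMetrics are hypothetical stand-ins, invented here to illustrate how the allocation path might report whether the game server came from the local cluster, so that the metrics code can skip the local lookup for remote allocations.

```go
package main

import (
	"errors"
	"fmt"
)

// allocationSource is a hypothetical marker for where an allocation was served.
type allocationSource int

const (
	sourceLocal allocationSource = iota
	sourceRemote
)

// gameServer is a minimal stand-in for the Agones GameServer resource.
type gameServer struct{ Name string }

var errNotFound = errors.New("not found")

// localStore is a stand-in for the local cluster's game server lister.
type localStore map[string]gameServer

func (s localStore) get(name string) (gameServer, error) {
	gs, ok := s[name]
	if !ok {
		return gameServer{}, fmt.Errorf("gameserver.agones.dev %q %w", name, errNotFound)
	}
	return gs, nil
}

// allocationResult is what a refactored allocation function might return
// (an assumption): the allocated name plus where the allocation happened.
type allocationResult struct {
	GameServerName string
	Source         allocationSource
}

// recordMetrics looks up the game server only when the allocation was local.
// It reports whether a lookup was attempted and any warning that resulted.
func recordMetrics(store localStore, res allocationResult) (lookedUp bool, warning string) {
	if res.Source == sourceRemote {
		// The game server lives in another cluster, so a local lookup
		// could only fail with "not found". Skip it.
		return false, ""
	}
	if _, err := store.get(res.GameServerName); err != nil {
		return true, fmt.Sprintf("failed to get gameserver:%s: %v", res.GameServerName, err)
	}
	return true, ""
}

func main() {
	store := localStore{"local-gs": {Name: "local-gs"}}
	results := []allocationResult{
		{GameServerName: "dgs-fleet-nf97g-pszwg", Source: sourceRemote},
		{GameServerName: "local-gs", Source: sourceLocal},
	}
	for _, res := range results {
		lookedUp, warning := recordMetrics(store, res)
		fmt.Printf("name=%s lookedUp=%t warning=%q\n", res.GameServerName, lookedUp, warning)
	}
}
```

With this shape, a remote allocation never touches the local store, so the warning can only appear when a locally allocated game server is genuinely missing.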
