Allocation Endpoint - Better load balancing for gRPC connection #1872
Labels: area/operations, area/performance, kind/feature, obsolete, stale
Is your feature request related to a problem? Please describe.
The Kubernetes load balancer balances per connection rather than per request, so it doesn't spread a single gRPC connection across multiple pods. As a result, when doing Allocations via the gRPC endpoint, a single client will only ever connect to and use a single allocation Pod.
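To make the problem concrete, here is a minimal simulation sketch. It is not Agones or gRPC code: the pod names and the helpers `perConnection` and `perRequest` are hypothetical, and the "load balancer" is reduced to picking a backend once per connection. It only illustrates why one long-lived connection pins every request to one pod, while request-level balancing (an L7 proxy or client-side round robin) spreads them.

```go
package main

import "fmt"

// perConnection models connection-level balancing: the backend is chosen
// once, when the connection is opened, and every request sent over that
// connection lands on the same pod. (Hypothetical helper for illustration.)
func perConnection(pods []string, requests int) map[string]int {
	counts := make(map[string]int)
	pod := pods[0] // chosen once at connect time
	for i := 0; i < requests; i++ {
		counts[pod]++
	}
	return counts
}

// perRequest models request-level balancing: each request is routed
// independently, here with simple round robin over the pods.
func perRequest(pods []string, requests int) map[string]int {
	counts := make(map[string]int)
	for i := 0; i < requests; i++ {
		counts[pods[i%len(pods)]]++
	}
	return counts
}

func main() {
	// Assumed pod names, purely for the example.
	pods := []string{"allocator-0", "allocator-1", "allocator-2"}

	fmt.Println("per connection:", perConnection(pods, 9))
	fmt.Println("per request:   ", perRequest(pods, 9))
}
```

With a single client, the first map has one entry holding all requests, which is the behaviour described above; the second shows the even spread this feature request is asking for.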
Describe the solution you'd like
Have the system load balance across each of the allocation pods, without requiring the external integration team (i.e. the end user) to do extra work.
Some research:
1. https://grpc.io/blog/grpc-load-balancing/
1. https://blog.nobugware.com/post/2019/kubernetes_mesh_network_load_balancing_grpc_services/
1. https://kubernetes.io/blog/2018/11/07/grpc-load-balancing-on-kubernetes-without-tears/
1. https://itnext.io/proxyless-grpc-load-balancing-in-kubernetes-ca1a4797b742
1. https://cloud.google.com/solutions/exposing-grpc-services-on-gke-using-envoy-proxy
The Envoy-based solution from Google Cloud looks to me to be the most viable.
The potentially interesting part would be that the TLS certificates would then be handled by Envoy rather than the Allocation endpoint itself.
Describe alternatives you've considered
Leave things as they are. Allocation seems to be performing reasonably well (or at least it should once #1863 is merged).
Additional context
Initially discussed here: #1867 (comment)