Fix failover strategy with 3 or more clusters #1705

Merged
merged 1 commit into k8gb-io:master from fix/multiclusterbug on Aug 29, 2024

Conversation

@abaguas (Collaborator) commented on Aug 12, 2024

In a setup with 3 or more clusters, the failover strategy does not return the correct targets if the DNS query hits a non-primary region.

How to reproduce

Create a 3 cluster setup that contains a GSLB with the failover strategy and `eu` as the primary cluster. The podinfo app is up on all clusters:

```
K8GB_LOCAL_VERSION=test CLUSTERS_NUMBER=3 make deploy-full-local-setup
```

Retrieve the local targets to see which IP addresses are exposed by each cluster (`eu`, `us` and `cz` respectively):

```
➜  ~ dig +short -p 5053 @localhost localtargets-failover.cloud.example.com
172.19.0.6
172.19.0.7
➜  ~ dig +short -p 5054 @localhost localtargets-failover.cloud.example.com
172.19.0.10
172.19.0.11
➜  ~ dig +short -p 5055 @localhost localtargets-failover.cloud.example.com
172.19.0.14
172.19.0.15
```

Query the domain `failover.cloud.example.com` on each cluster:

```
➜  ~ dig +short -p 5053 @localhost failover.cloud.example.com
172.19.0.6
172.19.0.7
➜  ~ dig +short -p 5054 @localhost failover.cloud.example.com
172.19.0.6
172.19.0.7
172.19.0.14
172.19.0.15
➜  ~ dig +short -p 5055 @localhost failover.cloud.example.com
172.19.0.6
172.19.0.7
172.19.0.10
172.19.0.11
```

We would expect all clusters to return 172.19.0.6 and 172.19.0.7; however, the non-primary clusters also return the IP addresses of the other non-primary cluster:

* `eu` returns `eu`
* `us` returns `eu` and `cz`
* `cz` returns `eu` and `us`

This happens because a non-primary cluster returns the IP addresses of all other clusters, which is correct in a 2 cluster setup but not in a 3 cluster setup.

Fix

To fix the issue the logic for non-primary clusters needs to be adapted. If the app is healthy on the primary cluster then the targets of that cluster must be returned. If the application is unhealthy on the primary cluster then the addresses of all healthy clusters should be returned.
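For illustration, here is a minimal Go sketch of that selection logic, using a hypothetical `cluster` type and `failoverTargets` function rather than the actual k8gb implementation:

```go
package main

import "fmt"

// cluster is a hypothetical representation of one member cluster: its geo tag,
// whether the application is healthy there, and the addresses it exposes.
type cluster struct {
	geoTag  string
	healthy bool
	targets []string
}

// failoverTargets returns the primary cluster's targets while the application
// is healthy there; otherwise it falls back to the targets of every healthy
// cluster, regardless of how many clusters participate.
func failoverTargets(primaryGeoTag string, clusters []cluster) []string {
	for _, c := range clusters {
		if c.geoTag == primaryGeoTag && c.healthy {
			return c.targets
		}
	}
	var fallback []string
	for _, c := range clusters {
		if c.healthy {
			fallback = append(fallback, c.targets...)
		}
	}
	return fallback
}

func main() {
	clusters := []cluster{
		{geoTag: "eu", healthy: true, targets: []string{"172.19.0.6", "172.19.0.7"}},
		{geoTag: "us", healthy: true, targets: []string{"172.19.0.10", "172.19.0.11"}},
		{geoTag: "cz", healthy: true, targets: []string{"172.19.0.14", "172.19.0.15"}},
	}
	fmt.Println(failoverTargets("eu", clusters)) // [172.19.0.6 172.19.0.7]

	clusters[0].healthy = false // simulate scaling podinfo down on eu
	fmt.Println(failoverTargets("eu", clusters)) // [172.19.0.10 172.19.0.11 172.19.0.14 172.19.0.15]
}
```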

After the change:

```
➜  ~ dig +short -p 5053 @localhost failover.cloud.example.com
172.19.0.6
172.19.0.7
➜  ~ dig +short -p 5054 @localhost failover.cloud.example.com
172.19.0.6
172.19.0.7
➜  ~ dig +short -p 5055 @localhost failover.cloud.example.com
172.19.0.6
172.19.0.7
```

After scaling down podinfo on the primary cluster (`eu`):

```
➜  ~ k scale deploy frontend-podinfo --replicas=0 -n test-gslb --context k3d-test-gslb1
deployment.apps/frontend-podinfo scaled
```

```
➜  ~ dig +short -p 5053 @localhost failover.cloud.example.com
172.19.0.10
172.19.0.11
172.19.0.14
172.19.0.15
➜  ~ dig +short -p 5054 @localhost failover.cloud.example.com
172.19.0.10
172.19.0.11
172.19.0.14
172.19.0.15
➜  ~ dig +short -p 5055 @localhost failover.cloud.example.com
172.19.0.14
172.19.0.15
172.19.0.10
172.19.0.11
```

Tests

Two unit tests needed to be adapted since they did not have the correct geo tags.
The tests were overwriting the `ClusterGeoTag` to `za` and the GSLB's `PrimaryGeoTag` to `eu`, which resulted in evaluating the targets on a non-primary cluster (`za` != `eu`). However, the external targets only contain records from `us-east-1` (the value of `ExtClustersGeoTags`). Interestingly, albeit unintentionally, this actually simulates a 3 cluster setup in a non-conventional way:

* `za` -> localtargets up
* `eu` -> down
* `us-east-1` -> external targets up

Due to the bug described above, the addresses of `us-east-1` (all external targets) were being returned and the test was passing. With the fix, the correct output is the targets of both `za` and `us-east-1` (local targets + external targets), which is why the tests failed.
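Reusing the hypothetical `cluster` type and `failoverTargets` function from the sketch above, the scenario the old test accidentally encoded looks roughly like this (the addresses are placeholders, not the test's actual values):

```go
// za exposes healthy local targets, the primary eu has no healthy targets, and
// the external cluster us-east-1 is healthy (placeholder addresses only).
clusters := []cluster{
	{geoTag: "za", healthy: true, targets: []string{"10.0.1.38", "10.0.1.39"}},
	{geoTag: "eu", healthy: false},
	{geoTag: "us-east-1", healthy: true, targets: []string{"10.1.0.1", "10.1.0.2"}},
}
// Before the fix only the us-east-1 (external) targets came back; with the fix
// the fallback contains the za and us-east-1 targets together.
fmt.Println(failoverTargets("eu", clusters))
```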

To fix this, the tags were updated so that only two clusters remain, `us-east-1` and `us-west-1`, since this was the intended scenario for the test.

@abaguas marked this pull request as draft on August 12, 2024 20:34
@abaguas force-pushed the fix/multiclusterbug branch from 30df86d to a28696f on August 12, 2024 21:19
@abaguas marked this pull request as ready for review on August 13, 2024 05:41
@ytsarev (Member) left a comment:

Cool stuff 👍

@ytsarev merged commit d98d51e into k8gb-io:master on Aug 29, 2024
15 checks passed
@abaguas deleted the fix/multiclusterbug branch on November 1, 2024 10:56