Argocd-application-controller not sharding per cluster #9633

Closed · chrism417 opened this issue Jun 10, 2022 · 22 comments · Fixed by #13018
Labels: bug (Something isn't working)

@chrism417

The docs at https://argo-cd.readthedocs.io/en/stable/operator-manual/high_availability/#argocd-application-controller say:

If the controller is managing too many clusters and uses too much memory then you can shard clusters across multiple controller replicas. To enable sharding increase the number of replicas in argocd-application-controller StatefulSet and repeat number of replicas in ARGOCD_CONTROLLER_REPLICAS environment variable. The strategic merge patch below demonstrates changes required to configure two controller replicas.
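
For reference, the strategic merge patch the docs refer to looks roughly like this for two replicas (a sketch; see the linked docs for the authoritative version):

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: argocd-application-controller
spec:
  replicas: 2
  template:
    spec:
      containers:
      - name: argocd-application-controller
        env:
        - name: ARGOCD_CONTROLLER_REPLICAS
          value: "2"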

Since we have 8 clusters, we have the replica count set to 8. However, the logs for several of the pods show all 8 clusters being ignored:

time="2022-06-10T19:21:10Z" level=info msg="Ignoring cluster redacted"
time="2022-06-10T19:21:10Z" level=info msg="Ignoring cluster redacted"
time="2022-06-10T19:21:10Z" level=info msg="Ignoring cluster redacted"
time="2022-06-10T19:21:10Z" level=info msg="Ignoring cluster redacted"
time="2022-06-10T19:21:10Z" level=info msg="Ignoring cluster redacted"
time="2022-06-10T19:21:10Z" level=info msg="Ignoring cluster redacted"
time="2022-06-10T19:21:10Z" level=info msg="Ignoring cluster redacted"
time="2022-06-10T19:21:10Z" level=info msg="Ignoring cluster redacted"

We're running the latest ArgoCD version, 2.4.0; however, this was also present in ArgoCD 2.3.3 and 2.3.4.

@chrism417 added the bug label on Jun 10, 2022
@leoluz
Collaborator

leoluz commented Jun 13, 2022

ArgoCD will log this message if the replica can't handle that cluster.

Please inspect the controller logs for additional information.
It seems that, for some reason, ArgoCD isn't able to infer the shard. The shard is inferred from the pod hostname in the snippet below; search for these error messages:

// InferShard derives the shard number from the pod hostname,
// e.g. argocd-application-controller-2 -> shard 2.
func InferShard() (int, error) {
	hostname, err := os.Hostname()
	if err != nil {
		return 0, err
	}
	parts := strings.Split(hostname, "-")
	if len(parts) == 0 {
		return 0, fmt.Errorf("hostname should ends with shard number separated by '-' but got: %s", hostname)
	}
	shard, err := strconv.Atoi(parts[len(parts)-1])
	if err != nil {
		return 0, fmt.Errorf("hostname should ends with shard number separated by '-' but got: %s", hostname)
	}
	return shard, nil
}

@chrism417
Author

Hi @leoluz, I don't see that anywhere in the logs. This is from a fresh start of the argocd-application-controller:

➜  kops git:(master) (⎈ |Infra:default) kubectl logs -n argocd argocd-application-controller-0 -f                                                                                                                                                            <aws:infra>
time="2022-06-13T14:52:13Z" level=info msg="Processing clusters from shard 0"
time="2022-06-13T14:52:13Z" level=info msg="appResyncPeriod=3m0s, appHardResyncPeriod=0s"
time="2022-06-13T14:52:13Z" level=info msg="Application Controller (version: v2.4.0+91aefab, built: 2022-06-10T17:23:37Z) starting (namespace: argocd)"
time="2022-06-13T14:52:13Z" level=info msg="Starting configmap/secret informers"
time="2022-06-13T14:52:13Z" level=info msg="Configmap/secret informer synced"
time="2022-06-13T14:52:13Z" level=info msg="Ignore status for CustomResourceDefinitions"
time="2022-06-13T14:52:13Z" level=info msg="Ignore '/spec/preserveUnknownFields' for CustomResourceDefinitions"
time="2022-06-13T14:52:13Z" level=info msg="Loading TLS configuration from secret argocd/argocd-server-tls"
time="2022-06-13T14:52:14Z" level=info msg="Ignoring cluster https://kubernetes.default.svc"
time="2022-06-13T14:52:14Z" level=info msg="Starting secretInformer forcluster"
time="2022-06-13T14:52:14Z" level=info msg="0xc0003e45a0 subscribed to settings updates"
time="2022-06-13T14:52:14Z" level=info msg="Ignoring cluster redacted"
time="2022-06-13T14:52:14Z" level=info msg="Ignoring cluster redacted"
time="2022-06-13T14:52:14Z" level=info msg="Ignoring cluster redacted"
time="2022-06-13T14:52:14Z" level=info msg="Ignoring cluster redacted"
time="2022-06-13T14:52:14Z" level=info msg="Ignoring cluster redacted"
time="2022-06-13T14:52:14Z" level=info msg="Ignoring cluster redacted"
time="2022-06-13T14:52:14Z" level=info msg="Ignoring cluster redacted"
time="2022-06-13T14:52:23Z" level=info msg="Loading TLS configuration from secret argocd/argocd-server-tls"
time="2022-06-13T14:52:33Z" level=info msg="Loading TLS configuration from secret argocd/argocd-server-tls"
time="2022-06-13T14:52:43Z" level=info msg="Loading TLS configuration from secret argocd/argocd-server-tls"
time="2022-06-13T14:52:43Z" level=info msg="Notifying 1 settings subscribers: [0xc0003e45a0]"
time="2022-06-13T14:52:43Z" level=info msg="Ignore status for CustomResourceDefinitions"
time="2022-06-13T14:52:43Z" level=info msg="Ignore '/spec/preserveUnknownFields' for CustomResourceDefinitions"
time="2022-06-13T14:52:43Z" level=info msg="Loading TLS configuration from secret argocd/argocd-server-tls"
time="2022-06-13T14:52:53Z" level=info msg="Loading TLS configuration from secret argocd/argocd-server-tls"
time="2022-06-13T14:53:03Z" level=info msg="Loading TLS configuration from secret argocd/argocd-server-tls"

@leoluz
Collaborator

leoluz commented Jun 13, 2022

This means that replica 0 isn't managing any cluster.
Clusters are assigned to a specific controller replica following this logic:
shard = hash(clusterSecret.metadata.UID) % replicas

If you don't have huge clusters, maybe you can start the controller with 2 replicas and see how your clusters are grouped across the 2 shards?

If you absolutely want to have 8 replicas, each managing one specific cluster, you can configure your shards manually. In order to do so you need to configure each controller replica with the env var ARGOCD_CONTROLLER_SHARD and have a matching cluster secret defined with the same shard value:

Shard *int64 `json:"shard,omitempty" protobuf:"bytes,9,opt,name=shard"`
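
For illustration only, the controller side of that pairing might look like the sketch below, for a hypothetical replica that should own shard 1; note that each replica needs its own value, which a single StatefulSet env entry does not provide:

# Hypothetical container env for one controller replica that should process shard 1;
# the matching cluster secret would carry the same shard value.
env:
- name: ARGOCD_CONTROLLER_SHARD
  value: "1"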

@chrism417
Author

@leoluz thanks, I was hoping to run 4 controllers, with one explicitly managing our "prod" cluster and the rest split across the remaining 7 clusters.

@chrism417
Author

@leoluz so to this point: "In order to do so you need to configure each controller replica with the env var ARGOCD_CONTROLLER_SHARD and have a matching cluster secret defined with the same shard value"

We would need 8 different deployments because each pod would need its own ARGOCD_CONTROLLER_SHARD value, correct?

@leoluz
Collaborator

leoluz commented Jun 13, 2022

We would need 8 different deployments because each pod would need its own ARGOCD_CONTROLLER_SHARD value, correct?

I'm sorry, but I'm not sure what exactly you mean by "deployments". The ArgoCD controller is a StatefulSet. You would need to implement a mechanism to have each StatefulSet replica specify a different value for the ARGOCD_CONTROLLER_SHARD environment variable. This is what is required for manual shard configuration in ArgoCD. Unfortunately, this particular feature isn't documented yet.

The other (simpler) alternative is letting the sharding distribute clusters automatically, like you were doing initially. In this case you just have to set the ARGOCD_CONTROLLER_REPLICAS env var to the same value as your StatefulSet's spec.replicas field. However, in this case you can't control which cluster gets managed by which controller instance.

@chrism417
Author

I apologize, I meant 8 different StatefulSets to implement different values of ARGOCD_CONTROLLER_SHARD. Would this accomplish the task?

@leoluz
Collaborator

leoluz commented Jun 13, 2022

I haven't tried it and can't tell for sure if it is going to work. It could also break other features that expect every controller pod name to match, with just the last ordinal number incremented (standard StatefulSet behaviour). Personally, I wouldn't recommend going in this direction.

@chrism417
Author

Ok, thank you

@leoluz
Collaborator

leoluz commented Jun 13, 2022

No problem.
I'm closing this for now.
Feel free to reopen if the issue persists.

@leoluz leoluz closed this as completed Jun 13, 2022
@jannfis jannfis reopened this Jun 20, 2022
@jannfis
Member

jannfis commented Jun 20, 2022

I think the shard-inferring algorithm is not able to balance clusters equally among shards.

I've put the algorithm into a Playground script with 10 random UUIDs, as follows:

package main

import (
	"fmt"
	"hash/fnv"
)

func main() {
	const replicas = 5

	uids := []string{
		"c6e30ba7-8f0c-443e-add4-9e70f031cbaf",
		"e4bcf18c-4d97-486d-be48-d49d8c0fb59c",
		"c41b2959-d465-4d68-8645-00140d2ab177",
		"31441d1c-6181-40ed-82de-52acf530be3b",
		"bc98e29f-ced1-487c-9625-a753789e03ab",
		"0aecff21-eb62-4b17-9557-92e7d430e604",
		"8c15c8ab-2116-4a2d-93f5-df8d8d6af08b",
		"0f193adb-352f-401b-bd4c-b31886eb2329",
		"f72dd0c6-610e-4951-a652-e78af6e499cc",
		"90017d9f-a570-4437-9dc2-a6189a3828cc",
	}
	for _, id := range uids {
		h := fnv.New32a()
		_, _ = h.Write([]byte(id))
		fmt.Printf("%s - Sum: %d - Shard: %d\n", id, int(h.Sum32()), int(h.Sum32()%uint32(replicas)))
	}
}

I think using the remainder of a modulo operation on the FNV sum of the UUID is the wrong approach to this problem. The output of the above yields:

c6e30ba7-8f0c-443e-add4-9e70f031cbaf - Sum: 1598152769 - Shard: 4
e4bcf18c-4d97-486d-be48-d49d8c0fb59c - Sum: 3980385533 - Shard: 3
c41b2959-d465-4d68-8645-00140d2ab177 - Sum: 3810090739 - Shard: 4
31441d1c-6181-40ed-82de-52acf530be3b - Sum: 4159509069 - Shard: 4
bc98e29f-ced1-487c-9625-a753789e03ab - Sum: 2797501489 - Shard: 4
0aecff21-eb62-4b17-9557-92e7d430e604 - Sum: 1624419721 - Shard: 1
8c15c8ab-2116-4a2d-93f5-df8d8d6af08b - Sum: 806358169 - Shard: 4
0f193adb-352f-401b-bd4c-b31886eb2329 - Sum: 408407996 - Shard: 1
f72dd0c6-610e-4951-a652-e78af6e499cc - Sum: 1352576372 - Shard: 2
90017d9f-a570-4437-9dc2-a6189a3828cc - Sum: 1322962061 - Shard: 1

We should fix it. The workaround is to assign clusters to shards manually, but that can become cumbersome.

@leoluz
Collaborator

leoluz commented Jun 21, 2022

@jannfis I agree that it can be improved.
The problem I see is that different use-cases will require a different algorithm.
I suggest we:

  1. Allow users to specify pre-defined sharding algorithms.
  2. Improve the current manual sharding configuration. Today's approach of patching each StatefulSet replica of the ArgoCD controller is not user-friendly, as it requires complex procedures (example1, example2).

@jannfis
Member

jannfis commented Jun 21, 2022

@leoluz Is there really a use-case that benefits from distributing clusters as unevenly as the current algorithm does? I always thought the reason behind automatically assigning clusters to shards was to distribute them evenly, so that you could scale application controller replicas on an informed basis.

@jannfis
Member

jannfis commented Jun 21, 2022

Btw, manually assigning a cluster to a shard is possible (although not documented) by patching the shard field in the cluster secret to contain the shard number, e.g.

stringData:
  shard: "1"

There is no need to patch the StatefulSet.
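
For example, a complete declarative cluster Secret carrying the shard assignment could look like the following sketch (cluster name, server URL, and config are hypothetical placeholders):

apiVersion: v1
kind: Secret
metadata:
  name: cluster-prod
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
type: Opaque
stringData:
  name: prod
  server: https://prod.example.com:6443
  shard: "1"
  config: |
    {
      "tlsClientConfig": {
        "insecure": false
      }
    }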

@leoluz
Collaborator

leoluz commented Jun 21, 2022

Btw, manually assigning a cluster to a shard is possible (although not documented) by patching the shard field in the cluster secret to contain the shard number, e.g.

stringData:
  shard: "1"

There is no need to patch the StatefulSet.

Maybe I'm wrong, but this is not my understanding from reading this code. Unless I'm missing something, it seems that it depends on the value of the ARGOCD_CONTROLLER_SHARD env var in each replica.

@jannfis Interesting. Now I understand what InferShard is doing. Thanks!

Is there really a use-case that benefits from distributing clusters as unevenly as the current algorithm does?

I'm afraid I don't know the answer. Maybe @alexmt has a better understanding of the implementation history?

Regarding my previous suggestion to:

  1. Allow users to specify pre-defined sharding algorithms.

I think we have 2 main use-cases:

  • round-robin assignment of clusters to controller replicas
  • inspecting cluster load when distributing shards

@leoluz
Collaborator

leoluz commented Jun 22, 2022

FYI: I spoke with @alexmt today and he pointed me to this draft PR, which is related to this issue:
#8002

@bofeng96

Thanks for all this good info! We will keep following this issue. Currently, we're using the method @jannfis mentioned to configure sharding manually. May I know if there is a nicer way to balance the app controllers?

@pharaujo

pharaujo commented Jan 4, 2023

I think the shard-inferring algorithm is not able to balance clusters equally among shards.

Taking the modulo of the hash result doesn't produce an even distribution across shards; it is essentially balancing placement over the hash's entire result space. To fix the algorithm, you just need to sort the clusters by the hash result in an array and then take the modulo of their index.
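
For illustration, a minimal standalone sketch of that idea with made-up UIDs (note it sorts by the FNV sum itself rather than by the UID string):

package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// fnvSum returns the 32-bit FNV-1a hash of s, the same sum the controller
// computes for each cluster secret UID.
func fnvSum(s string) uint32 {
	h := fnv.New32a()
	_, _ = h.Write([]byte(s))
	return h.Sum32()
}

func main() {
	const replicas = 3
	// Hypothetical cluster UIDs.
	uids := []string{"uid-a", "uid-b", "uid-c", "uid-d", "uid-e"}

	// Sort clusters by their hash result, then shard by position in that
	// order, which spreads them evenly across the replicas.
	sort.Slice(uids, func(i, j int) bool { return fnvSum(uids[i]) < fnvSum(uids[j]) })
	for index, uid := range uids {
		fmt.Printf("%s -> shard %d\n", uid, index%replicas)
	}
}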

@akram
Contributor

akram commented Mar 24, 2023

@pharaujo indeed, this gives better results. We also need to think about what happens when a cluster is removed: the shard placement is then recalculated, and many permutations could occur, temporarily increasing response times until the placement stabilizes and the CRD cache is rebuilt properly on every shard.
cc @saumeya @jannfis

package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

func main() {
	const replicas = 5

	uids := []string{
		"c6e30ba7-8f0c-443e-add4-9e70f031cbaf",
		"e4bcf18c-4d97-486d-be48-d49d8c0fb59c",
		"c41b2959-d465-4d68-8645-00140d2ab177",
		"31441d1c-6181-40ed-82de-52acf530be3b",
		"bc98e29f-ced1-487c-9625-a753789e03ab",
		"0aecff21-eb62-4b17-9557-92e7d430e604",
		"8c15c8ab-2116-4a2d-93f5-df8d8d6af08b",
		"0f193adb-352f-401b-bd4c-b31886eb2329",
		"f72dd0c6-610e-4951-a652-e78af6e499cc",
		"90017d9f-a570-4437-9dc2-a6189a3828cc",
	}
	sort.Strings(uids)
	for index, uid := range uids {
		h := fnv.New32a()
		_, _ = h.Write([]byte(uid))
		fmt.Printf("%s - Sum: %d - Shard: %d\n", uid, int(h.Sum32()), int(uint32(index)%uint32(replicas)))
	}
}

which gives:

0aecff21-eb62-4b17-9557-92e7d430e604 - Sum: 1624419721 - Shard: 0
0f193adb-352f-401b-bd4c-b31886eb2329 - Sum: 408407996 - Shard: 1
31441d1c-6181-40ed-82de-52acf530be3b - Sum: 4159509069 - Shard: 2
8c15c8ab-2116-4a2d-93f5-df8d8d6af08b - Sum: 806358169 - Shard: 3
90017d9f-a570-4437-9dc2-a6189a3828cc - Sum: 1322962061 - Shard: 4
bc98e29f-ced1-487c-9625-a753789e03ab - Sum: 2797501489 - Shard: 0
c41b2959-d465-4d68-8645-00140d2ab177 - Sum: 3810090739 - Shard: 1
c6e30ba7-8f0c-443e-add4-9e70f031cbaf - Sum: 1598152769 - Shard: 2
e4bcf18c-4d97-486d-be48-d49d8c0fb59c - Sum: 3980385533 - Shard: 3
f72dd0c6-610e-4951-a652-e78af6e499cc - Sum: 1352576372 - Shard: 4

@akram
Contributor

akram commented Mar 24, 2023

Having a deeper look, this seems not that easy to implement: the cluster UID list cannot easily be sorted in a real-life scenario with the current code base, because clusterList is pulled in batches of 10.

https://github.com/argoproj/argo-cd/blob/master/cmd/argocd/commands/admin/cluster.go#L107-L121

Another possible simple implementation, which could still be stateless, would be to use a random distribution.
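
One possible reading of that suggestion, sketched with made-up UIDs: seed a PRNG deterministically from each cluster's UID, so the placement stays stateless and reproducible across restarts.

package main

import (
	"fmt"
	"hash/fnv"
	"math/rand"
)

func main() {
	const replicas = 3
	// Hypothetical cluster UIDs.
	uids := []string{"uid-a", "uid-b", "uid-c"}

	for _, uid := range uids {
		h := fnv.New64a()
		_, _ = h.Write([]byte(uid))
		// Derive the shard from a PRNG seeded with the UID's hash, so each
		// cluster always lands on the same shard without any shared state.
		r := rand.New(rand.NewSource(int64(h.Sum64())))
		fmt.Printf("%s -> shard %d\n", uid, r.Intn(replicas))
	}
}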

@pre

pre commented Mar 24, 2023

Even batches of 10 would still be an improvement. For example, we have three clusters:

  • Cluster 1 has 10 Application resources
  • Cluster 2 has 50 Application resources
  • Cluster 3 has 500 Application resources

Being able to scale those 500 resources on Cluster 3 would be an immense improvement.

We are currently using the undocumented shard feature to dedicate an application-controller replica to each cluster, but the 500 resources are always handled by a single application-controller replica while the two other replicas sit idle.

@jannfis
Member

jannfis commented Mar 24, 2023

clusterList is pulled in batches of 10

@akram The piece of code you are referring to is just a CLI command; it's not used within the application controller. @alexmt may know why he chose that batch size. I believe it may be for performance reasons within the CLI command, because it calls GetClusterInfo() for each cluster somewhere.

So I would not consider that to be an obstacle.
