fix: deadlock in controller #14304

Merged 1 commit on Jul 1, 2023

agaudreault (Member) commented Jul 1, 2023

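When resource.ignoreResourceUpdatesEnabled is set to 'true' and the application controller log level is set to debug, the controller hangs. The goroutine dump below shows several goroutines blocked acquiring the cluster cache read lock (RLock), including one inside the onNodeUpdated handler called from processEvent.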


1 @ 0x43bf36 0x44d6af 0x44d686 0x46c4e6 0x1eb05f9 0x1eb05d4 0x21f8229 0x1eafe74 0x1eafd08 0x1ea9f85 0x1a9b137 0x131c59b 0x131c677 0x131d3f8 0x131d369 0x131d29c 0x1a9af32 0x1ea97ff 0x470861
#	0x46c4e5	sync.runtime_SemacquireRWMutexR+0x25								/usr/local/go/src/runtime/sema.go:82
#	0x1eb05f8	sync.(*RWMutex).RLock+0x78									/usr/local/go/src/sync/rwmutex.go:71
#	0x1eb05d3	github.com/argoproj/gitops-engine/pkg/cache.(*clusterCache).GetClusterInfo+0x53			/go/pkg/mod/github.com/argoproj/gitops-engine@v0.7.1-0.20230607163028-425d65e07695/pkg/cache/cluster.go:1058
#	0x21f8228	github.com/argoproj/argo-cd/v2/controller/cache.(*liveStateCache).getCluster.func2+0x4a8	/go/src/github.com/argoproj/argo-cd/controller/cache/cache.go:516
#	0x1eafe73	github.com/argoproj/gitops-engine/pkg/cache.(*clusterCache).onNodeUpdated+0x93			/go/pkg/mod/github.com/argoproj/gitops-engine@v0.7.1-0.20230607163028-425d65e07695/pkg/cache/cluster.go:1021
#	0x1eafd07	github.com/argoproj/gitops-engine/pkg/cache.(*clusterCache).processEvent+0x3a7			/go/pkg/mod/github.com/argoproj/gitops-engine@v0.7.1-0.20230607163028-425d65e07695/pkg/cache/cluster.go:1014
#	0x1ea9f84	github.com/argoproj/gitops-engine/pkg/cache.(*clusterCache).watchEvents.func1+0x6c4		/go/pkg/mod/github.com/argoproj/gitops-engine@v0.7.1-0.20230607163028-425d65e07695/pkg/cache/cluster.go:618
#	0x1a9b136	github.com/argoproj/gitops-engine/pkg/utils/kube.RetryUntilSucceed.func1+0xf6			/go/pkg/mod/github.com/argoproj/gitops-engine@v0.7.1-0.20230607163028-425d65e07695/pkg/utils/kube/kube.go:411
#	0x131c59a	k8s.io/apimachinery/pkg/util/wait.ConditionFunc.WithContext.func1+0x1a				/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:220
#	0x131c676	k8s.io/apimachinery/pkg/util/wait.runConditionWithCrashProtectionWithContext+0x56		/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:233
#	0x131d3f7	k8s.io/apimachinery/pkg/util/wait.poll+0x37							/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:580
#	0x131d368	k8s.io/apimachinery/pkg/util/wait.PollImmediateUntilWithContext+0x48				/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:545
#	0x131d29b	k8s.io/apimachinery/pkg/util/wait.PollImmediateUntil+0x7b					/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:536
#	0x1a9af31	github.com/argoproj/gitops-engine/pkg/utils/kube.RetryUntilSucceed+0x131			/go/pkg/mod/github.com/argoproj/gitops-engine@v0.7.1-0.20230607163028-425d65e07695/pkg/utils/kube/kube.go:409
#	0x1ea97fe	github.com/argoproj/gitops-engine/pkg/cache.(*clusterCache).watchEvents+0x2de			/go/pkg/mod/github.com/argoproj/gitops-engine@v0.7.1-0.20230607163028-425d65e07695/pkg/cache/cluster.go:534

1 @ 0x43bf36 0x44d6af 0x44d686 0x46c4e6 0x1eb05f9 0x1eb05d4 0x21fb688 0x21044ee 0x470861
#	0x46c4e5	sync.runtime_SemacquireRWMutexR+0x25							/usr/local/go/src/runtime/sema.go:82
#	0x1eb05f8	sync.(*RWMutex).RLock+0x78								/usr/local/go/src/sync/rwmutex.go:71
#	0x1eb05d3	github.com/argoproj/gitops-engine/pkg/cache.(*clusterCache).GetClusterInfo+0x53		/go/pkg/mod/github.com/argoproj/gitops-engine@v0.7.1-0.20230607163028-425d65e07695/pkg/cache/cluster.go:1058
#	0x21fb687	github.com/argoproj/argo-cd/v2/controller/cache.(*liveStateCache).GetClustersInfo+0x2c7	/go/src/github.com/argoproj/argo-cd/controller/cache/cache.go:797
#	0x21044ed	github.com/argoproj/argo-cd/v2/controller/metrics.(*clusterCollector).Run+0xcd		/go/src/github.com/argoproj/argo-cd/controller/metrics/clustercollector.go:71

1 @ 0x43bf36 0x44d6af 0x44d686 0x46c4e6 0x1eb05f9 0x1eb05d4 0x21fb688 0x2279676 0x22795cf 0x470861
#	0x46c4e5	sync.runtime_SemacquireRWMutexR+0x25							/usr/local/go/src/runtime/sema.go:82
#	0x1eb05f8	sync.(*RWMutex).RLock+0x78								/usr/local/go/src/sync/rwmutex.go:71
#	0x1eb05d3	github.com/argoproj/gitops-engine/pkg/cache.(*clusterCache).GetClusterInfo+0x53		/go/pkg/mod/github.com/argoproj/gitops-engine@v0.7.1-0.20230607163028-425d65e07695/pkg/cache/cluster.go:1058
#	0x21fb687	github.com/argoproj/argo-cd/v2/controller/cache.(*liveStateCache).GetClustersInfo+0x2c7	/go/src/github.com/argoproj/argo-cd/controller/cache/cache.go:797
#	0x2279675	github.com/argoproj/argo-cd/v2/controller.(*clusterInfoUpdater).updateClusters+0x55	/go/src/github.com/argoproj/argo-cd/controller/clusterinfoupdater.go:63
#	0x22795ce	github.com/argoproj/argo-cd/v2/controller.(*clusterInfoUpdater).Run+0xae		/go/src/github.com/argoproj/argo-cd/controller/clusterinfoupdater.go:56

1 @ 0x43bf36 0x44d6af 0x44d686 0x46c4e6 0x1ead50b 0x1ead4e6 0x21f95ec 0x2267d75 0x2266fc5 0x22730d5 0x226bc89 0x131c37e 0x131c236 0x131c129 0x131c065 0x470861
#	0x46c4e5	sync.runtime_SemacquireRWMutexR+0x25									/usr/local/go/src/runtime/sema.go:82
#	0x1ead50a	sync.(*RWMutex).RLock+0xaa										/usr/local/go/src/sync/rwmutex.go:71
#	0x1ead4e5	github.com/argoproj/gitops-engine/pkg/cache.(*clusterCache).IterateHierarchy+0x85			/go/pkg/mod/github.com/argoproj/gitops-engine@v0.7.1-0.20230607163028-425d65e07695/pkg/cache/cluster.go:855
#	0x21f95eb	github.com/argoproj/argo-cd/v2/controller/cache.(*liveStateCache).IterateHierarchy+0xcb			/go/src/github.com/argoproj/argo-cd/controller/cache/cache.go:587
#	0x2267d74	github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).getResourceTree+0x7b4		/go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:466
#	0x2266fc4	github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).setAppManagedResources+0xa4		/go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:386
#	0x22730d4	github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).processAppRefreshQueueItem+0x1234	/go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:1444
#	0x226bc88	github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).Run.func3+0x28			/go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:738
#	0x131c37d	k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1+0x3d						/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:155
#	0x131c235	k8s.io/apimachinery/pkg/util/wait.BackoffUntil+0xb5							/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:156
#	0x131c128	k8s.io/apimachinery/pkg/util/wait.JitterUntil+0x88							/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:133
#	0x131c064	k8s.io/apimachinery/pkg/util/wait.Until+0x24								/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:90

96 @ 0x43bf36 0x46c72c 0x46c70c 0x47948c 0x13222e5 0x2271ee7 0x226bc89 0x131c37e 0x131c236 0x131c129 0x131c065 0x470861
#	0x46c70b	sync.runtime_notifyListWait+0x12b									/usr/local/go/src/runtime/sema.go:527
#	0x47948b	sync.(*Cond).Wait+0x8b											/usr/local/go/src/sync/cond.go:70
#	0x13222e4	k8s.io/client-go/util/workqueue.(*Type).Get+0xa4							/go/pkg/mod/k8s.io/client-go@v0.24.2/util/workqueue/queue.go:157
#	0x2271ee6	github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).processAppRefreshQueueItem+0x46	/go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:1319
#	0x226bc88	github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).Run.func3+0x28			/go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:738
#	0x131c37d	k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1+0x3d						/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:155
#	0x131c235	k8s.io/apimachinery/pkg/util/wait.BackoffUntil+0xb5							/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:156
#	0x131c128	k8s.io/apimachinery/pkg/util/wait.JitterUntil+0x88							/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:133
#	0x131c064	k8s.io/apimachinery/pkg/util/wait.Until+0x24								/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:90

3 @ 0x43bf36 0x44d6af 0x44d686 0x46c4e6 0x1ea6385 0x1ea6359 0x21f9e13 0x227bda2 0x2284285 0x2272efe 0x226bc89 0x131c37e 0x131c236 0x131c129 0x131c065 0x470861
#	0x46c4e5	sync.runtime_SemacquireRWMutexR+0x25									/usr/local/go/src/runtime/sema.go:82
#	0x1ea6384	sync.(*RWMutex).RLock+0x64										/usr/local/go/src/sync/rwmutex.go:71
#	0x1ea6358	github.com/argoproj/gitops-engine/pkg/cache.(*clusterCache).GetAPIResources+0x38			/go/pkg/mod/github.com/argoproj/gitops-engine@v0.7.1-0.20230607163028-425d65e07695/pkg/cache/cluster.go:289
#	0x21f9e12	github.com/argoproj/argo-cd/v2/controller/cache.(*liveStateCache).GetVersionsInfo+0x72			/go/src/github.com/argoproj/argo-cd/controller/cache/cache.go:635
#	0x227bda1	github.com/argoproj/argo-cd/v2/controller.(*appStateManager).getRepoObjs+0x561				/go/src/github.com/argoproj/argo-cd/controller/state.go:148
#	0x2284284	github.com/argoproj/argo-cd/v2/controller.(*appStateManager).CompareAppState+0x5284			/go/src/github.com/argoproj/argo-cd/controller/state.go:400
#	0x2272efd	github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).processAppRefreshQueueItem+0x105d	/go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:1434
#	0x226bc88	github.com/argoproj/argo-cd/v2/controller.(*ApplicationController).Run.func3+0x28			/go/src/github.com/argoproj/argo-cd/controller/appcontroller.go:738
#	0x131c37d	k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1+0x3d						/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:155
#	0x131c235	k8s.io/apimachinery/pkg/util/wait.BackoffUntil+0xb5							/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:156
#	0x131c128	k8s.io/apimachinery/pkg/util/wait.JitterUntil+0x88							/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:133
#	0x131c064	k8s.io/apimachinery/pkg/util/wait.Until+0x24								/go/pkg/mod/k8s.io/apimachinery@v0.24.2/pkg/util/wait/wait.go:90
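
The first trace shows the resource-updated handler (getCluster.func2) calling GetClusterInfo from inside processEvent and blocking on RLock. That is consistent with a handler trying to take a read lock on a sync.RWMutex that its caller already holds for writing, which Go's RWMutex does not support. Below is a minimal sketch of that pattern. The clusterCache type, its methods, and serverURL are hypothetical stand-ins, not the gitops-engine or Argo CD code, and TryRLock (Go 1.18+) is used so the example reports the condition instead of hanging.

```go
package main

import (
	"fmt"
	"sync"
)

// clusterCache is a hypothetical stand-in for a cache guarded by an RWMutex.
type clusterCache struct {
	lock    sync.RWMutex
	server  string
	handler func(c *clusterCache)
}

// GetServer takes the read lock, like a typical getter on the cache.
func (c *clusterCache) GetServer() string {
	c.lock.RLock()
	defer c.lock.RUnlock()
	return c.server
}

// processEvent holds the write lock for the whole call, including the handler.
func (c *clusterCache) processEvent() {
	c.lock.Lock()
	defer c.lock.Unlock()
	c.handler(c)
}

// readLockAvailable reports, without blocking, whether RLock would succeed now.
func (c *clusterCache) readLockAvailable() bool {
	if c.lock.TryRLock() {
		c.lock.RUnlock()
		return true
	}
	return false
}

// demo runs one event and records what the handler observed.
func demo() (insideHandler, afterEvent bool, server string) {
	const serverURL = "https://kubernetes.default.svc" // assumed value for illustration
	c := &clusterCache{server: serverURL}
	c.handler = func(c *clusterCache) {
		// Calling c.GetServer() here would block forever: the goroutine
		// already holds the write lock, so RLock can never be granted.
		insideHandler = c.readLockAvailable()
		// Safe alternative: use a value captured outside the locked section.
		server = serverURL
	}
	c.processEvent()
	afterEvent = c.readLockAvailable()
	return insideHandler, afterEvent, server
}

func main() {
	inside, after, server := demo()
	fmt.Println(inside, after, server) // false true https://kubernetes.default.svc
}
```

While the handler is stuck, every other caller of RLock on the same mutex also waits, which matches the remaining blocked goroutines in the dump (GetClustersInfo, IterateHierarchy, GetAPIResources).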


codecov bot commented Jul 1, 2023

Codecov Report

Patch and project coverage have no change.

Comparison is base (8032601) 49.75% compared to head (e48bf27) 49.76%.

Additional details and impacted files
@@           Coverage Diff           @@
##           master   #14304   +/-   ##
=======================================
  Coverage   49.75%   49.76%           
=======================================
  Files         261      261           
  Lines       44659    44659           
=======================================
+ Hits        22222    22225    +3     
+ Misses      20251    20249    -2     
+ Partials     2186     2185    -1     
Impacted Files Coverage Δ
controller/cache/cache.go 25.10% <0.00%> (ø)

... and 1 file with indirect coverage changes


jannfis (Member) commented Jul 1, 2023

Please submit the PR against master; we'll cherry-pick it into release-2.8.

@crenshaw-dev crenshaw-dev changed the base branch from release-2.8 to master July 1, 2023 16:02

crenshaw-dev (Member) commented

Changed the target branch, but diff is gonna be weird until rebase.

crenshaw-dev (Member) commented

/cherry-pick release-2.8

Signed-off-by: Alexandre Gaudreault <alexandre.gaudreault@logmein.com>

jannfis (Member) left a comment

LGTM, thanks

@jannfis jannfis merged commit 53db27e into argoproj:master Jul 1, 2023
gcp-cherry-pick-bot bot pushed a commit that referenced this pull request Jul 1, 2023
Signed-off-by: Alexandre Gaudreault <alexandre.gaudreault@logmein.com>
crenshaw-dev pushed a commit that referenced this pull request Jul 1, 2023
Signed-off-by: Alexandre Gaudreault <alexandre.gaudreault@logmein.com>
Co-authored-by: Alexandre Gaudreault <alexandre.gaudreault@logmein.com>
yyzxw pushed a commit to yyzxw/argo-cd that referenced this pull request Aug 9, 2023
Signed-off-by: Alexandre Gaudreault <alexandre.gaudreault@logmein.com>
@suzaku suzaku mentioned this pull request Aug 15, 2023
tesla59 pushed a commit to tesla59/argo-cd that referenced this pull request Dec 16, 2023
Signed-off-by: Alexandre Gaudreault <alexandre.gaudreault@logmein.com>