[BUG] datadog-cluster-agent crashes with kube-state-metrics #18904

Closed
hameno opened this issue Aug 20, 2023 · 1 comment


hameno commented Aug 20, 2023

Agent Environment
7.46.0 (installed via Helm 3.33.10)

Describe what happened:
datadog-cluster-agent crashes with kubeStateMetricsCore enabled.

2023-08-20T10:18:06.940436413Z panic: runtime error: invalid memory address or nil pointer dereference [recovered]
2023-08-20T10:18:06.940472884Z 	panic: runtime error: invalid memory address or nil pointer dereference
2023-08-20T10:18:06.940481134Z [signal SIGSEGV: segmentation violation code=0x1 addr=0x10 pc=0x2d00e9c]
2023-08-20T10:18:06.940483327Z 
2023-08-20T10:18:06.940485816Z goroutine 631 [running]:
2023-08-20T10:18:06.940487716Z k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x2540be400?})
2023-08-20T10:18:06.940489781Z 	/go/pkg/mod/k8s.io/apimachinery@v0.23.15/pkg/util/runtime/runtime.go:55 +0xd7
2023-08-20T10:18:06.940491778Z panic({0x333a760, 0x5fe0a80})
2023-08-20T10:18:06.940493861Z 	/goroot/src/runtime/panic.go:884 +0x212
2023-08-20T10:18:06.940495644Z k8s.io/apimachinery/pkg/api/resource.(*Quantity).AsInt64(...)
2023-08-20T10:18:06.940497781Z 	/go/pkg/mod/k8s.io/apimachinery@v0.23.15/pkg/api/resource/quantity.go:469
2023-08-20T10:18:06.940501501Z github.com/DataDog/datadog-agent/pkg/collector/corechecks/cluster/ksm/customresources.(*hpav2Factory).MetricFamilyGenerators.func5(0xc001978940)
2023-08-20T10:18:06.940504741Z 	/go/src/github.com/DataDog/datadog-agent/pkg/collector/corechecks/cluster/ksm/customresources/hpa.go:165 +0x25c
2023-08-20T10:18:06.940506887Z github.com/DataDog/datadog-agent/pkg/collector/corechecks/cluster/ksm/customresources.wrapHPAFunc.func1({0x38525e0?, 0xc001978940})
2023-08-20T10:18:06.940508555Z 	/go/src/github.com/DataDog/datadog-agent/pkg/collector/corechecks/cluster/ksm/customresources/hpa.go:379 +0x5a
2023-08-20T10:18:06.940510333Z k8s.io/kube-state-metrics/v2/pkg/metric_generator.(*FamilyGenerator).Generate(...)
2023-08-20T10:18:06.940512515Z 	/go/pkg/mod/github.com/datadog/kube-state-metrics/v2@v2.2.2-0.20230217083638-a9a9c0ff16f4/pkg/metric_generator/generator.go:77
2023-08-20T10:18:06.940534501Z k8s.io/kube-state-metrics/v2/pkg/metric_generator.ComposeMetricGenFuncs.func1({0x38525e0, 0xc001978940})
2023-08-20T10:18:06.940536597Z 	/go/pkg/mod/github.com/datadog/kube-state-metrics/v2@v2.2.2-0.20230217083638-a9a9c0ff16f4/pkg/metric_generator/generator.go:123 +0xd8
2023-08-20T10:18:06.940538331Z github.com/DataDog/datadog-agent/pkg/kubestatemetrics/store.(*MetricsStore).Add(0xc000f41cc0, {0x38525e0, 0xc001978940})
2023-08-20T10:18:06.940540010Z 	/go/src/github.com/DataDog/datadog-agent/pkg/kubestatemetrics/store/store.go:84 +0x79
2023-08-20T10:18:06.940542093Z github.com/DataDog/datadog-agent/pkg/kubestatemetrics/store.(*MetricsStore).Replace(0xc00153c460?, {0xc0004d1500?, 0x35, 0xc00153c438?}, {0xc13097f7b71d5f59?, 0x25f8c216?})
2023-08-20T10:18:06.940544447Z 	/go/src/github.com/DataDog/datadog-agent
+0x68
2023-08-20T10:18:06.940546101Z k8s.io/client-go/tools/cache.(*Reflector).syncWith(0xc0010b0700, {0xc00013f500, 0x35, 0x0?}, {0xc001731e90, 0x9})
2023-08-20T10:18:06.940547756Z 	/go/pkg/mod/k8s.io/client-go@v0.23.15/tools/cache/reflector.go:456 +0x98
2023-08-20T10:18:06.940549388Z k8s.io/client-go/tools/cache.(*Reflector).ListAndWatch.func1(0xc0010b0700, 0xc00153c360, 0xc000f345a0, 0xc0011cfdb0)
2023-08-20T10:18:06.940550968Z 	/go/pkg/mod/k8s.io/client-go@v0.23.15/tools/cache/reflector.go:354 +0x794
2023-08-20T10:18:06.940552518Z k8s.io/client-go/tools/cache.(*Reflector).ListAndWatch(0xc0010b0700, 0xc000f345a0)
2023-08-20T10:18:06.940556912Z 	/go/pkg/mod/k8s.io/client-go@v0.23.15/tools/cache/reflector.go:361 +0x214
2023-08-20T10:18:06.940558670Z k8s.io/client-go/tools/cache.(*Reflector).Run.func1()
2023-08-20T10:18:06.940560271Z 	/go/pkg/mod/k8s.io/client-go@v0.23.15/tools/cache/reflector.go:221 +0x26
2023-08-20T10:18:06.940561821Z k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0xc000f264e0?)
2023-08-20T10:18:06.940563690Z 	/go/pkg/mod/k8s.io/apimachinery@v0.23.15/pkg/util/wait/wait.go:155 +0x3e
2023-08-20T10:18:06.940565345Z k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc000408780?, {0x443e5c0, 0xc000f729b0}, 0x1, 0xc000f345a0)
2023-08-20T10:18:06.940569250Z 	/go/pkg/mod/k8s.io/apimachinery@v0.23.15/pkg/util/wait/wait.go:156 +0xb6
2023-08-20T10:18:06.940570887Z k8s.io/client-go/tools/cache.(*Reflector).Run(0xc0010b0700, 0xc000f345a0)
2023-08-20T10:18:06.940572397Z 	/go/pkg/mod/k8s.io/client-go@v0.23.15/tools/cache/reflector.go:220 +0x17d
2023-08-20T10:18:06.940574259Z created by github.com/DataDog/datadog-agent/pkg/kubestatemetrics/builder.(*Builder).startReflector
2023-08-20T10:18:06.940575800Z 	/go/src/github.com/DataDog/datadog-agent/pkg/kubestatemetrics/builder/builder.go:237 +0xed

Describe what you expected:
The cluster agent should run without crashing.

Steps to reproduce the issue:
Install EKS 1.27, with rancher-monitoring 102.0.1+up40.1.2 and Datadog

Additional environment details (Operating System, Cloud provider, etc):
EKS 1.27

davidor (Member) commented Sep 20, 2023

This should be fixed by the changes introduced in this PR: #18580

The bug is triggered when the cluster contains HPA objects with metrics of type "Object" and target type "AverageValue".
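
For readers hitting the same trace, here is a minimal sketch of the failure mode it appears to show. This is not the code from hpa.go or from the fix PR. `Quantity` and `MetricTarget` below are simplified, hypothetical stand-ins for `resource.Quantity` and the autoscaling/v2 `MetricTarget`, and the helper names `unsafeTargetValue`, `safeTargetValue`, and `panics` are made up for illustration. The assumption is that an "AverageValue" target leaves the `Value` pointer nil, so calling `AsInt64` on it dereferences nil, which would be consistent with the `(*Quantity).AsInt64` frame in the panic above.

```go
package main

import "fmt"

// Quantity is a hypothetical, simplified stand-in for resource.Quantity.
type Quantity struct {
	value int64
}

// AsInt64 reads a field of its receiver, so it panics on a nil *Quantity.
func (q *Quantity) AsInt64() (int64, bool) {
	return q.value, true
}

// MetricTarget is a simplified stand-in for the autoscaling/v2 type:
// depending on Type, only one of the pointer fields is expected to be set.
type MetricTarget struct {
	Type         string
	Value        *Quantity
	AverageValue *Quantity
}

// unsafeTargetValue assumes Value is always set (the suspected failing pattern).
func unsafeTargetValue(t MetricTarget) int64 {
	v, _ := t.Value.AsInt64() // nil pointer dereference when t.Value is nil
	return v
}

// safeTargetValue checks each pointer before using it and reports
// whether any quantity was available.
func safeTargetValue(t MetricTarget) (int64, bool) {
	switch {
	case t.Value != nil:
		return t.Value.AsInt64()
	case t.AverageValue != nil:
		return t.AverageValue.AsInt64()
	default:
		return 0, false
	}
}

// panics reports whether calling f panics.
func panics(f func()) (p bool) {
	defer func() {
		if recover() != nil {
			p = true
		}
	}()
	f()
	return false
}

func main() {
	// An "AverageValue" target: Value is left nil, AverageValue is set.
	target := MetricTarget{Type: "AverageValue", AverageValue: &Quantity{value: 500}}

	fmt.Println("unsafe access panics:", panics(func() { unsafeTargetValue(target) }))

	v, ok := safeTargetValue(target)
	fmt.Println("safe access:", v, ok)
}
```

The nil checks in `safeTargetValue` are only one possible guard; the actual change in the referenced PR may look different.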

davidor closed this as completed Sep 20, 2023