Large Number of VPA Objects Results in Segfault with verticalpodautoscalers Enabled #830

Closed
stewbernetes opened this issue Jul 18, 2019 · 4 comments


stewbernetes commented Jul 18, 2019

Is this a BUG REPORT or FEATURE REQUEST?:

/kind bug

What happened:
I am trying to take advantage of the VPA metrics added in the 1.7.0 release.

When I enable all collectors plus the VPA collector as follows:

      - name: kube-state-metrics
        image: quay.io/coreos/kube-state-metrics:v1.7.0
        args:
          - '--collectors=certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,limitranges,namespaces,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,verticalpodautoscalers'
          - '--v=5'

I experience the following crash:

I0718 14:46:48.473233       1 main.go:90] Using collectors certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,limitranges,namespaces,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,verticalpodautoscalers
I0718 14:46:48.473295       1 main.go:99] Using all namespace
I0718 14:46:48.473302       1 main.go:140] metric white-blacklisting: blacklisting the following items:
W0718 14:46:48.473323       1 client_config.go:541] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I0718 14:46:48.474207       1 main.go:185] Testing communication with server
I0718 14:46:48.487955       1 main.go:190] Running with Kubernetes cluster version: v1.11. git version: v1.11.7. git tree state: clean. commit: 65ecaf0671341311ce6aea0edab46ee69f65d59e. platform: linux/amd64
I0718 14:46:48.487972       1 main.go:192] Communication with server successful
I0718 14:46:48.488128       1 main.go:201] Starting kube-state-metrics self metrics server: 0.0.0.0:8081
I0718 14:46:48.488183       1 reflector.go:122] Starting reflector *v1beta1.CertificateSigningRequest (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488215       1 reflector.go:160] Listing and watching *v1beta1.CertificateSigningRequest from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488218       1 reflector.go:122] Starting reflector *v1.Deployment (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488236       1 reflector.go:160] Listing and watching *v1.Deployment from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488327       1 reflector.go:122] Starting reflector *v1.ResourceQuota (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488346       1 reflector.go:122] Starting reflector *v1beta1.Ingress (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488348       1 reflector.go:160] Listing and watching *v1.ResourceQuota from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488353       1 reflector.go:160] Listing and watching *v1beta1.Ingress from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488396       1 reflector.go:122] Starting reflector *v1.ReplicaSet (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488400       1 reflector.go:122] Starting reflector *v1.PersistentVolume (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488405       1 reflector.go:122] Starting reflector *v2beta1.HorizontalPodAutoscaler (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488414       1 reflector.go:122] Starting reflector *v1.ReplicationController (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488419       1 reflector.go:160] Listing and watching *v1.PersistentVolume from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488421       1 reflector.go:160] Listing and watching *v2beta1.HorizontalPodAutoscaler from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488425       1 reflector.go:160] Listing and watching *v1.ReplicationController from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488411       1 reflector.go:160] Listing and watching *v1.ReplicaSet from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488417       1 reflector.go:122] Starting reflector *v1.Pod (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488329       1 reflector.go:122] Starting reflector *v1beta1.CronJob (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488336       1 reflector.go:122] Starting reflector *v1.ConfigMap (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488978       1 builder.go:126] Active collectors: certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,limitranges,namespaces,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,verticalpodautoscalers
I0718 14:46:48.488991       1 main.go:226] Starting metrics server: 0.0.0.0:8080
I0718 14:46:48.489016       1 reflector.go:122] Starting reflector *v1.Secret (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489039       1 reflector.go:160] Listing and watching *v1.Secret from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489060       1 reflector.go:122] Starting reflector *v1.Service (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489082       1 reflector.go:160] Listing and watching *v1.ConfigMap from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489090       1 reflector.go:160] Listing and watching *v1.Service from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489118       1 reflector.go:122] Starting reflector *v1.StatefulSet (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489136       1 reflector.go:160] Listing and watching *v1.StatefulSet from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489147       1 reflector.go:122] Starting reflector *v1.StorageClass (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488862       1 reflector.go:122] Starting reflector *v1.LimitRange (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489161       1 reflector.go:160] Listing and watching *v1.StorageClass from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489163       1 reflector.go:160] Listing and watching *v1.LimitRange from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489141       1 reflector.go:122] Starting reflector *v1.Endpoints (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488377       1 reflector.go:122] Starting reflector *v1.Namespace (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488357       1 reflector.go:122] Starting reflector *v1.Job (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489147       1 reflector.go:122] Starting reflector *v1beta2.VerticalPodAutoscaler (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489237       1 reflector.go:160] Listing and watching *v1.Pod from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488373       1 reflector.go:122] Starting reflector *v1beta1.PodDisruptionBudget (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489271       1 reflector.go:160] Listing and watching *v1beta1.PodDisruptionBudget from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488391       1 reflector.go:122] Starting reflector *v1.PersistentVolumeClaim (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488367       1 reflector.go:122] Starting reflector *v1.DaemonSet (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489186       1 reflector.go:160] Listing and watching *v1.Endpoints from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489216       1 reflector.go:160] Listing and watching *v1.Namespace from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.488381       1 reflector.go:122] Starting reflector *v1.Node (0s) from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489345       1 reflector.go:160] Listing and watching *v1beta1.CronJob from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489359       1 reflector.go:160] Listing and watching *v1.PersistentVolumeClaim from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489396       1 reflector.go:160] Listing and watching *v1.Node from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489406       1 reflector.go:160] Listing and watching *v1.DaemonSet from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489223       1 reflector.go:160] Listing and watching *v1.Job from k8s.io/kube-state-metrics/internal/store/builder.go:295
I0718 14:46:48.489234       1 reflector.go:160] Listing and watching *v1beta2.VerticalPodAutoscaler from k8s.io/kube-state-metrics/internal/store/builder.go:295
E0718 14:46:48.557239       1 runtime.go:73] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 31 [running]:
k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/runtime.logPanic(0x12a2f40, 0x21c7590)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:69 +0x7b
k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:51 +0x82
panic(0x12a2f40, 0x21c7590)
	/usr/local/go/src/runtime/panic.go:522 +0x1b5
k8s.io/kube-state-metrics/internal/store.glob..func233(0xc0008e0000, 0xc001c04f50)
	/go/src/k8s.io/kube-state-metrics/internal/store/verticalpodautoscaler.go:99 +0x53
k8s.io/kube-state-metrics/internal/store.wrapVPAFunc.func1(0x13e0300, 0xc0008e0000, 0xc0020b27e0)
	/go/src/k8s.io/kube-state-metrics/internal/store/verticalpodautoscaler.go:233 +0x5c
k8s.io/kube-state-metrics/pkg/metric.(*FamilyGenerator).Generate(...)
	/go/src/k8s.io/kube-state-metrics/pkg/metric/generator.go:39
k8s.io/kube-state-metrics/pkg/metric.ComposeMetricGenFuncs.func1(0x13e0300, 0xc0008e0000, 0xc0000f2bc0, 0xc0008e0000, 0x0)
	/go/src/k8s.io/kube-state-metrics/pkg/metric/generator.go:78 +0x123
k8s.io/kube-state-metrics/pkg/metrics_store.(*MetricsStore).Add(0xc0000f2bc0, 0x13e0300, 0xc0008e0000, 0x0, 0x0)
	/go/src/k8s.io/kube-state-metrics/pkg/metrics_store/metrics_store.go:76 +0xd2
k8s.io/kube-state-metrics/pkg/metrics_store.(*MetricsStore).Replace(0xc0000f2bc0, 0xc00021b900, 0x125, 0x125, 0xc0020c8398, 0x8, 0x2135cf1e0103e1f4, 0x5d308658)
	/go/src/k8s.io/kube-state-metrics/pkg/metrics_store/metrics_store.go:138 +0xa4
k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache.(*Reflector).syncWith(0xc0000e7540, 0xc000219300, 0x125, 0x125, 0xc0020c8398, 0x8, 0x0, 0xc0005b41e0)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache/reflector.go:315 +0xf8
k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache.(*Reflector).ListAndWatch.func1(0xc0000e7540, 0xc0005b0680, 0xc00042a120, 0xc0020dbcc0, 0x0, 0x0)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache/reflector.go:215 +0x8ae
k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache.(*Reflector).ListAndWatch(0xc0000e7540, 0xc00042a120, 0x0, 0x0)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache/reflector.go:222 +0x17b
k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache.(*Reflector).Run.func1()
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache/reflector.go:124 +0x33
k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0000d1f78)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x54
k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0020dbf78, 0x3b9aca00, 0x0, 0x1, 0xc00042a120)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xf8
k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/wait.Until(...)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache.(*Reflector).Run(0xc0000e7540, 0xc00042a120)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache/reflector.go:123 +0x16b
created by k8s.io/kube-state-metrics/internal/store.(*Builder).reflectorPerNamespace
	/go/src/k8s.io/kube-state-metrics/internal/store/builder.go:296 +0x12b
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x116e663]

goroutine 31 [running]:
k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/runtime.HandleCrash(0x0, 0x0, 0x0)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/runtime/runtime.go:58 +0x105
panic(0x12a2f40, 0x21c7590)
	/usr/local/go/src/runtime/panic.go:522 +0x1b5
k8s.io/kube-state-metrics/internal/store.glob..func233(0xc0008e0000, 0xc001c04f50)
	/go/src/k8s.io/kube-state-metrics/internal/store/verticalpodautoscaler.go:99 +0x53
k8s.io/kube-state-metrics/internal/store.wrapVPAFunc.func1(0x13e0300, 0xc0008e0000, 0xc0020b27e0)
	/go/src/k8s.io/kube-state-metrics/internal/store/verticalpodautoscaler.go:233 +0x5c
k8s.io/kube-state-metrics/pkg/metric.(*FamilyGenerator).Generate(...)
	/go/src/k8s.io/kube-state-metrics/pkg/metric/generator.go:39
k8s.io/kube-state-metrics/pkg/metric.ComposeMetricGenFuncs.func1(0x13e0300, 0xc0008e0000, 0xc0000f2bc0, 0xc0008e0000, 0x0)
	/go/src/k8s.io/kube-state-metrics/pkg/metric/generator.go:78 +0x123
k8s.io/kube-state-metrics/pkg/metrics_store.(*MetricsStore).Add(0xc0000f2bc0, 0x13e0300, 0xc0008e0000, 0x0, 0x0)
	/go/src/k8s.io/kube-state-metrics/pkg/metrics_store/metrics_store.go:76 +0xd2
k8s.io/kube-state-metrics/pkg/metrics_store.(*MetricsStore).Replace(0xc0000f2bc0, 0xc00021b900, 0x125, 0x125, 0xc0020c8398, 0x8, 0x2135cf1e0103e1f4, 0x5d308658)
	/go/src/k8s.io/kube-state-metrics/pkg/metrics_store/metrics_store.go:138 +0xa4
k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache.(*Reflector).syncWith(0xc0000e7540, 0xc000219300, 0x125, 0x125, 0xc0020c8398, 0x8, 0x0, 0xc0005b41e0)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache/reflector.go:315 +0xf8
k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache.(*Reflector).ListAndWatch.func1(0xc0000e7540, 0xc0005b0680, 0xc00042a120, 0xc0020dbcc0, 0x0, 0x0)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache/reflector.go:215 +0x8ae
k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache.(*Reflector).ListAndWatch(0xc0000e7540, 0xc00042a120, 0x0, 0x0)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache/reflector.go:222 +0x17b
k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache.(*Reflector).Run.func1()
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache/reflector.go:124 +0x33
k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1(0xc0000d1f78)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152 +0x54
k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil(0xc0020dbf78, 0x3b9aca00, 0x0, 0x1, 0xc00042a120)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153 +0xf8
k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/wait.Until(...)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88
k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache.(*Reflector).Run(0xc0000e7540, 0xc00042a120)
	/go/src/k8s.io/kube-state-metrics/vendor/k8s.io/client-go/tools/cache/reflector.go:123 +0x16b
created by k8s.io/kube-state-metrics/internal/store.(*Builder).reflectorPerNamespace
	/go/src/k8s.io/kube-state-metrics/internal/store/builder.go:296 +0x12b

When I confine kube-state-metrics to a single namespace that contains only a few VPA objects, the collector runs without issue.

What you expected to happen:
kube-state-metrics to execute correctly.

How to reproduce it (as minimally and precisely as possible):
Enable the verticalpodautoscalers collector on a cluster with a large number of VPA objects and run it globally without namespace restrictions.
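
For illustration, a minimal VPA object of the sort present in the cluster (all names here are hypothetical; the maintainer comment further down suggests the trigger is a VPA that omits spec.resourcePolicy rather than the object count):

apiVersion: autoscaling.k8s.io/v1beta2
kind: VerticalPodAutoscaler
metadata:
  name: example-vpa          # hypothetical name
  namespace: default
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: example-app        # hypothetical target
  updatePolicy:
    updateMode: "Off"
  # note: no resourcePolicy block is set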

Anything else we need to know?:
I currently have 293 VPA objects across this particular cluster.

Environment:

  • Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-20T04:49:16Z", GoVersion:"go1.12.6", Compiler:"gc", Platform:"darwin/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.7", GitCommit:"65ecaf0671341311ce6aea0edab46ee69f65d59e", GitTreeState:"clean", BuildDate:"2019-01-24T19:22:45Z", GoVersion:"go1.10.7", Compiler:"gc", Platform:"linux/amd64"}
  • Kube-state-metrics image version:
      - name: kube-state-metrics
        image: quay.io/coreos/kube-state-metrics:v1.7.0
k8s-ci-robot added the kind/bug label on Jul 18, 2019.
brancz (Member) commented Jul 18, 2019

This doesn't seem to have anything to do with the number of VPA objects; rather, we didn't guard this dereference properly:

for _, c := range a.Spec.ResourcePolicy.ContainerPolicies {
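
For reference, a minimal sketch (assuming the upstream VPA v1beta2 API types) of the kind of nil guard needed here; this is illustrative only, not necessarily the change that landed in #832:

package store

import (
	autoscaling "k8s.io/autoscaler/vertical-pod-autoscaler/pkg/apis/autoscaling.k8s.io/v1beta2"
)

// containerPolicies is a hypothetical helper: it returns the container
// policies of a VPA, or nil when spec.resourcePolicy is unset, so callers
// never dereference a nil ResourcePolicy pointer.
func containerPolicies(a *autoscaling.VerticalPodAutoscaler) []autoscaling.ContainerResourcePolicy {
	if a == nil || a.Spec.ResourcePolicy == nil {
		return nil
	}
	return a.Spec.ResourcePolicy.ContainerPolicies
}

Callers can then range over the returned slice unconditionally, since ranging over a nil slice is a no-op in Go.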

tariq1890 (Contributor) commented:

closed by #832

davidquarles commented:

@tariq1890 This is still happening to me on both 1.7.2 and 1.8.0

tariq1890 (Contributor) commented:

Can you paste the error stack?
