
Crash on nonexistent metric paths in custom resources #1992

Closed
bartebor opened this issue Feb 14, 2023 · 4 comments · Fixed by #1998 or #2140
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@bartebor

What happened:
When defining a custom metric on a field that does not exist, kube-state-metrics crashes. This happens, for example, when referencing a status field of a newly created object that initially has no status at all (ADDED event).

What you expected to happen:
kube-state-metrics should not crash; it should ignore the metric instead.

How to reproduce it (as minimally and precisely as possible):
Use the following config (you may substitute the Prometheus CRD with something else, but you need at least one object of that kind):

kind: CustomResourceStateMetrics
spec:
  resources:
    - groupVersionKind:
        group: monitoring.coreos.com 
        kind: Prometheus
        version: v1
      metrics:
        - name: "some_stat"
          each:
            type: Gauge
            gauge:
              path: [notExists, someField]

and run kube-state-metrics:

 ./kube-state-metrics --v 16  --custom-resource-state-config-file custommetrics-test.yaml --resources=prometheuses
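The panic stems from the path lookup returning nil for a path that does not exist, and the gauge code then dereferencing that nil. A minimal, self-contained Go sketch of this failure mode (illustrative only — `getNested` is a hypothetical helper, not the actual kube-state-metrics code):

```go
package main

import "fmt"

// getNested mimics resolving a metric path on an unstructured object:
// it walks nested maps and returns nil as soon as any path element is
// missing or an intermediate value is not a map.
func getNested(obj map[string]interface{}, path ...string) interface{} {
	var cur interface{} = obj
	for _, key := range path {
		m, ok := cur.(map[string]interface{})
		if !ok {
			return nil
		}
		cur = m[key] // a missing key yields nil
	}
	return cur
}

func main() {
	obj := map[string]interface{}{"spec": map[string]interface{}{}}
	v := getNested(obj, "notExists", "someField")
	// Any downstream code that assumes v is non-nil (e.g. dereferencing a
	// pointer derived from it) panics with "invalid memory address or nil
	// pointer dereference", matching the trace below.
	fmt.Println(v == nil) // true
}
```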

Anything else we need to know?:
Here is a log fragment from a crash:

I0214 16:06:12.513522  357332 registry_factory.go:639] "Checked" compiledFamilyName="kube_customresource_some_stat" unstructuredName="kube-prometheus-stack-prometheus"
E0214 16:06:12.513710  357332 runtime.go:79] Observed a panic: "invalid memory address or nil pointer dereference" (runtime error: invalid memory address or nil pointer dereference)
goroutine 73 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic({0x1818d20?, 0x293bd80})
        /go/pkg/mod/k8s.io/apimachinery@v0.26.1/pkg/util/runtime/runtime.go:75 +0x99
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x2540be400?})
        /go/pkg/mod/k8s.io/apimachinery@v0.26.1/pkg/util/runtime/runtime.go:49 +0x75
panic({0x1818d20, 0x293bd80})
        /usr/local/go/src/runtime/panic.go:884 +0x212
k8s.io/kube-state-metrics/v2/pkg/customresourcestate.(*compiledGauge).Values(0xc000131d40, {0x0?, 0x0?})
        /go/src/k8s.io/kube-state-metrics/pkg/customresourcestate/registry_factory.go:285 +0x31a
k8s.io/kube-state-metrics/v2/pkg/customresourcestate.scrapeValuesFor({0x7f940b6c17d0, 0xc000131d40}, 0xc000526c60)
        /go/src/k8s.io/kube-state-metrics/pkg/customresourcestate/registry_factory.go:661 +0x8f
k8s.io/kube-state-metrics/v2/pkg/customresourcestate.generate(0xc00058ce20, {{0xc000500c00, 0x1d}, {0x0, 0x0}, {0x7f940b6c17d0, 0xc000131d40}, 0xc000526750, 0xc000526840, 0x0}, ...)
        /go/src/k8s.io/kube-state-metrics/pkg/customresourcestate/registry_factory.go:643 +0x29c
k8s.io/kube-state-metrics/v2/pkg/customresourcestate.famGen.func1({0x1a30ba0?, 0xc00058ce20?})
        /go/src/k8s.io/kube-state-metrics/pkg/customresourcestate/registry_factory.go:632 +0x6c
k8s.io/kube-state-metrics/v2/pkg/metric_generator.(*FamilyGenerator).Generate(...)
        /go/src/k8s.io/kube-state-metrics/pkg/metric_generator/generator.go:73
k8s.io/kube-state-metrics/v2/pkg/metric_generator.ComposeMetricGenFuncs.func1({0x1a30ba0, 0xc00058ce20})
        /go/src/k8s.io/kube-state-metrics/pkg/metric_generator/generator.go:119 +0xd8
k8s.io/kube-state-metrics/v2/pkg/metrics_store.(*MetricsStore).Add(0xc0002690c0, {0x1a30ba0, 0xc00058ce20})
        /go/src/k8s.io/kube-state-metrics/pkg/metrics_store/metrics_store.go:71 +0xd4
k8s.io/kube-state-metrics/v2/pkg/metrics_store.(*MetricsStore).Replace(0xc0002690c0, {0xc0004ed570, 0x1, 0xc00012ea68?}, {0xc0f306b91e9b14ba?, 0x1ab9623a?})
        /go/src/k8s.io/kube-state-metrics/pkg/metrics_store/metrics_store.go:133 +0xab
k8s.io/client-go/tools/cache.(*Reflector).syncWith(0xc0004be0f0, {0xc0004ed530, 0x1, 0x0?}, {0xc000560ba0, 0x9})
        /go/pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:469 +0x98
k8s.io/client-go/tools/cache.(*Reflector).list(0xc0004be0f0, 0xc0005aa1e0)
        /go/pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:454 +0x82b
k8s.io/client-go/tools/cache.(*Reflector).ListAndWatch(0xc0004be0f0, 0xc0005aa1e0)
        /go/pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:259 +0x152
k8s.io/client-go/tools/cache.(*Reflector).Run.func1()
        /go/pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:223 +0x26
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x2966ce0?)
        /go/pkg/mod/k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:157 +0x3e
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00026a2c0?, {0x1cbf340, 0xc0002745a0}, 0x1, 0xc0005aa1e0)
        /go/pkg/mod/k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:158 +0xb6
k8s.io/client-go/tools/cache.(*Reflector).Run(0xc0004be0f0, 0xc0005aa1e0)
        /go/pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:222 +0x185
created by k8s.io/kube-state-metrics/v2/internal/store.(*Builder).startReflector
        /go/src/k8s.io/kube-state-metrics/internal/store/builder.go:575 +0x2c5
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
        panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x164595a]

goroutine 73 [running]:
k8s.io/apimachinery/pkg/util/runtime.HandleCrash({0x0, 0x0, 0x2540be400?})
        /go/pkg/mod/k8s.io/apimachinery@v0.26.1/pkg/util/runtime/runtime.go:56 +0xd7
panic({0x1818d20, 0x293bd80})
        /usr/local/go/src/runtime/panic.go:884 +0x212
k8s.io/kube-state-metrics/v2/pkg/customresourcestate.(*compiledGauge).Values(0xc000131d40, {0x0?, 0x0?})
        /go/src/k8s.io/kube-state-metrics/pkg/customresourcestate/registry_factory.go:285 +0x31a
k8s.io/kube-state-metrics/v2/pkg/customresourcestate.scrapeValuesFor({0x7f940b6c17d0, 0xc000131d40}, 0xc000526c60)
        /go/src/k8s.io/kube-state-metrics/pkg/customresourcestate/registry_factory.go:661 +0x8f
k8s.io/kube-state-metrics/v2/pkg/customresourcestate.generate(0xc00058ce20, {{0xc000500c00, 0x1d}, {0x0, 0x0}, {0x7f940b6c17d0, 0xc000131d40}, 0xc000526750, 0xc000526840, 0x0}, ...)
        /go/src/k8s.io/kube-state-metrics/pkg/customresourcestate/registry_factory.go:643 +0x29c
k8s.io/kube-state-metrics/v2/pkg/customresourcestate.famGen.func1({0x1a30ba0?, 0xc00058ce20?})
        /go/src/k8s.io/kube-state-metrics/pkg/customresourcestate/registry_factory.go:632 +0x6c
k8s.io/kube-state-metrics/v2/pkg/metric_generator.(*FamilyGenerator).Generate(...)
        /go/src/k8s.io/kube-state-metrics/pkg/metric_generator/generator.go:73
k8s.io/kube-state-metrics/v2/pkg/metric_generator.ComposeMetricGenFuncs.func1({0x1a30ba0, 0xc00058ce20})
        /go/src/k8s.io/kube-state-metrics/pkg/metric_generator/generator.go:119 +0xd8
k8s.io/kube-state-metrics/v2/pkg/metrics_store.(*MetricsStore).Add(0xc0002690c0, {0x1a30ba0, 0xc00058ce20})
        /go/src/k8s.io/kube-state-metrics/pkg/metrics_store/metrics_store.go:71 +0xd4
k8s.io/kube-state-metrics/v2/pkg/metrics_store.(*MetricsStore).Replace(0xc0002690c0, {0xc0004ed570, 0x1, 0xc00012ea68?}, {0xc0f306b91e9b14ba?, 0x1ab9623a?})
        /go/src/k8s.io/kube-state-metrics/pkg/metrics_store/metrics_store.go:133 +0xab
k8s.io/client-go/tools/cache.(*Reflector).syncWith(0xc0004be0f0, {0xc0004ed530, 0x1, 0x0?}, {0xc000560ba0, 0x9})
        /go/pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:469 +0x98
k8s.io/client-go/tools/cache.(*Reflector).list(0xc0004be0f0, 0xc0005aa1e0)
        /go/pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:454 +0x82b
k8s.io/client-go/tools/cache.(*Reflector).ListAndWatch(0xc0004be0f0, 0xc0005aa1e0)
        /go/pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:259 +0x152
k8s.io/client-go/tools/cache.(*Reflector).Run.func1()
        /go/pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:223 +0x26
k8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1(0x2966ce0?)
        /go/pkg/mod/k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:157 +0x3e
k8s.io/apimachinery/pkg/util/wait.BackoffUntil(0xc00026a2c0?, {0x1cbf340, 0xc0002745a0}, 0x1, 0xc0005aa1e0)
        /go/pkg/mod/k8s.io/apimachinery@v0.26.1/pkg/util/wait/wait.go:158 +0xb6
k8s.io/client-go/tools/cache.(*Reflector).Run(0xc0004be0f0, 0xc0005aa1e0)
        /go/pkg/mod/k8s.io/client-go@v0.26.1/tools/cache/reflector.go:222 +0x185
created by k8s.io/kube-state-metrics/v2/internal/store.(*Builder).startReflector
        /go/src/k8s.io/kube-state-metrics/internal/store/builder.go:575 +0x2c5

Environment:

  • kube-state-metrics version: 2.8.0
  • Kubernetes version (use kubectl version):
$ kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"archive", BuildDate:"2022-04-02T14:49:13Z", GoVersion:"go1.18", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"23", GitVersion:"v1.23.12", GitCommit:"c6939792865ef0f70f92006081690d77411c8ed5", GitTreeState:"clean", BuildDate:"2022-09-21T12:13:07Z", GoVersion:"go1.17.13", Compiler:"gc", Platform:"linux/amd64"}
  • Cloud provider or hardware configuration: none (on premises)
  • Other info: -
@bartebor bartebor added the kind/bug Categorizes issue or PR as related to a bug. label Feb 14, 2023
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Feb 14, 2023
@mateusz-lubanski-sinch

I faced the same issue. After downgrading back to 2.7.0 it works fine.

@mrueg
Member

mrueg commented Feb 16, 2023

Thanks for the report, I can reproduce it.

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 16, 2023
@mrueg mrueg self-assigned this Feb 16, 2023
@mrueg
Member

mrueg commented Feb 16, 2023

It seems like this is a regression from bd2ea7a#diff-1cbeb50a6b171ac66f4fad40018421010ed1733ea97fcb40b65a7ebd5e7e3d77R433

@rexagod @dgrisonnet I believe we need to check whether the path exists and return an error instead, so we can distinguish between "path does not exist" and "value is nil"?
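One way to express that distinction is to return the value together with a found flag, so a missing path can be skipped while an unexpected intermediate type still surfaces as an error. A hedged sketch with a hypothetical `lookupPath` helper (names are illustrative, not the actual registry_factory API):

```go
package main

import "fmt"

// lookupPath walks a nested map along path. It returns the value, a flag
// reporting whether the path exists, and an error when an intermediate
// element is present but is not a map. A caller can then treat
// (nil, false, nil) as "skip this metric" instead of panicking.
func lookupPath(obj map[string]interface{}, path ...string) (interface{}, bool, error) {
	var cur interface{} = obj
	for i, key := range path {
		m, ok := cur.(map[string]interface{})
		if !ok {
			return nil, false, fmt.Errorf("element %v is not a map", path[:i])
		}
		cur, ok = m[key]
		if !ok {
			return nil, false, nil // path does not exist
		}
	}
	return cur, true, nil
}

func main() {
	obj := map[string]interface{}{
		"status": map[string]interface{}{"replicas": nil},
	}
	v, found, err := lookupPath(obj, "notExists", "someField")
	fmt.Println(v, found, err) // <nil> false <nil> — missing path, skip metric
	v, found, err = lookupPath(obj, "status", "replicas")
	fmt.Println(v, found, err) // <nil> true <nil> — path exists, value is nil
}
```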

@rexagod
Member

rexagod commented Feb 16, 2023

Hmm, right. We don't want to crash on non-existent paths.
/assign
