Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Inconsistency between customresource interface and customresource causes panic #2202

Open
AliDatadog opened this issue Sep 19, 2023 · 7 comments
Assignees
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@AliDatadog
Copy link

AliDatadog commented Sep 19, 2023

Issue Description:
After the alteration of the custom resources handling behavior in #1851, there has been a breaking change.

  1. GVRFromType is invariably invoked:
    gvr := util.GVRFromType(f.Name(), f.ExpectedType())
  2. The subsequent function https://github.com/rexagod/kube-state-metrics/blob/25a1d8da057cf761d614c59a52785335d34082d1/pkg/util/utils.go#L143 puts forward a presumption that ExpectedType() persistently returns unstructured.Unstructured{}. This triggers panic if a different argument type is provided.

The change leads to a breaking compatibility with the previous interface that we use internally to enhance KSM with other metrics.

Moreover, the availableStore index now uses gvr.String() over ResourceName as per

gvrString = gvr.String()
. Consequently, the output of gvrString is structured as Resource=verticalpodautoscalers,apiregistration.k8s.io/v1

Expected Outcome:
I anticipated ExpectedType() to have the capacity to return any resource type and f.Name() to serve as the resource index in availableStore.
Another solution would be to introduce a new function StoreIndex() or StoreName() in this interface.

Reproducibility Steps:
To reproduce this issue, simply add VPA as a custom resource and observe that the code triggers panic.

Additional Information:
None applicable.
Environment Details:
I'm using the kind cluster 1.27.1.

  • Kube-state-metrics version: 2.10.0
  • Kubernetes version (retrievable using kubectl version): 1.27.1
  • Cloud provider or hardware setup: macOS ARM
  • Any Other Relevant Information:
@AliDatadog AliDatadog added the kind/bug Categorizes issue or PR as related to a bug. label Sep 19, 2023
@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Sep 19, 2023
@dashpole
Copy link

/assign @rexagod
/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 21, 2023
@CatherineF-dev
Copy link
Contributor

Thanks for reporting this issue, found you have already used v2.8.2 as a mitigation.

@CatherineF-dev
Copy link
Contributor

To reproduce this issue, simply add VPA as a custom resource and observe that the code triggers panic.

QQ: how did you notice this issue, is it possible to move some tests into this repo? If so, it won't break your components in the future.

@AliDatadog
Copy link
Author

Unfortunately we did not implement a test able to catch this issue in particular.

Perhaps we could have a test checking that a simple extended store can be generated.

Here is an example of one: https://github.com/DataDog/datadog-agent/blob/0ee3251831d2a9da15f7c47e843ebaacc8b85aab/pkg/collector/corechecks/cluster/ksm/customresources/crd.go#L50

@AliDatadog
Copy link
Author

Any update on this issue ? @rexagod @CatherineF-dev

@CatherineF-dev
Copy link
Contributor

Could you try fixing this issue? cc @AliDatadog I haven't looked into details, but I can help review.

@rexagod might know details.

@rexagod
Copy link
Member

rexagod commented Jan 11, 2024

@AliDatadog Are you using KSM as a library? We do not provide any guarantees on the exported APIs as of now. cc @dgrisonnet

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

No branches or pull requests

5 participants