ksm doesn't report any metrics at all if it lacks rights for just 1 subject namespace #1413

Closed
Drugoy opened this issue Mar 15, 2021 · 7 comments · Fixed by #1499
Labels: help wanted, kind/bug

Comments

@Drugoy

Drugoy commented Mar 15, 2021

What happened: ksm doesn't report any metrics at all if just one of the specified namespaces is not accessible.

What you expected to happen: ksm should return metrics for the k8s objects in the other namespaces (those it has access to).

How to reproduce it (as minimally and precisely as possible): pass the arg '--namespace=project1,project2', but grant ksm's ServiceAccount 'view' rights only in project1.
ksm will produce NO metrics because its ServiceAccount lacks 'view' rights in project2, although it could return metrics from project1.

Anything else we need to know?: no

Environment:

  • kube-state-metrics version: 1.9.0
  • Kubernetes version (use kubectl version): 1.18.3+002a51f
  • Cloud provider or hardware configuration: OpenShift (on premises)
  • Other info: none.
Drugoy added the kind/bug label Mar 15, 2021
@lilic
Member

lilic commented Mar 16, 2021

Hello 👋 Is this the kube-state-metrics that ships with OpenShift, or are you deploying one manually? If manually, can you share the deployment manifests and which ones you used? FYI, the manifests in this repo are just examples, so you have to customize them if you need something specific.

@Drugoy
Author

Drugoy commented Mar 16, 2021

Hi, Lili!
Deployed manually using the image quay.io/coreos/kube-state-metrics:v1.9.8 (the link to it was taken from here).

The manifests are really irrelevant here, because they don't even specify a custom ServiceAccount, so the namespace's default ServiceAccount ('default') is used.

The only part that may be relevant here is spec.containers[0]:

- image: ...
  args:
    - "--collectors=pods"
    - "--namespace=project1,project2"
    - "--metric-whitelist=kube_pod_start_time,kube_pod_created,kube_pod_status_ready,kube_pod_container_status_running,kube_pod_container_status_restarts_total"

I deploy it to namespaceA.
I then bind the 'view' role in namespace project1 to the ServiceAccount 'default' in namespace namespaceA.
Then the bug occurs.
If I then create a similar RoleBinding for project2, ksm starts returning metrics for both namespaces.
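
The behaviour reported above is what an all-or-nothing aggregation over namespaces would produce. The following Go program is only a minimal model of that pattern, not kube-state-metrics code: listPods, listAllFailFast, and the allowed map (standing in for the RoleBindings) are hypothetical names invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// errForbidden stands in for the API server rejecting a list request.
var errForbidden = errors.New("forbidden")

// listPods is a hypothetical stand-in for listing pods in one namespace.
// allowed models which namespaces have a 'view' binding for the ServiceAccount.
func listPods(ns string, allowed map[string]bool) ([]string, error) {
	if !allowed[ns] {
		return nil, fmt.Errorf("list pods in %q: %w", ns, errForbidden)
	}
	return []string{ns + "/pod-a", ns + "/pod-b"}, nil
}

// listAllFailFast aggregates across namespaces and aborts on the first error,
// discarding everything collected so far.
func listAllFailFast(namespaces []string, allowed map[string]bool) ([]string, error) {
	var pods []string
	for _, ns := range namespaces {
		p, err := listPods(ns, allowed)
		if err != nil {
			return nil, err
		}
		pods = append(pods, p...)
	}
	return pods, nil
}

func main() {
	namespaces := []string{"project1", "project2"}

	// Only project1 is bound: the whole call fails and no pods are returned.
	pods, err := listAllFailFast(namespaces, map[string]bool{"project1": true})
	fmt.Println(len(pods), err)

	// With project2 bound as well, pods from both namespaces are returned.
	pods, err = listAllFailFast(namespaces, map[string]bool{"project1": true, "project2": true})
	fmt.Println(len(pods), err)
}
```

In this model, one missing binding hides the pods of every namespace, which matches the symptom: no metrics until the second RoleBinding exists.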

@lilic
Member

lilic commented Mar 17, 2021

Yes, this seems like a bug in the multiListerWatcher. Do you want to take up fixing it?

@Drugoy
Author

Drugoy commented Mar 17, 2021

Sorry, I'm not fluent enough in Go to fix it.

lilic added the help wanted label Mar 17, 2021
@lilic
Member

lilic commented Mar 17, 2021

No worries! I added the help wanted label, as we have had folks interested in contributing more; otherwise I will have a look if no one else picks it up. Thanks for reporting!

@brancz it would be great to replace the custom listwatch with something else, maybe by importing it from Prometheus Operator. What do you think?

fpetkovski added a commit to fpetkovski/kube-state-metrics that referenced this issue Jun 7, 2021
The multiListerWatcher is a composite object encapsulating multiple
ListerWatchers and implements the ListerWatcher interface.
When calling the List method on the multiListerWatcher, if an individual
Lister call fails, the outcome is treated as an error and the entire
call fails. This leads to KSM not exporting any metrics when it does not
have the necessary permissions for resources in one or more namespaces.

This commit modifies the multiListerWatcher List function to log errors
from individual ListerWatchers and continue with execution. As a result,
when KSM does not have permissions to list resources from
a namespace, it will still export metrics from namespaces it has
permissions to.

Fixes kubernetes#1413

Signed-off-by: fpetkovski <filip.petkovsky@gmail.com>
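
The log-and-continue approach described in this commit message can be sketched in Go as follows. This is a simplified illustration under stated assumptions, not the code from the commit: the lister interface, staticLister, and multiLister are hypothetical stand-ins, and the real client-go ListerWatcher interface has different method signatures.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// errForbidden stands in for a permissions error from one namespace.
var errForbidden = errors.New("forbidden")

// lister is a simplified, hypothetical stand-in for a per-namespace lister.
type lister interface {
	List() ([]string, error)
}

// staticLister returns fixed items or a fixed error.
type staticLister struct {
	items []string
	err   error
}

func (s staticLister) List() ([]string, error) { return s.items, s.err }

// multiLister composes several listers and itself satisfies lister.
type multiLister []lister

// List merges the results of every lister. A failing lister is logged and
// skipped instead of failing the whole call.
func (m multiLister) List() ([]string, error) {
	var all []string
	for i, l := range m {
		items, err := l.List()
		if err != nil {
			log.Printf("lister %d failed, skipping: %v", i, err)
			continue
		}
		all = append(all, items...)
	}
	return all, nil
}

func main() {
	m := multiLister{
		staticLister{items: []string{"project1/pod-a"}},
		staticLister{err: errForbidden}, // e.g. no permissions in project2
	}
	items, err := m.List()
	fmt.Println(items, err)
}
```

One trade-off of this sketch is that it returns a nil error even when every lister fails, so the log output becomes the only signal of missing permissions.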
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

k8s-ci-robot added the lifecycle/stale label Jun 15, 2021
@liangyuanpeng
Contributor

I would like to look at this problem
/remove-lifecycle stale

k8s-ci-robot removed the lifecycle/stale label Jun 28, 2021