inputs.prometheus still requires cluster level permissions when scoped to a namespace #13876

Closed
n0coast opened this issue Sep 7, 2023 · 9 comments · Fixed by #14871
Assignees
Labels
bug unexpected problem or unintended behavior

Comments

@n0coast

n0coast commented Sep 7, 2023

Relevant telegraf.conf

[[inputs.prometheus]]
      kubernetes_label_selector = "app.kubernetes.io/name=app,app.kubernetes.io/component=web"
      monitor_kubernetes_pods = true
      monitor_kubernetes_pods_method = "settings"
      monitor_kubernetes_pods_namespace = "some-namespace"
      monitor_kubernetes_pods_port = 8000
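
For reference, the label selector above only matches pods carrying both labels; an illustrative (hypothetical) pod metadata fragment that would be discovered:

apiVersion: v1
kind: Pod
metadata:
  name: app-web-0                  # hypothetical name, not from this report
  namespace: some-namespace
  labels:
    app.kubernetes.io/name: app
    app.kubernetes.io/component: web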

Logs from Telegraf

gha-telegraf-5fd694d667-c9xz9 telegraf W0907 00:08:07.576872       1 reflector.go:533] pkg/mod/k8s.io/client-go@v0.27.2/tools/cache/reflector.go:231: failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:some-namepsace:gha-telegraf" cannot list resource "namespaces" in API group "" at the cluster scope
gha-telegraf-5fd694d667-c9xz9 telegraf E0907 00:08:07.576920       1 reflector.go:148] pkg/mod/k8s.io/client-go@v0.27.2/tools/cache/reflector.go:231: Failed to watch *v1.Namespace: failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:some-namespace:gha-telegraf" cannot list resource "namespaces" in API group "" at the cluster scope

System info

Telegraf 1.27.4, container: docker.io/library/telegraf:1.27-alpine, kubernetes v1.26

Docker

No response

Steps to reproduce

Add the following chart dependency:

   - name: telegraf
     version: 1.8.33
     repository: https://helm.influxdata.com/
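
In the parent chart's values.yaml, the telegraf.conf above maps to the sub-chart's configuration; a sketch, assuming the dependency is referenced under a telegraf key and the chart exposes its inputs under config.inputs (the same structure used in a later comment in this thread):

# Sketch only; key names assume the telegraf sub-chart's config.inputs values layout.
telegraf:
  config:
    inputs:
      - prometheus:
          kubernetes_label_selector: "app.kubernetes.io/name=app,app.kubernetes.io/component=web"
          monitor_kubernetes_pods: true
          monitor_kubernetes_pods_method: "settings"
          monitor_kubernetes_pods_namespace: "some-namespace"
          monitor_kubernetes_pods_port: 8000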

Configure telegraf role as follows:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  annotations:
    ...
  labels:
    app.kubernetes.io/instance: gha
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: telegraf
    helm.sh/chart: telegraf-1.8.27
  name: gha-telegraf
  namespace: some-namespace
  ...
rules:
- apiGroups:
  - ""
  resources:
  - namespaces
  - pods
  - services
  verbs:
  - get
  - list
  - watch

Install the Helm chart with the telegraf dependency included, with inputs.prometheus configured and scoped to the namespace that the application and Telegraf run in.

Expected behavior

Metrics are scraped without issue in the configured namespace.

Actual behavior

Telegraf attempts to list namespaces at the cluster level, fails, and complains loudly in the logs.

Additional info

Similar to #12780, but we're using a more recent version of Telegraf, which should already include this fix: #13063
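
If the installed version does insist on cluster-wide namespace reads, the narrowest stopgap implied by the error above is a read-only ClusterRole limited to namespaces, bound to the same ServiceAccount. A sketch (only the ServiceAccount name and namespace come from the log; the manifest itself is an assumption, not a confirmed fix):

# Minimal cluster-scope grant for namespace get/list/watch only.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: gha-telegraf-namespaces    # hypothetical name
rules:
  - apiGroups: [""]
    resources: ["namespaces"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: gha-telegraf-namespaces    # hypothetical name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: gha-telegraf-namespaces
subjects:
  - kind: ServiceAccount
    name: gha-telegraf
    namespace: some-namespace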

@n0coast n0coast added the bug unexpected problem or unintended behavior label Sep 7, 2023
@powersj
Contributor

powersj commented Sep 11, 2023

@Ivaylogi98 or @redbaron, are either of you able to run Telegraf without cluster-level permissions? I know you confirmed it in the issue and PR, but @n0coast does not seem to be able to.

@powersj powersj added the waiting for response waiting for response from contributor label Sep 11, 2023
@redbaron
Contributor

@n0coast, do you have any other instances of the inputs.prometheus plugin in your Telegraf config?

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Sep 12, 2023
@redbaron
Contributor

The fix you are looking for is #13627, which I think is in 1.28 but wasn't backported to 1.27.

@powersj powersj added the waiting for response waiting for response from contributor label Sep 12, 2023
@redbaron
Contributor

@n0coast, looking at the error more closely, I think it comes from the code path that lists all namespaces. That listing exists to support namespace_annotation_pass, but it runs even when that option is not specified, as in your config.

This can be improved, I agree.

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Sep 12, 2023
@Thrinadh-Kumpatla

Thrinadh-Kumpatla commented Jan 3, 2024

Still facing the same issue with telegraf:1.29.

Config:

  inputs:
    - prometheus:
        monitor_kubernetes_pods: true
        monitor_kubernetes_pods_method: "annotations"
        monitor_kubernetes_pods_namespace: "thrinadh"
        namepass:
          - "badger_db_size"
        tagdrop:
          type:
            - "total"

failed to list *v1.Namespace: namespaces is forbidden: User "system:serviceaccount:thrinadh:thrinadh-test-telegraf" cannot list resource "namespaces" in API group "" at the cluster scope
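
For the "annotations" discovery method used here, pods opt in via the plugin's documented prometheus.io annotations; an illustrative pod metadata fragment (the pod name and port are assumptions, not taken from this report):

apiVersion: v1
kind: Pod
metadata:
  name: badger-exporter            # hypothetical
  namespace: thrinadh
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"     # assumed port
    prometheus.io/path: "/metrics"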

@valeraBr

Hello,

I'm facing the same issue with 1.29.
No matter what I try, Telegraf still fails because of the cluster-scope permissions. Is there any workaround for this?

@srebhan
Contributor

srebhan commented Feb 21, 2024

@n0coast, @Thrinadh-Kumpatla and @valeraBr please test the binary in PR #14871, available once CI finishes all tests. Let me know if this fixes the issue!

@srebhan srebhan self-assigned this Feb 21, 2024
@srebhan
Contributor

srebhan commented Mar 5, 2024

Can someone please test the PR!?!?

@srebhan srebhan added the waiting for response waiting for response from contributor label Mar 5, 2024
@valeraBr

valeraBr commented Mar 6, 2024

@srebhan I've tested the PR; it works with the provided binaries. No more cluster-scope errors. Thanks.

@telegraf-tiger telegraf-tiger bot removed the waiting for response waiting for response from contributor label Mar 6, 2024