
Multiple targets with ServiceMonitor #336

Closed
sc250024 opened this issue Sep 21, 2018 · 2 comments


sc250024 commented Sep 21, 2018

I'm not sure if this is a "bug", but since I'm probably not the only one to run into this situation, I thought it was worth raising. I'm monitoring Kubernetes services using the Prometheus operator pattern, and am now trying to monitor around 200 mobile routers using the snmp_exporter. I think the issue is related to labeling.

In summary, I'm having the following issues:

  • Multiple targets are indistinguishable in the Prometheus metrics.
  • The snmp_exporter process is only scraping one of the targets.

I have configured the snmp_exporter targets on two test routers using the following ServiceMonitor resource:

servicemonitor-teltonika.yaml

---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    prometheus: kube-prometheus
  name: snmp-exporter
spec:
  jobLabel: snmp-exporter
  selector:
    app: snmp-exporter
  namespaceSelector:
    matchNames:
    - monitoring
  endpoints:
  - interval: 240s
    port: http-metrics
    params:
      module:
      - if_mib_ifdescr
      target:
      - 10.161.XX.YY
      - 10.161.XX.YZ
    path: "/snmp"
    scrapeTimeout: 30s
    targetPort: 9116

This results in the following "successful" configuration in Prometheus, generated by the Prometheus operator:

prometheus-config.yaml

- job_name: monitoring/snmp-exporter/0
  params:
    module:
    - if_mib_ifdescr
    target:
    - 10.161.XX.YY
    - 10.161.XX.YZ
  scrape_interval: 4m
  scrape_timeout: 30s
  metrics_path: /snmp
  scheme: http
  kubernetes_sd_configs:
  - role: endpoints
    namespaces:
      names:
      - monitoring
  relabel_configs:
  - source_labels: [__meta_kubernetes_endpoint_port_name]
    separator: ;
    regex: http-metrics
    replacement: $1
    action: keep
  - source_labels: [__meta_kubernetes_namespace]
    separator: ;
    regex: (.*)
    target_label: namespace
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_pod_name]
    separator: ;
    regex: (.*)
    target_label: pod
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: service
    replacement: $1
    action: replace
  - source_labels: [__meta_kubernetes_service_name]
    separator: ;
    regex: (.*)
    target_label: job
    replacement: ${1}
    action: replace
  - source_labels: [__meta_kubernetes_service_label_snmp_exporter]
    separator: ;
    regex: (.+)
    target_label: job
    replacement: ${1}
    action: replace
  - separator: ;
    regex: (.*)
    target_label: endpoint
    replacement: http-metrics
    action: replace

However, the resulting data in a metric like ifHCInOctets groups everything together, since there is no label to distinguish between the two targets: while the ifDescr values differ per interface, metrics from the two routers would carry identical label sets and get lumped into the same series. See the "What did you see instead?" section for details.

Host operating system: output of uname -a

The snmp_exporter is running on Kubernetes v1.10.8, on Debian 9 nodes.

Linux 4.9.0-8-amd64 #1 SMP Debian 4.9.110-3+deb9u4 (2018-08-21) x86_64 GNU/Linux

snmp_exporter version: output of snmp_exporter -version

snmp_exporter, version 0.13.0 (branch: HEAD, revision: 84cab6d72f4c70e6239e1efa4f7ea9cba2b7acc8)
  build user:       root@b26f44b735fe
  build date:       20180912-11:01:50
  go version:       go1.10.3

What device/snmpwalk OID are you using?

I'm using the pre-defined module if_mib_ifdescr.

If this is a new device, please link to the MIB(s).

N/A

What did you do that produced an error?

Technically the ServiceMonitor configuration "works", but there's no way to distinguish between multiple targets.

What did you expect to see?

Since I specified multiple targets, I would expect to see another set of metrics, perhaps with an additional target label to distinguish them, like so:

ifHCOutOctets{[...],target="10.161.XX.YY"} 1782842
ifHCOutOctets{[...],target="10.161.XX.YY"} 2266513
ifHCOutOctets{[...],target="10.161.XX.YY"} 0
ifHCOutOctets{[...],target="10.161.XX.YY"} 0
ifHCOutOctets{[...],target="10.161.XX.YY"} 0
ifHCOutOctets{[...],target="10.161.XX.YY"} 0
ifHCOutOctets{[...],target="10.161.XX.YY"} 0
ifHCOutOctets{[...],target="10.161.XX.YY"} 0
ifHCOutOctets{[...],target="10.161.XX.YY"} 16923
ifHCOutOctets{[...],target="10.161.XX.YY"} 419792
ifHCOutOctets{[...],target="10.161.XX.YY"} 10170457

ifHCOutOctets{[...],target="10.161.XX.YZ"} 1682242
ifHCOutOctets{[...],target="10.161.XX.YZ"} 1266593
ifHCOutOctets{[...],target="10.161.XX.YZ"} 0
ifHCOutOctets{[...],target="10.161.XX.YZ"} 0
ifHCOutOctets{[...],target="10.161.XX.YZ"} 0
ifHCOutOctets{[...],target="10.161.XX.YZ"} 0
ifHCOutOctets{[...],target="10.161.XX.YZ"} 0
ifHCOutOctets{[...],target="10.161.XX.YZ"} 0
ifHCOutOctets{[...],target="10.161.XX.YZ"} 17843
ifHCOutOctets{[...],target="10.161.XX.YZ"} 374001
ifHCOutOctets{[...],target="10.161.XX.YZ"} 11053154

What did you see instead?

It seems the metrics of the two test devices are lumped into one group:

ifHCOutOctets{endpoint="http-metrics",ifDescr="br-lan",instance="172.27.135.96:9116",job="snmp-exporter",namespace="monitoring",pod="snmp-exporter-79f7877df5-lpcjx",service="snmp-exporter"}  1782842
ifHCOutOctets{endpoint="http-metrics",ifDescr="eth0",instance="172.27.135.96:9116",job="snmp-exporter",namespace="monitoring",pod="snmp-exporter-79f7877df5-lpcjx",service="snmp-exporter"}    2266513
ifHCOutOctets{endpoint="http-metrics",ifDescr="eth1",instance="172.27.135.96:9116",job="snmp-exporter",namespace="monitoring",pod="snmp-exporter-79f7877df5-lpcjx",service="snmp-exporter"}    0
ifHCOutOctets{endpoint="http-metrics",ifDescr="gre0",instance="172.27.135.96:9116",job="snmp-exporter",namespace="monitoring",pod="snmp-exporter-79f7877df5-lpcjx",service="snmp-exporter"}    0
ifHCOutOctets{endpoint="http-metrics",ifDescr="gretap0",instance="172.27.135.96:9116",job="snmp-exporter",namespace="monitoring",pod="snmp-exporter-79f7877df5-lpcjx",service="snmp-exporter"} 0
ifHCOutOctets{endpoint="http-metrics",ifDescr="ifb0",instance="172.27.135.96:9116",job="snmp-exporter",namespace="monitoring",pod="snmp-exporter-79f7877df5-lpcjx",service="snmp-exporter"}    0
ifHCOutOctets{endpoint="http-metrics",ifDescr="ifb1",instance="172.27.135.96:9116",job="snmp-exporter",namespace="monitoring",pod="snmp-exporter-79f7877df5-lpcjx",service="snmp-exporter"}    0
ifHCOutOctets{endpoint="http-metrics",ifDescr="ip6tnl0",instance="172.27.135.96:9116",job="snmp-exporter",namespace="monitoring",pod="snmp-exporter-79f7877df5-lpcjx",service="snmp-exporter"} 0
ifHCOutOctets{endpoint="http-metrics",ifDescr="lo",instance="172.27.135.96:9116",job="snmp-exporter",namespace="monitoring",pod="snmp-exporter-79f7877df5-lpcjx",service="snmp-exporter"}      16923
ifHCOutOctets{endpoint="http-metrics",ifDescr="wlan0",instance="172.27.135.96:9116",job="snmp-exporter",namespace="monitoring",pod="snmp-exporter-79f7877df5-lpcjx",service="snmp-exporter"}   419792
ifHCOutOctets{endpoint="http-metrics",ifDescr="wwan0",instance="172.27.135.96:9116",job="snmp-exporter",namespace="monitoring",pod="snmp-exporter-79f7877df5-lpcjx",service="snmp-exporter"}   10170457

Logs

Here are the logs showing that, despite multiple targets being configured, only one of them is actually scraped:

monitoring snmp-exporter-79f7877df5-8tdxn snmp-exporter time="2018-09-21T21:21:48Z" level=debug msg="Scraping target '10.161.XX.YY' with module 'if_mib_ifdescr'" source="main.go:86"
monitoring snmp-exporter-79f7877df5-8tdxn snmp-exporter time="2018-09-21T21:21:48Z" level=debug msg="Getting 1 OIDs from target \"10.161.XX.YY\"" source="collector.go:103"
monitoring snmp-exporter-79f7877df5-8tdxn snmp-exporter time="2018-09-21T21:21:49Z" level=debug msg="Get of 1 OIDs completed in 809.801612ms" source="collector.go:109"
monitoring snmp-exporter-79f7877df5-8tdxn snmp-exporter time="2018-09-21T21:21:49Z" level=debug msg="Walking target \"10.161.XX.YY\" subtree \"1.3.6.1.2.1.2\"" source="collector.go:133"
monitoring snmp-exporter-79f7877df5-8tdxn snmp-exporter time="2018-09-21T21:21:50Z" level=debug msg="Walk of target \"10.161.XX.YY\" subtree \"1.3.6.1.2.1.2\" completed in 1.097932233s" source="collector.go:143"
monitoring snmp-exporter-79f7877df5-8tdxn snmp-exporter time="2018-09-21T21:21:50Z" level=debug msg="Walking target \"10.161.XX.YY\" subtree \"1.3.6.1.2.1.31.1.1\"" source="collector.go:133"
monitoring snmp-exporter-79f7877df5-8tdxn snmp-exporter time="2018-09-21T21:21:51Z" level=debug msg="Walk of target \"10.161.XX.YY\" subtree \"1.3.6.1.2.1.31.1.1\" completed in 919.302653ms" source="collector.go:143"
monitoring snmp-exporter-79f7877df5-8tdxn snmp-exporter time="2018-09-21T21:21:51Z" level=debug msg="Scrape of target '10.161.XX.YY' with module 'if_mib_ifdescr' took 2.833386 seconds" source="main.go:97"

Possible Fix

I think the issue is related to this blog post: https://www.robustperception.io/controlling-the-instance-label. In my configuration above, the instance label is always 172.27.135.96:9116, which is the IP of the snmp_exporter pod. Is it advised to change this to the target's IP address? And if so, how can that be done within a ServiceMonitor definition?

SuperQ (Member) commented Sep 21, 2018

It looks like your Prometheus config is not specifying the target list correctly.

You're trying to pass a list of targets via params, which is not going to work: both addresses end up in a single scrape request as target=10.161.XX.YY&target=10.161.XX.YZ. The snmp_exporter only knows how to scrape one target at a time, so it will simply pick one of the targets listed there.
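For illustration, each scrape then boils down to a single request of roughly this shape (the exporter address here is a placeholder):

  GET http://snmp-exporter.monitoring.svc:9116/snmp?module=if_mib_ifdescr&target=10.161.XX.YY&target=10.161.XX.YZ

One URL, one scrape, one set of time series, no matter how many target values are packed into the query string.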

The target list has to be fed to Prometheus as scrape targets; Prometheus then uses relabel configs to pass the correct target= param to the exporter on each scrape.

I think what you want is this:

  - job_name: monitoring/snmp-exporter/0
    ...
    static_configs:
    - targets:
      - 10.161.XX.YY
      - 10.161.XX.YZ
    relabel_configs:
      - source_labels: [__address__]
        target_label: __param_target
      - source_labels: [__param_target]
        target_label: instance
      - target_label: __address__
        replacement: snmp-exporter.svc:9116 
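With that config, Prometheus makes one request per router, which you can reproduce by hand (the service DNS name is the placeholder from the snippet above):

  curl 'http://snmp-exporter.svc:9116/snmp?module=if_mib_ifdescr&target=10.161.XX.YY'

The relabeling also sets instance to the router's address instead of the exporter pod's, which answers the instance-label question from the original post.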

sc250024 (Author) commented:

@SuperQ Thanks for the feedback! It looks like I should ask about this in the Prometheus Operator repo instead. I know the ServiceMonitor resource has certain restrictions on relabeling, and I'm not sure whether they apply here. Thank you!
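For reference, one way to feed a static_configs job like the one above to an operator-managed Prometheus is the Prometheus custom resource's additionalScrapeConfigs field, which injects raw scrape configuration from a Secret. A minimal sketch, assuming a Secret named additional-scrape-configs with a key snmp.yaml (both names are placeholders):

  # Fragment of the Prometheus custom resource
  apiVersion: monitoring.coreos.com/v1
  kind: Prometheus
  spec:
    additionalScrapeConfigs:
      name: additional-scrape-configs  # placeholder Secret name
      key: snmp.yaml                   # placeholder key in that Secret

  # Contents of the snmp.yaml key: the static job from the comment above
  - job_name: snmp
    metrics_path: /snmp
    params:
      module: [if_mib_ifdescr]
    static_configs:
    - targets:
      - 10.161.XX.YY
      - 10.161.XX.YZ
    relabel_configs:
    - source_labels: [__address__]
      target_label: __param_target
    - source_labels: [__param_target]
      target_label: instance
    - target_label: __address__
      replacement: snmp-exporter.svc:9116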
