snmpd: ignore Docker network interfaces #170

Merged
merged 1 commit into from
Aug 30, 2024

Conversation

DasSkelett
Member

Problem

When setting up ft04 I noticed that Influx doesn't show stats for the ft04_v* interfaces on the vm-host.
In the log I then found:

Aug 16 11:11:02 metrics influxd-systemd-start.sh[579]: ts=2024-08-16T09:11:02.736381Z lvl=warn msg="max-values-per-tag limit may be exceeded soon" log_id=0qAP5cnl000 service=store perc=100% n=100000 max=100000 db_instance=librenms measurement=ports tag=ifName
Aug 16 11:11:02 metrics influxd-systemd-start.sh[579]: ts=2024-08-16T09:11:02.738841Z lvl=warn msg="max-values-per-tag limit may be exceeded soon" log_id=0qAP5cnl000 service=store perc=80% n=80783 max=100000 db_instance=ffmuc_other measurement=net tag=interface
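To see quickly which measurement and tag are close to the limit, the key=value fields of such a warning can be pulled apart. This is only a minimal sketch; the helper name `parse_logfmt` is made up for illustration and it assumes the fields follow the simple `key=value` / `key="quoted value"` shape seen in the two lines above.

```python
import shlex

# One of the warnings quoted above, used as sample input.
SAMPLE = (
    'Aug 16 11:11:02 metrics influxd-systemd-start.sh[579]: '
    'ts=2024-08-16T09:11:02.738841Z lvl=warn '
    'msg="max-values-per-tag limit may be exceeded soon" '
    'log_id=0qAP5cnl000 service=store perc=80% n=80783 max=100000 '
    'db_instance=ffmuc_other measurement=net tag=interface'
)


def parse_logfmt(line: str) -> dict[str, str]:
    """Collect key=value pairs from a log line (hypothetical helper).

    Tokens without '=' (the syslog prefix) are ignored. shlex keeps a
    quoted value such as msg="..." together as a single token.
    """
    fields: dict[str, str] = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:
            fields[key] = value
    return fields


fields = parse_logfmt(SAMPLE)
# Fraction of the max-values-per-tag limit already used by this tag.
usage = int(fields["n"]) / int(fields["max"])
```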

This is caused by Docker creating new veth interfaces with different names whenever containers are recreated, and new br- interfaces whenever networks are recreated. The number of distinct veth interface names (i.e. the cardinality of the Influx measurement) grows especially large over time: they make up around 95% of the interfaces in Influx (and LibreNMS).
As a first countermeasure I removed all series from the ports measurement where the ifName starts with veth and the data is from before 2024-06-30, which lowered the cardinality from just below the 100k limit to ~8k (SHOW TAG VALUES CARDINALITY WITH KEY = "ifName").
A few minutes afterwards the two ft04_v* interfaces started to show up in Influx.
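The exact statement used for the cleanup isn't recorded here. As a rough sketch, assuming InfluxDB 1.x and InfluxQL, statements along these lines could express it; the helper name `veth_cleanup_statements` and its parameters are invented for illustration, and whether a DELETE actually lowers the reported cardinality may depend on the InfluxDB version and index type, so check on a copy before running it against production data.

```python
def veth_cleanup_statements(
    measurement: str = "ports",
    tag: str = "ifName",
    prefix: str = "veth",
    before: str = "2024-06-30",
) -> tuple[str, str]:
    """Build InfluxQL strings for the cleanup described above (hypothetical helper).

    Returns a DELETE statement limited to tag values starting with
    `prefix` and points older than `before`, plus the cardinality query
    quoted in the description. Nothing is sent to a database here.
    """
    delete = (
        f'DELETE FROM "{measurement}" '
        f'WHERE "{tag}" =~ /^{prefix}/ AND time < \'{before}\''
    )
    cardinality = f'SHOW TAG VALUES CARDINALITY WITH KEY = "{tag}"'
    return delete, cardinality


delete_stmt, cardinality_stmt = veth_cleanup_statements()
```

The strings can then be pasted into the `influx` shell for the librenms database, running the cardinality query before and after to compare.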

Solution

Let's exclude the Docker br- and veth interfaces from SNMP measurements, so that their data is no longer sent to LibreNMS and, in turn, to Influx.
I chose this over increasing the max-values-per-tag limit because I think we don't have much use for historical Docker interface stats (especially as you can't really match them to the respective container anymore), and InfluxDB (and LibreNMS) should also be very happy about the much reduced load and database size.

The filter for the br- interfaces checks whether their MAC address starts with 02:42, which appears to be the prefix that Docker uses, so as not to match our own bridges on the gateways.
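To illustrate the matching rule only (this is not the snmpd configuration from the commit), the decision can be sketched in Python as follows. The function name and the example interface names and MAC addresses in the usage lines are made up.

```python
# MAC prefix that Docker appears to use for its bridges, per the description above.
DOCKER_MAC_PREFIX = "02:42"


def is_docker_interface(name: str, mac: str) -> bool:
    """Return True if the interface should be excluded (illustrative only).

    veth* interfaces are matched by name alone. br-* interfaces are only
    matched when their MAC address starts with the Docker prefix, so that
    other bridges named br-* are kept.
    """
    if name.startswith("veth"):
        return True
    if name.startswith("br-"):
        return mac.lower().startswith(DOCKER_MAC_PREFIX)
    return False


# Hypothetical examples: a Docker veth, a Docker bridge, a non-Docker bridge.
examples = [
    ("veth1a2b3c4", "9e:5b:1c:00:00:01"),
    ("br-0123456789ab", "02:42:AC:11:00:01"),
    ("br-example", "52:54:00:12:34:56"),
]
excluded = [name for name, mac in examples if is_docker_interface(name, mac)]
```

Checking the MAC prefix rather than the name alone is what keeps a non-Docker bridge like the last example in the measurements.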

 

The second log line, regarding db_instance=ffmuc_other measurement=net, comes from Telegraf collecting network stats on the VMs directly. Filtering there (if possible) is still TODO and will go into a separate PR.

@DasSkelett DasSkelett added the bug Something isn't working label Aug 18, 2024
@DasSkelett DasSkelett requested a review from a team as a code owner August 18, 2024 17:05
@DasSkelett DasSkelett merged commit 8cc57b1 into freifunkMUC:main Aug 30, 2024
4 checks passed
@DasSkelett DasSkelett deleted the snmpd-ignore-docker-nics branch August 30, 2024 18:47
2 participants