
Possible netlink leak on 3.29.1 #9603

Open

imbstack opened this issue Dec 14, 2024 · 0 comments
We recently updated calico to 3.29.1 on one of our staging clusters and found that after a few hours there was a clear upward trend in the number of file descriptors held by calico-node pods.

[screenshot: graph of calico-node pod file descriptor counts trending steadily upward]

Checking on a running instance after a couple of days, we found that the calico-node -felix process had nearly 6000 file descriptors according to lsof, almost all of which looked like the following:

# lsof -p 1517383 | tail
calico-no 1517383 root 5778u  netlink                 0t0  689245146 ROUTE
calico-no 1517383 root 5779u  netlink                 0t0  688913733 ROUTE
calico-no 1517383 root 5780u  netlink                 0t0  689385180 ROUTE
calico-no 1517383 root 5781u  netlink                 0t0  689391746 ROUTE
calico-no 1517383 root 5782u  netlink                 0t0  689402663 ROUTE
calico-no 1517383 root 5783u  netlink                 0t0  689407738 ROUTE
calico-no 1517383 root 5784u  netlink                 0t0  689296536 ROUTE
calico-no 1517383 root 5785u  netlink                 0t0  689301559 ROUTE
calico-no 1517383 root 5790u  netlink                 0t0  689395559 ROUTE
calico-no 1517383 root 5791u  netlink                 0t0  689400833 ROUTE
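For reference, this is roughly how I have been sampling the count (a sketch, not exactly what our monitoring runs; it assumes the felix process matches `calico-node -felix` and that `netlink` appears in lsof's TYPE column, as in the output above):

# Sample the felix process's netlink fd count once a minute;
# on a healthy node this should stay flat
PID=$(pgrep -f 'calico-node -felix')
while true; do
  printf '%s %s\n' "$(date -Is)" "$(lsof -p "$PID" 2>/dev/null | awk '$5 == "netlink"' | wc -l)"
  sleep 60
done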

Deleting that pod dropped the fds, although the new pod is starting the trend all over again.

Let me know if there is any other debugging data I can provide.
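In case it's useful, here is how I cross-checked the count without lsof, by matching the process's socket inodes against the kernel's netlink table (a sketch; it assumes the /proc/net/netlink layout where the second field is the protocol, 0 being NETLINK_ROUTE, and the tenth field is the inode):

# Netlink ROUTE socket inodes held by the felix process
ls -l /proc/"$PID"/fd 2>/dev/null | grep -o 'socket:\[[0-9]*\]' | grep -o '[0-9]*' | sort > /tmp/felix-inodes
awk 'NR > 1 && $2 == 0 { print $10 }' /proc/net/netlink | sort > /tmp/route-inodes
comm -12 /tmp/felix-inodes /tmp/route-inodes | wc -l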

Expected Behavior

A relatively steady state of file descriptors for a calico-node pod.

Current Behavior

A steady increase in open file descriptors.

Possible Solution

Steps to Reproduce (for bugs)

  1. Deploy Calico 3.29.1; as far as I can tell, no special configuration is needed.

Context

This is ok for now in our staging environment, but we are worried about going to production this way. It is entirely possible this is due to some weird config on our side, but nothing is jumping out at me so far.

Your Environment

  • Calico version: 3.29.1
  • Calico dataplane (iptables, windows etc.): iptables
  • Orchestrator version (e.g. kubernetes, mesos, rkt): kubernetes
  • Operating System and version: Linux ip-10-213-23-129 6.8.0-1018-aws #19~22.04.1-Ubuntu SMP Wed Oct 9 16:48:22 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
  • Link to your project (optional):