Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Frequent syscall event drops #1382

Closed
Yuriy6735 opened this issue Sep 8, 2020 · 11 comments
Closed

Frequent syscall event drops #1382

Yuriy6735 opened this issue Sep 8, 2020 · 11 comments
Assignees

Comments

@Yuriy6735
Copy link

Falco generates repeated alerts:
Critical Falco internal: syscall event drop. 1 system calls dropped in last second. (ebpf_enabled=0 n_drops=1 n_drops_buffer=1 n_drops_bug=0 n_drops_pf=0 n_evts=19802)

How to reproduce it

Run Falco 0.25 with EKS 1.15 and centos7

Expected behaviour

No syscall event drop errors

Environment

  • Falco version:
    Falco version: 0.25.0 Driver version: ae104eb20ff0198a5dcb0c91cc36c86e7c3f25c7
  • System info:
"machine": "x86_64",
 "release": "3.10.0-1127.19.1.el7.x86_64",
 "sysname": "Linux",
 "version": "#1 SMP Tue Aug 25 17:23:54 UTC 2020"
  • Cloud provider or hardware configuration: AWS (EKS)
  • OS: CentOS7
  • Kernel: 3.10.0-1127.19.1.el7.x86_64
  • Installation method: Kubernetes (Helm chart version: 1.3.0, appVersion: 0.25.0)

Additional context

Tue Sep  8 12:30:55 2020: Falco internal: syscall event drop. 2712 system calls dropped in last second.
Tue Sep  8 12:35:11 2020: Falco internal: syscall event drop. 1 system calls dropped in last second.
12:35:10.754728829: Critical Falco internal: syscall event drop. 1 system calls dropped in last second. (ebpf_enabled=0 n_drops=1 n_drops_buffer=1 n_drops_bug=0 n_drops_pf=0 n_evts=35735)
Tue Sep  8 12:37:01 2020: Falco internal: syscall event drop. 3 system calls dropped in last second.
12:37:01.780775557: Critical Falco internal: syscall event drop. 3 system calls dropped in last second. (ebpf_enabled=0 n_drops=3 n_drops_buffer=3 n_drops_bug=0 n_drops_pf=0 n_evts=22944)
Tue Sep  8 12:41:46 2020: Falco internal: syscall event drop. 1 system calls dropped in last second.
12:41:45.856167911: Critical Falco internal: syscall event drop. 1 system calls dropped in last second. (ebpf_enabled=0 n_drops=1 n_drops_buffer=1 n_drops_bug=0 n_drops_pf=0 n_evts=19802)
Tue Sep  8 12:42:45 2020: Falco internal: syscall event drop. 2 system calls dropped in last second.
12:42:45.872042597: Critical Falco internal: syscall event drop. 2 system calls dropped in last second. (ebpf_enabled=0 n_drops=2 n_drops_buffer=2 n_drops_bug=0 n_drops_pf=0 n_evts=15223)
@Fsero
Copy link

Fsero commented Sep 14, 2020

I'm also experiencing this:

Environment

  • Falco version: Falco version 0.25.0 (driver version ae104eb20ff0198a5dcb0c91cc36c86e7c3f25c7)

  • system info: 5.4.41-flatcar Digwatch compiler #1 SMP Fri May 22 20:03:20 -00 2020 x86_64 Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz GenuineIntel GNU/Linux

  • Cloud provider: AWS (m5.xlarge)

  • OS: Flatcar Container Linux by Kinvolk 2513.0.0 (Oklo)

  • Kernel: 5.4.41-flatcar

  • Installation method: Helm chart using no-driver image and driver compiled wit driverkit using eBPF.

Additional context

{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"13973"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:42:45.946068858Z"}
2020-09-14T07:43:15+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"15121"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:43:15.949941061Z"}
2020-09-14T07:44:08+0000: Falco internal: syscall event drop. 8 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 8 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"8","n_drops_buffer":"0","n_drops_bug":"8","n_drops_pf":"0","n_evts":"40"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:43:43.145186684Z"}
2020-09-14T07:44:15+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"17373"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:44:15.842518797Z"}
2020-09-14T07:44:45+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"15543"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:44:45.961501054Z"}
2020-09-14T07:45:15+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"16964"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:45:15.965352100Z"}
2020-09-14T07:45:45+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"11594"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:45:45.969171461Z"}
2020-09-14T07:46:16+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"31324"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:46:16.973129071Z"}
2020-09-14T07:46:46+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"29031"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:46:46.976872569Z"}
2020-09-14T07:47:16+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"34272"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:47:16.981036908Z"}
2020-09-14T07:47:47+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"29803"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:47:46.984966564Z"}
2020-09-14T07:48:17+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"32211"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:48:16.988721019Z"}
2020-09-14T07:48:47+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"29454"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:48:46.992595181Z"}
2020-09-14T07:49:17+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"33715"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:49:16.996596904Z"}
2020-09-14T07:49:47+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"32821"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:49:47.000491305Z"}
2020-09-14T07:50:17+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"40196"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:50:17.005361196Z"}
2020-09-14T07:50:47+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"30702"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:50:47.009055533Z"}
2020-09-14T07:51:17+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"35532"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:51:17.012871855Z"}
2020-09-14T07:51:47+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"27026"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:51:47.016590959Z"}
2020-09-14T07:52:17+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"30309"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:52:17.020355766Z"}
2020-09-14T07:52:47+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"26418"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:52:47.024001615Z"}
2020-09-14T07:53:17+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"30459"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:53:17.027861088Z"}
2020-09-14T07:53:47+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"45129"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T07:53:47.304052903Z"}

@Fsero
Copy link

Fsero commented Sep 14, 2020

/assign @leogr @fntlnz

@leogr
Copy link
Member

leogr commented Sep 14, 2020

Please, could you try to increment (or remove) the container CPU limit?

N.B.
Although it is not yet confirmed, the CPU limitation seems to be one of the possible causes of this problem.

@Fsero
Copy link

Fsero commented Sep 14, 2020

@leogr thanks for the suggestion.

I tried that

        - name: HOST_ROOT
          value: /host
        image: docker.io/falcosecurity/falco-no-driver:0.25.0
        imagePullPolicy: IfNotPresent
        name: falco
        resources:
          requests:
            cpu: "1"
            memory: 1Gi
        securityContext:
          privileged: true

And it doesnt seem to help much, what i noted is that the amount of dropped events is significantly higher after startup time and then it tend to stabilize.

2020-09-14T11:32:27+0000: Starting gRPC server at unix:///var/run/falco/falco.sock
2020-09-14T11:32:29+0000: Falco internal: syscall event drop. 4 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 4 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"4","n_drops_buffer":"4","n_drops_bug":"0","n_drops_pf":"0","n_evts":"192925"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T11:32:27.755392170Z"}
2020-09-14T11:32:45+0000: Falco internal: syscall event drop. 109569 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 109569 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"109569","n_drops_buffer":"12","n_drops_bug":"109555","n_drops_pf":"2","n_evts":"626926"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T11:32:28.763386954Z"}
2020-09-14T11:32:45+0000: Falco internal: syscall event drop. 38 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 38 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"38","n_drops_buffer":"0","n_drops_bug":"38","n_drops_pf":"0","n_evts":"112"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T11:32:29.763516420Z"}
2020-09-14T11:32:45+0000: Falco internal: syscall event drop. 42 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 42 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"42","n_drops_buffer":"0","n_drops_bug":"42","n_drops_pf":"0","n_evts":"239"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T11:32:30.763529371Z"}
2020-09-14T11:32:45+0000: Falco internal: syscall event drop. 17 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 17 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"17","n_drops_buffer":"17","n_drops_bug":"0","n_drops_pf":"0","n_evts":"480"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T11:32:43.958065613Z"}
2020-09-14T11:32:45+0000: Falco internal: syscall event drop. 109618 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 109618 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"109618","n_drops_buffer":"109618","n_drops_bug":"0","n_drops_pf":"0","n_evts":"4565"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T11:32:45.960116187Z"}
2020-09-14T11:32:47+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"48212"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T11:32:47.970049912Z"}
2020-09-14T11:32:49+0000: Falco internal: syscall event drop. 4 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 4 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"4","n_drops_buffer":"4","n_drops_bug":"0","n_drops_pf":"0","n_evts":"36412"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T11:32:49.970153990Z"}
2020-09-14T11:32:52+0000: Falco internal: syscall event drop. 2 system calls dropped in last second.
{"output":"Falco internal: syscall event drop. 2 system calls dropped in last second.","output_fields":{"ebpf_enabled":"1","n_drops":"2","n_drops_buffer":"2","n_drops_bug":"0","n_drops_pf":"0","n_evts":"31876"},"priority":"Critical","rule":"Falco internal: syscall event drop","time":"2020-09-14T11:32:52.070232738Z"}
2020-09-14T11:32:55+0000: Falco internal: syscall event drop. 1 system calls dropped in last second.

In any case thanks for the reply! is there anything else i can try or look into to try to help you to debug this?

@leogr
Copy link
Member

leogr commented Sep 14, 2020

Since it seems to be not related to CPU performance (unless you noticed that Falco is using 100% of its allocated CPU), I'd suggest looking on those components used by Falco to fetch metadata that might slow down Falco (eg. Kubernetes APIs). For example, are you using --disable-cri-async ? If yes, try to remove it.

PS
Feel free to contact me on Slack via DM for any help in debugging this!

@Yuriy6735
Copy link
Author

Seems, problem not in the CPU performance.
image

Using the default configuration without --disable-cri-async

@leogr
Copy link
Member

leogr commented Sep 15, 2020

Seems, problem not in the CPU performance.
image

Using the default configuration without --disable-cri-async

Have drops decreased without --disable-cri-async ?

@Yuriy6735
Copy link
Author

If I understand correctly, there is no --disable-cri-async arg in the default configuration.
https://github.com/falcosecurity/charts/blob/master/falco/templates/daemonset.yaml#L57-L68
https://github.com/falcosecurity/charts/blob/master/falco/values.yaml#L29
I didn't add this arg.

@leogr leogr mentioned this issue Sep 16, 2020
13 tasks
@leogr
Copy link
Member

leogr commented Sep 16, 2020

Hey everyone,

I wrote an "umbrella" issue, that can help with debugging this problem, PTAL 👉 #1403

@stale
Copy link

stale bot commented Nov 15, 2020

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions. Issues labeled "cncf", "roadmap" and "help wanted" will not be automatically closed. Please refer to a maintainer to get such label added if you think this should be kept open.

@stale stale bot added the wontfix label Nov 15, 2020
@fntlnz
Copy link
Contributor

fntlnz commented Nov 16, 2020

This is being tracked here #1403 - closing this to keep things clean and avoid duplicates.

Thanks to the original reporter @Yuriy6735 for bringing this to our attention.

@fntlnz fntlnz closed this as completed Nov 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants