Commit
[Auditbeat] Recover from errors in audit monitoring routine (#22673)
The auditd module spawns a monitoring goroutine that fetches auditd status every 15s. Because this routine uses a single audit client, if an update fails (because a netlink message is late or for other reasons), the audit client can get out of sync with the stream and fail on all subsequent requests.

For reasons that aren't 100% clear to me at the moment, this error condition leads to a lot of `[audit_send_repl]` (2.6.x) / `[audit_send_reply]` (3.x+) kernel threads being created.

```
ERROR [auditd] auditd/audit_linux.go:183 get status request failed:failed to get audit status ack: unexpected sequence number for reply (expected 6286 but got 6285)
```

```
$ ps -ef
[...]
root 27790 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27791 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27792 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27793 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27794 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27795 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27796 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27797 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27798 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27799 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27800 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27801 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27802 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27803 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27804 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27805 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27806 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27807 2 0 12:52 ? 00:00:00 [audit_send_repl]
root 27808 2 0 12:52 ? 00:00:00 [audit_send_repl]
[...]
```

This patch updates the error-handling logic to create a new audit client when a status update fails, allowing the module to recover and preventing the proliferation of `audit_send_repl` kernel threads.
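The recovery idea can be illustrated with a minimal Go sketch. This is not the code from the patch: `statusClient`, `monitor`, `fakeClient`, and the `newClient` factory are hypothetical names introduced here so the example runs without a kernel audit socket. The sketch only shows the pattern of discarding a client after a failed status request and replacing it with a fresh one.

```go
package main

import (
	"errors"
	"fmt"
)

// statusClient is a hypothetical stand-in for the audit client used by
// the monitoring routine. It is not the real library interface.
type statusClient interface {
	GetStatus() error
	Close() error
}

// monitor holds the client used for periodic status checks and a
// factory (an assumption of this sketch) that creates a replacement.
type monitor struct {
	client    statusClient
	newClient func() (statusClient, error)
}

// poll performs one status check. If the check fails, the current
// client is assumed to be out of sync with the stream, so it is closed
// and replaced. The original status error is still returned so the
// caller can log it. If the replacement cannot be created, the caller
// should stop polling.
func (m *monitor) poll() error {
	err := m.client.GetStatus()
	if err == nil {
		return nil
	}
	_ = m.client.Close() // best effort: the client is being discarded anyway
	fresh, newErr := m.newClient()
	if newErr != nil {
		return fmt.Errorf("status failed (%v), creating new client failed: %w", err, newErr)
	}
	m.client = fresh
	return err
}

// fakeClient simulates a client that is either healthy or permanently
// desynchronized, for demonstration purposes only.
type fakeClient struct {
	broken bool
	closed bool
}

func (c *fakeClient) GetStatus() error {
	if c.broken {
		return errors.New("unexpected sequence number for reply")
	}
	return nil
}

func (c *fakeClient) Close() error {
	c.closed = true
	return nil
}

func main() {
	m := &monitor{
		client:    &fakeClient{broken: true},
		newClient: func() (statusClient, error) { return &fakeClient{}, nil },
	}
	// The first poll fails and swaps in a fresh client; the second
	// poll then uses the replacement.
	fmt.Println("poll 1:", m.poll())
	fmt.Println("poll 2:", m.poll())
}
```

In a real monitoring loop, `poll` would presumably be driven by a 15s ticker, and a failure to create the replacement client would end the goroutine rather than retrying with a closed client.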