[Auditbeat] system/socket: Monitor all online CPUs #22827
Merged
adriansr added the bug, in progress (pull request is currently in progress), Auditbeat, and Team:Security-External Integrations labels on Dec 1, 2020
Pinging @elastic/security-external-integrations (Team:Security-External Integrations)
botelastic (bot) added and removed the needs_team label (indicates that the issue/PR needs a Team:* label) on Dec 1, 2020
Marked as draft until it's tested further.
adriansr force-pushed the ab_socket_cpu_affinity branch from 79defef to 5ae5ad2 on December 1, 2020 16:38
Auditbeat's system/socket dataset needs to install kprobes on all online CPUs. Previously, it was using runtime.NumCPU() to determine the CPUs in the system, and monitoring CPUs 0 to NumCPU-1. This was a mistake that led to startup failures or loss of events in any of the following scenarios:
- When Auditbeat is started with a CPU affinity mask that excludes some CPUs.
- When there are offline CPUs in the system.
This patch updates the tracing library in Auditbeat to fetch the list of online CPUs from /sys/devices/system/cpu/online so that it can install kprobes on all of them regardless of its own affinity mask, and correctly skip offline CPUs.
Related elastic#18755
adriansr added the review label and removed the in progress label on Dec 2, 2020
adriansr force-pushed the ab_socket_cpu_affinity branch from bf4409a to 88819af on December 2, 2020 17:53
andrewstucki reviewed on Dec 2, 2020
Just some questions, which if cleared up I'll go ahead and approve.
andrewstucki approved these changes on Dec 2, 2020
adriansr added the needs_backport label (PR is waiting to be backported to other branches) on Dec 2, 2020
adriansr added a commit to adriansr/beats that referenced this pull request on Dec 2, 2020
Auditbeat's system/socket dataset needs to install kprobes on all online CPUs. Previously, it was using runtime.NumCPU() to determine the CPUs in the system, and monitoring CPUs 0 to NumCPU-1. This was a mistake that led to startup failures or loss of events in any of the following scenarios:
- When Auditbeat is started with a CPU affinity mask that excludes some CPUs.
- When there are offline or isolated CPUs in the system.
This patch updates the tracing library in Auditbeat to fetch the list of online CPUs from /sys/devices/system/cpu/online so that it can install kprobes on all of them regardless of its own affinity mask, and correctly skip offline CPUs.
Related elastic#18755
(cherry picked from commit 6356887)
adriansr added the v7.11.0 label and removed the needs_backport label on Dec 2, 2020
adriansr added a commit that referenced this pull request on Dec 3, 2020
Auditbeat's system/socket dataset needs to install kprobes on all online CPUs. Previously, it was using runtime.NumCPU() to determine the CPUs in the system, and monitoring CPUs 0 to NumCPU-1. This was a mistake that led to startup failures or loss of events in any of the following scenarios:
- When Auditbeat is started with a CPU affinity mask that excludes some CPUs.
- When there are offline or isolated CPUs in the system.
This patch updates the tracing library in Auditbeat to fetch the list of online CPUs from /sys/devices/system/cpu/online so that it can install kprobes on all of them regardless of its own affinity mask, and correctly skip offline CPUs.
Related #18755
(cherry picked from commit 6356887)
What does this PR do?
This patch updates the tracing library in Auditbeat to fetch the list of online CPUs from /sys/devices/system/cpu/online so that it can install kprobes on all of them regardless of its own affinity mask, and correctly skip offline CPUs.

Why is it important?
Auditbeat's system/socket dataset needs to install kprobes on all online CPUs. Previously, it was using Go's runtime.NumCPU() to determine the CPUs in the system, and monitoring CPUs 0 to NumCPU-1. This was a mistake that led to startup failures or loss of events in any of the following scenarios:
- When Auditbeat is started with a CPU affinity mask that excludes some CPUs.
- When there are offline or isolated CPUs in the system.

Checklist
[ ] I have made corresponding changes to the documentation
[ ] I have made corresponding change to the default configuration files
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally
The easiest way to reproduce the problem is to start Auditbeat with a CPU affinity mask that excludes the first CPU and only allows it to run on the second CPU.
This pins Auditbeat to CPU 1 while the kprobes are installed on CPU 0, which prevents the guesses from working.
Alternatively, one can disable a few CPUs before launching Auditbeat.
Related issues
Related #18755
This PR fixes most of the problems reported in the above issue, but the main issue is fixed by #22787