cpuallocator: implement clustered allocation based on cache groups. #343
Merged
Add GetNthLevelCacheCPUSet() akin to GetLastLevelCacheCPUSet() but returning the nth level cache cpuset for the CPU. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
klihub force-pushed the devel/cache-group-allocation branch 3 times, most recently from 366a62e to 430e7fc (June 26, 2024 12:07)
askervin reviewed Jun 27, 2024
Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
Discover groups of CPUs sharing a cache. Ignore groups which happen to be the same as CPUs grouped by some other topology criteria (package, die, etc.). Assume that all CPUs within a group are from the same package and die, and share the same closest NUMA node. Verify this assumption during discovery, aborting with a panic if it is not true. Use the last useful level of cache for clustering. In other words, try to find the last cache which provides non-trivial clustering and use it for grouping. Note that the current implementation is somewhat simplistic. It expects all CPUs to provide identical cache grouping and therefore picks a single cache level. This is usually fine, but it might result in suboptimal clustering with hybrid cores. If necessary we can remove this limitation in the future. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
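The level selection described in this commit message can be sketched roughly as follows. This is an illustration only, not the code from the patch: the `grouping` type and the `byDivisor`, `signature`, `pickCacheLevel`, and `nestedIn` helpers are hypothetical names, and the topology is a made-up 8-CPU example rather than data read from the system.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// grouping maps each CPU ID to the ID of the group it belongs to.
// Hypothetical type for this sketch, not the cpuallocator's own.
type grouping map[int]int

// byDivisor builds a grouping of CPUs 0..n-1 where CPU i is in group i/d.
func byDivisor(n, d int) grouping {
	g := grouping{}
	for cpu := 0; cpu < n; cpu++ {
		g[cpu] = cpu / d
	}
	return g
}

// signature returns a canonical string for the partition a grouping induces,
// so that two groupings with different group IDs but equal CPU sets compare equal.
func signature(g grouping) string {
	groups := map[int][]int{}
	for cpu, id := range g {
		groups[id] = append(groups[id], cpu)
	}
	parts := make([]string, 0, len(groups))
	for _, cpus := range groups {
		sort.Ints(cpus)
		parts = append(parts, fmt.Sprint(cpus))
	}
	sort.Strings(parts)
	return strings.Join(parts, ";")
}

// pickCacheLevel walks cache levels from last to first and returns the index
// of the first one whose grouping differs from every trivial grouping
// (package, die, hyperthread siblings, ...). It returns -1 if none qualifies.
func pickCacheLevel(caches []grouping, trivial []grouping) int {
	seen := map[string]bool{}
	for _, g := range trivial {
		seen[signature(g)] = true
	}
	for level := len(caches) - 1; level >= 0; level-- {
		if !seen[signature(caches[level])] {
			return level
		}
	}
	return -1
}

// nestedIn reports whether every group of inner lies within a single group
// of outer, e.g. that no cache group spans two packages.
func nestedIn(inner, outer grouping) bool {
	parent := map[int]int{}
	for cpu, id := range inner {
		if p, ok := parent[id]; ok && p != outer[cpu] {
			return false
		}
		parent[id] = outer[cpu]
	}
	return true
}

func main() {
	pkg := byDivisor(8, 8)     // one package holding all 8 CPUs
	threads := byDivisor(8, 2) // 2 hyperthreads per core
	// Index 0: per-core cache, index 1: cache shared by 4 CPUs, index 2: package-wide cache.
	caches := []grouping{byDivisor(8, 2), byDivisor(8, 4), byDivisor(8, 8)}

	level := pickCacheLevel(caches, []grouping{pkg, threads})
	fmt.Println("chosen cache index:", level)
	fmt.Println("groups stay within one package:", nestedIn(caches[level], pkg))
}
```

In this example the package-wide cache and the per-core cache are skipped because they duplicate the package and hyperthread groupings, which leaves the cache shared by 4 CPUs. The sketch picks a single level for all CPUs, mirroring the limitation the commit message notes for hybrid cores.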
Implement cache group based CPU allocation. When a request makes this possible, try to satisfy it by allocating one or more cache groups. Always prefer using fully idle cache groups first, fragmenting idle cache groups by partial allocation second, and only resort to using fragmented groups when the allocation cannot be satisfied in any other way. When doing partial allocations, do it in a hyperthreading aware fashion, taking full physical cores whenever possible. Note: This allocation algorithm is a modified version of the cluster based allocator. We might want to try and combine them into a single group based allocator in the future. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
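The preference order in this commit message (whole idle groups first, fragmenting an idle group second, already fragmented groups last) could look roughly like the sketch below. The `cacheGroup` type and the `allocate` function are hypothetical names for illustration and are not taken from the patch. For brevity, `take` hands out CPUs in list order and ignores hyperthread siblings.

```go
package main

import "fmt"

// cacheGroup is a hypothetical model of CPUs sharing a cache.
type cacheGroup struct {
	size int   // total number of CPUs in the group
	free []int // IDs of CPUs not yet allocated
}

// idle reports whether no CPU of the group has been allocated.
func (g *cacheGroup) idle() bool { return len(g.free) == g.size }

// take removes up to n CPUs from the free list and returns them.
func (g *cacheGroup) take(n int) []int {
	if n > len(g.free) {
		n = len(g.free)
	}
	taken := append([]int(nil), g.free[:n]...)
	g.free = g.free[n:]
	return taken
}

// allocate picks n CPUs in three passes of decreasing preference.
func allocate(groups []*cacheGroup, n int) ([]int, error) {
	total := 0
	for _, g := range groups {
		total += len(g.free)
	}
	if n > total {
		return nil, fmt.Errorf("requested %d CPUs, only %d free", n, total)
	}
	var result []int
	// Pass 1: take fully idle groups that fit entirely into the request.
	for _, g := range groups {
		if g.idle() && g.size <= n-len(result) {
			result = append(result, g.take(g.size)...)
		}
	}
	// Pass 2: fragment a still idle group to cover the remainder.
	for _, g := range groups {
		if len(result) == n {
			break
		}
		if g.idle() {
			result = append(result, g.take(n-len(result))...)
		}
	}
	// Pass 3: fall back to groups that were already fragmented.
	for _, g := range groups {
		if len(result) == n {
			break
		}
		result = append(result, g.take(n-len(result))...)
	}
	return result, nil
}

func main() {
	groups := []*cacheGroup{
		{size: 4, free: []int{0, 1, 2, 3}},   // idle
		{size: 4, free: []int{6, 7}},         // already fragmented
		{size: 4, free: []int{8, 9, 10, 11}}, // idle
	}
	cpus, err := allocate(groups, 6)
	fmt.Println(cpus, err)
}
```

With a request for 6 CPUs, the first group is taken whole, the third is fragmented for the remaining 2 CPUs, and the already fragmented second group is left untouched.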
Use explicit cluster info based allocation only on hybrid core platforms. In other cases omit it altogether, using only cache group based allocation. The two should provide identical groups, but partial allocation is implemented for cache groups while it is not implemented for cluster groups at the moment. Signed-off-by: Krisztian Litkey <krisztian.litkey@intel.com>
klihub force-pushed the devel/cache-group-allocation branch from 430e7fc to e4ed880 (June 28, 2024 16:17)
askervin approved these changes Jul 1, 2024
Looks Great. Thanks @klihub!
This patch series adds support for clustered CPU allocation based on cache groups to the cpuallocator package. In particular with these patches in place, the cpuallocator now
Allocation always prefers using fully idle cache groups first, fragmenting idle cache groups by partial allocation second, and only resorts to using fragmented groups when the allocation cannot be satisfied in any other way. Partial cache group allocations are done in a hyperthread-aware fashion, taking full physical cores whenever possible.
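As a rough illustration of what a hyperthread-aware partial allocation within one cache group might look like, the sketch below takes whole physical cores while the request still has room for them and only then falls back to single threads. The `pickCPUs` function and its parameters are hypothetical, and the choice to fill the remainder from cores that are already partially busy is an assumption of this sketch, not something the patches are known to do.

```go
package main

import (
	"fmt"
	"sort"
)

// pickCPUs selects up to n CPUs from free, preferring whole physical cores.
// coreOf maps a CPU ID to its core ID; threads is the number of hyperthreads
// per core. Hypothetical helper for illustration only.
func pickCPUs(free []int, coreOf map[int]int, threads, n int) []int {
	// Bucket the free CPUs by physical core.
	byCore := map[int][]int{}
	for _, cpu := range free {
		byCore[coreOf[cpu]] = append(byCore[coreOf[cpu]], cpu)
	}
	// Split cores into fully free ones and partially free ones.
	var full, partial []int
	for core, cpus := range byCore {
		sort.Ints(cpus)
		if len(cpus) == threads {
			full = append(full, core)
		} else {
			partial = append(partial, core)
		}
	}
	sort.Ints(full)
	sort.Ints(partial)

	var picked, leftover []int
	// Threads of partially free cores are the first candidates for any remainder
	// (assumption of this sketch: avoid splitting a fully free core if possible).
	for _, core := range partial {
		leftover = append(leftover, byCore[core]...)
	}
	// Take fully free cores whole while the request still has room for one.
	for _, core := range full {
		if n-len(picked) >= threads {
			picked = append(picked, byCore[core]...)
		} else {
			leftover = append(leftover, byCore[core]...)
		}
	}
	// Cover whatever is left with single threads.
	for _, cpu := range leftover {
		if len(picked) == n {
			break
		}
		picked = append(picked, cpu)
	}
	return picked
}

func main() {
	// 8 CPUs, 2 hyperthreads per core: CPU i belongs to core i/2.
	coreOf := map[int]int{}
	for cpu := 0; cpu < 8; cpu++ {
		coreOf[cpu] = cpu / 2
	}
	free := []int{1, 2, 3, 4, 5, 6, 7} // CPU 0 is already busy
	fmt.Println(pickCPUs(free, coreOf, 2, 3))
	fmt.Println(pickCPUs(free, coreOf, 2, 4))
}
```

In the 3-CPU request, one full core (CPUs 2 and 3) is taken and the odd CPU comes from the core that is already half busy, so the two remaining fully free cores stay intact.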
The allocator uses the last useful level of cache for clustering. In other words, it tries to find the last cache which provides non-trivial clustering, one that provides a different grouping than packages, dies, or hyperthreading, and uses that for grouping CPUs. The current implementation is somewhat simplistic, as it expects all CPUs to provide identical cache grouping and therefore picks a single cache level for grouping. This is fine on most architectures, but it might result in suboptimal clustering with hybrid cores. If necessary this limitation can be addressed in the future.