-
Notifications
You must be signed in to change notification settings - Fork 165
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[10.4-stable] Estimating Memory Overhead Before Domain Creation #3599
Merged
eriknordmark
merged 5 commits into
lf-edge:10.4-stable
from
OhmSpectator:bugfix/take-overhead-into-account-10.4
Nov 21, 2023
Merged
[10.4-stable] Estimating Memory Overhead Before Domain Creation #3599
eriknordmark
merged 5 commits into
lf-edge:10.4-stable
from
OhmSpectator:bugfix/take-overhead-into-account-10.4
Nov 21, 2023
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
eriknordmark
requested changes
Nov 17, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issues/concerns as in #3598
- Move PCI device related functions into a new pci.go - Rename typeAndPci to pciDevice - Implement functions on pciDevice structure - Fix linter warnings Signed-off-by: Mikhail Malyshev <mikem@zededa.com>
The original formula was "600 Mb + 0.25% RAM" The empirical constant 600 Mb doesn't cover all the cases and is too big for certain VMs. New formula dynamically adjusts VMM overhead taking into account No. vCPUs and the size of MMIO regions mapped for PCI devices. Some devices may have very large aperture size. E.g. NVIDIA A40 has 64 Gb apertura. If the VM has many GPUs assigned the overhead may be significant. The analysis showed that QEMU uses 0.6 - 0.85% of MMIO region size for internal management. We allocate 1% to be on a save side QEMU itself consumes ~20 Mb (constant) for code and shared libraries and ~300 Mb is still unknown. Each vCPU requires ~3 Mb. So the formula is (No. vCPU * 3Mb) + 1% total-MMIO + 0.25% RAM + 20 Mb QEMU + 300 MB unknown For VMs with 2 CPU and 1GB RAM we now have ~354 Mb overhead instead of 626 Mb Signed-off-by: Mikhail Malyshev <mikem@zededa.com>
The commit d88091b ("New formula to estimate VMM memory overhead (x86)") breaks the container case and does not account vRAM, which can be set to a big value, which causes QEMU process to be killed by the OOM killer. This patch removes the `IsOCIContainer()` check, which makes the default estimation path to be taken and eventually this amount of overhead limit should be set for all the cases (vms and containers): 350Mb (undefined mappings) + 20Mb (QEMU binaries mappings) + 3Mb * nvCPUs (vCPU stacks) + 0 (MMIO regions, which presumably 0 if no direct attach) + vRAM * 0.25 So for a minimal reasonable container configuration with 4 vCPUs and 500Mb of vRAM, without PCI devices direct attach the overhead limit will be set to ~500Mb, which should be safe. What is the other motivation to increase the value from 100Mb/300Mb to 500Mb for the container case rather than vRAM accounting? We notice increased number of customer issues on the latest EVE versions, when QEMU process was killed by the OOM killer. Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
This commit significantly enhances the memory management strategy for new domain creation in ZEDManager by incorporating memory overhead estimation prior to domain creation. Key changes include: - Implemented memory overhead calculation in `updatestatus.go` using the `CountMemOverhead` function from the hypervisor package. - Updated `getRemainingMemory` in `memorysizemgmt.go` to account for memory overhead in the total used memory size. - Extended `pillar/hypervisor` package files to support memory overhead estimation across different hypervisors. - Significantly, added assignable adapters information to the `zedmanagerContext`. This involved creating a new subscription for assignable adapters within `zedmanager.go` and handling their updates. This addition is crucial for accurately computing memory overhead. This update ensures that domains are only started when there is enough memory, considering both the domain's requirements and the overhead, thereby enhancing system stability and performance. Signed-off-by: Nikolay Martyanov <nikolay@zededa.com>
…of DomainConfig. In this commit, the CountMemOverhead function across various hypervisors (containerd, kvm, null) is refactored. The change involves passing individual parameters (like domain UUID, memory size, CPU counts, etc.) directly to the function, rather than encapsulating them within a DomainConfig object. This approach eliminates the need for creating a mock DomainConfig just to calculate memory overhead, leading to cleaner and more explicit code. The changes are applied consistently across the relevant files in the zedmanager and hypervisor packages. Signed-off-by: Nikolay Martyanov <nikolay@zededa.com>
OhmSpectator
force-pushed
the
bugfix/take-overhead-into-account-10.4
branch
from
November 20, 2023 16:17
a41e382
to
d5899ae
Compare
I've added the commits related to the overhead estimation that were not packported before. |
eriknordmark
approved these changes
Nov 21, 2023
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
It's a backport of #3588 into 10.4.
I had to refactor the code a bit:
The 10.4 version has no reworked overhead estimation introduced later by @rucoder, so I used the old implementation to get it. This implementation does not take adapters into account.
For the tests, I just checked that an app started successfully. I can perform the tests that touch the new logic, but it has not been done yet.