
[10.4-stable] Estimating Memory Overhead Before Domain Creation #3599

Conversation

OhmSpectator (Member):

It's a backport of #3588 into 10.4.

I had to refactor the code a bit:

The 10.4 branch does not include the reworked overhead estimation that @rucoder introduced later, so I used the old implementation instead. That implementation does not take adapters into account.

For testing, I only verified that an app starts successfully. Tests that exercise the new logic have not been done yet.

eriknordmark (Contributor) left a comment:

Same issues/concerns as in #3598

OhmSpectator (Member, Author) commented Nov 20, 2023:

The list of commits to be backported:

```
Processing PR #3383...
PR #3383 - Master Hash: 10f3ba6
PR #3383 - Master Hash: d88091b
Processing PR #3504...
PR #3504 - Master Hash: c79d3fe
```

mikem-zed and others added 5 commits November 20, 2023 17:05
- Move PCI device related functions into a new pci.go
- Rename typeAndPci to pciDevice
- Implement functions on pciDevice structure
- Fix linter warnings

Signed-off-by: Mikhail Malyshev <mikem@zededa.com>
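For illustration, a minimal sketch of what the renamed structure might look like; only the `pciDevice` and `typeAndPci` names come from the commit message, while the fields and the method are assumptions:

```go
package kvm

// pciDevice (formerly typeAndPci) groups the adapter type with its PCI
// address; the actual fields and methods in pci.go may differ.
type pciDevice struct {
	ioType  int    // adapter type (placeholder for the real enum)
	pciLong string // full PCI address, e.g. "0000:03:00.0"
}

// Example of a function implemented on the structure: the sysfs path of
// the device, derived from its PCI address.
func (d pciDevice) sysfsPath() string {
	return "/sys/bus/pci/devices/" + d.pciLong
}
```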
The original formula was "600 Mb + 0.25% RAM".
The empirical constant 600 Mb does not cover all cases and is too big for certain VMs.
The new formula dynamically adjusts the VMM overhead, taking into account the number of
vCPUs and the size of the MMIO regions mapped for PCI devices. Some devices have a very
large aperture size; e.g., the NVIDIA A40 has a 64 Gb aperture. If a VM has many GPUs
assigned, the overhead can be significant.

The analysis showed that QEMU uses 0.6-0.85% of the MMIO region size for internal
management. We allocate 1% to be on the safe side.

QEMU itself consumes ~20 Mb (constant) for code and shared libraries,
and ~300 Mb is still unaccounted for. Each vCPU requires ~3 Mb. So the formula is

(No. vCPUs * 3 Mb) + 1% total MMIO + 0.25% RAM + 20 Mb (QEMU) + 300 Mb (unknown)

For a VM with 2 vCPUs and 1 GB of RAM, the overhead is now ~354 Mb instead of 626 Mb.

Signed-off-by: Mikhail Malyshev <mikem@zededa.com>
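As a rough illustration, here is a minimal Go sketch of the formula above. The function name is hypothetical and the constants are taken verbatim from this commit message rather than from the shipped code; note that the quoted ~354 Mb for 2 vCPUs / 1 GB suggests the RAM coefficient in the actual code may be closer to 2.5% than the literal 0.25%.

```go
package main

import "fmt"

// estimateVMMOverheadMB is a hypothetical helper applying the formula from
// the commit message:
//   (No. vCPUs * 3 Mb) + 1% total MMIO + 0.25% RAM + 20 Mb (QEMU) + 300 Mb (unknown)
func estimateVMMOverheadMB(vCPUs int, ramMB, mmioMB float64) float64 {
	const (
		perVCPUMB = 3.0    // per-vCPU stacks and state
		qemuMB    = 20.0   // QEMU code and shared libraries
		unknownMB = 300.0  // still-unaccounted allocations
		mmioFrac  = 0.01   // 1% of mapped MMIO (0.6-0.85% observed)
		ramFrac   = 0.0025 // 0.25% of guest RAM, as literally quoted
	)
	return float64(vCPUs)*perVCPUMB + mmioMB*mmioFrac + ramMB*ramFrac +
		qemuMB + unknownMB
}

func main() {
	// The 2 vCPU / 1 GB example from the message, no PCI passthrough.
	fmt.Printf("estimated overhead: ~%.0f Mb\n", estimateVMMOverheadMB(2, 1024, 0))
}
```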
Commit d88091b ("New formula to estimate VMM memory overhead (x86)")
breaks the container case and does not account for vRAM, which can be set to a large
value, causing the QEMU process to be killed by the OOM killer.

This patch removes the `IsOCIContainer()` check, so the default estimation
path is taken and the following overhead limit is eventually set for all
cases (VMs and containers):

    350 Mb (undefined mappings) +
    20 Mb (QEMU binary mappings) +
    3 Mb * nvCPUs (vCPU stacks) +
    0 (MMIO regions, presumably 0 with no direct attach) +
    vRAM * 0.25

So for a minimal reasonable container configuration with 4 vCPUs and
500 Mb of vRAM, without PCI device direct attach, the overhead limit
will be set to ~500 Mb, which should be safe.

What is the motivation, besides vRAM accounting, to increase the value from
100 Mb/300 Mb to 500 Mb in the container case? We have noticed an increased
number of customer issues on the latest EVE versions where the QEMU process
was killed by the OOM killer.

Signed-off-by: Roman Penyaev <r.peniaev@gmail.com>
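Reading `vRAM * 0.25` above as 25% of vRAM (the interpretation that reproduces the quoted ~500 Mb), a quick back-of-the-envelope check of the container example:

```go
package main

import "fmt"

func main() {
	// 4 vCPUs, 500 Mb vRAM, no PCI device passthrough:
	// 350 (undefined) + 20 (QEMU) + 4*3 (vCPU stacks) + 0 (MMIO) + 500*0.25 (vRAM)
	overheadMB := 350 + 20 + 4*3 + 0 + 500*25/100
	fmt.Println(overheadMB, "Mb") // 507 Mb, i.e. the quoted ~500 Mb
}
```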
This commit significantly enhances the memory management strategy for new
domain creation in ZEDManager by incorporating memory overhead estimation prior
to domain creation.

Key changes include:
- Implemented memory overhead calculation in `updatestatus.go` using the
  `CountMemOverhead` function from the hypervisor package.
- Updated `getRemainingMemory` in `memorysizemgmt.go` to account for memory
  overhead in the total used memory size.
- Extended `pillar/hypervisor` package files to support memory overhead
  estimation across different hypervisors.
- Notably, added assignable adapters information to the
  `zedmanagerContext`. This involved creating a new subscription for assignable
  adapters in `zedmanager.go` and handling their updates; this addition is
  crucial for accurately computing the memory overhead.

This update ensures that domains are only started when there is enough memory,
considering both the domain's requirements and the overhead, thereby enhancing
system stability and performance.

Signed-off-by: Nikolay Martyanov <nikolay@zededa.com>
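A hedged sketch of the accounting change described above; apart from the `getRemainingMemory` name and the idea of adding the overhead estimate to the used-memory total, all types and fields here are hypothetical:

```go
package zedmanager

import "fmt"

// Hypothetical, trimmed-down status type for illustration only.
type appInstanceStatus struct {
	allocatedRAMMB uint64 // guest RAM reserved for the app instance
	memOverheadMB  uint64 // estimated VMM overhead (CountMemOverhead result)
}

// getRemainingMemory, per the commit message, now adds the estimated
// overhead of every configured domain to the used-memory total before
// deciding whether a new domain fits. Sketch, not the actual code.
func getRemainingMemory(deviceMemoryMB uint64, apps []appInstanceStatus) (uint64, error) {
	var usedMB uint64
	for _, app := range apps {
		usedMB += app.allocatedRAMMB + app.memOverheadMB
	}
	if usedMB > deviceMemoryMB {
		return 0, fmt.Errorf("memory overcommitted: %d Mb used of %d Mb total",
			usedMB, deviceMemoryMB)
	}
	return deviceMemoryMB - usedMB, nil
}
```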
…of DomainConfig.

In this commit, the CountMemOverhead function across various hypervisors
(containerd, kvm, null) is refactored. The change involves passing individual
parameters (like domain UUID, memory size, CPU counts, etc.) directly to the
function, rather than encapsulating them within a DomainConfig object. This
approach eliminates the need for creating a mock DomainConfig just to calculate
memory overhead, leading to cleaner and more explicit code. The changes are
applied consistently across the relevant files in the zedmanager and hypervisor
packages.

Signed-off-by: Nikolay Martyanov <nikolay@zededa.com>
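A sketch of the signature change being described; the parameter list is illustrative (the message names only "domain UUID, memory size, CPU counts, etc."), not the exact signature in the tree:

```go
package hypervisor

// Before (sketch): callers had to assemble a mock DomainConfig just to
// ask for an overhead estimate.
type domainConfig struct {
	UUID    string
	RAMKB   int64
	VCPUs   int
	MaxCPUs int
}

type overheadEstimatorBefore interface {
	CountMemOverhead(config domainConfig) (uint64, error)
}

// After (sketch): the same inputs are passed explicitly, so no mock
// object is needed at the call sites in zedmanager.
type overheadEstimatorAfter interface {
	CountMemOverhead(domainUUID string, ramKB int64,
		vCPUs, maxCPUs int) (uint64, error)
}
```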
@OhmSpectator force-pushed the bugfix/take-overhead-into-account-10.4 branch from a41e382 to d5899ae on November 20, 2023 16:17
OhmSpectator (Member, Author) commented Nov 20, 2023:

I've added the commits related to the overhead estimation that were not backported before.
Need to retest.

eriknordmark (Contributor) left a comment:
LGTM

@eriknordmark merged commit 55ab7aa into lf-edge:10.4-stable on Nov 21, 2023
9 of 11 checks passed