
core: Change the CPUPolicyUnit for dedicated cpu pinning #299

Merged
merged 1 commit into oVirt:master on May 26, 2022

Conversation

ljelinkova
Contributor

@ljelinkova ljelinkova commented Apr 22, 2022

Before the introduction of the exclusively pinned CPUs, the logic of CPUPolicyUnit allowed checking the filtering
constraints for each VM in the VM group individually. The constraint was that the number of a VM's CPUs had to be <=
host CPUs.

With the introduction of the exclusively pinned CPUs, this is no longer possible: if the group contains VMs with both shared and exclusively pinned CPUs, we need to calculate the CPU count constraints for the whole group (similar to huge pages in HugePagesFilterPolicyUnit).

The algorithm for calculating whether the VM group fits into
the host is now as follows:

  1. Calculate the host CPU count
  2. Calculate the currently exclusively pinned CPUs (including pending)
  3. Calculate the sum of all dedicated CPUs of the vm group
  4. For all VMs with shared CPUs, find the max requested shared CPUs

The host can fit the VMs if:
hostCpuCount - takenCpus - addedExclusivelyPinnedCpus - requestedMaxSharedCpuCount >= 0

Note that the calculation of the previous values may differ based on the cluster setting of "Count threads as cores".
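The steps and inequality above can be sketched as follows. This is a hypothetical illustration only: the class and method names are not the actual oVirt CPUPolicyUnit API, and the "Count threads as cores" adjustment is omitted.

```java
// Hypothetical sketch of the group-fit check described above; names are
// illustrative, not the real oVirt scheduling code.
public class CpuGroupFitSketch {

    // Implements: hostCpuCount - takenCpus - addedExclusivelyPinnedCpus
    //             - requestedMaxSharedCpuCount >= 0
    public static boolean groupFits(int hostCpuCount,
                                    int takenCpus,
                                    int[] dedicatedCpusPerVm,
                                    int[] sharedCpusPerVm) {
        // Step 3: exclusively pinned CPUs are summed, since each VM in the
        // group needs its own dedicated CPUs
        int addedExclusivelyPinnedCpus = 0;
        for (int dedicated : dedicatedCpusPerVm) {
            addedExclusivelyPinnedCpus += dedicated;
        }
        // Step 4: shared CPUs can overlap between VMs, so only the maximum
        // request across the group matters
        int requestedMaxSharedCpuCount = 0;
        for (int shared : sharedCpusPerVm) {
            requestedMaxSharedCpuCount = Math.max(requestedMaxSharedCpuCount, shared);
        }
        return hostCpuCount - takenCpus - addedExclusivelyPinnedCpus
                - requestedMaxSharedCpuCount >= 0;
    }
}
```

For example, a 16-CPU host with 2 CPUs already taken fits a group requesting 4 + 2 dedicated CPUs alongside a maximum of 4 shared CPUs (16 - 2 - 6 - 4 >= 0).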

@ljelinkova
Contributor Author

/ost

@ahadas
Copy link
Member

ahadas commented Apr 25, 2022

Before the introduction of the exclusively pinned CPUs, the logic of CPUPolicyUnit allowed to check the filtering constraints for each VM in the vm group individually. The constraint was that the number of VM's CPUs had to be >= host CPUs.

<=, right?

Member

@ahadas ahadas left a comment


overall looks good, it's much more than refactoring ;)

@ljelinkova
Contributor Author

Before the introduction of the exclusively pinned CPUs, the logic of CPUPolicyUnit allowed to check the filtering constraints for each VM in the vm group individually. The constraint was that the number of VM's CPUs had to be >= host CPUs.

<=, right?

Yes

@ljelinkova ljelinkova force-pushed the fix-dedicated-migration branch 2 times, most recently from f7fde23 to 80b73ac on April 26, 2022 13:36
@ljelinkova ljelinkova changed the title from "core: Refactoring of CPUPolicyUnit" to "core: Change the CPUPolicyUnit for dedicated cpu pinning" on Apr 26, 2022
@ljelinkova
Contributor Author

/ost

@ljelinkova
Contributor Author

/ost

@ahadas
Member

ahadas commented Apr 28, 2022

@ljelinkova can you please rebase?

@ljelinkova
Contributor Author

/ost

1 similar comment
@ljelinkova
Contributor Author

/ost

Member

@liranr23 liranr23 left a comment


Looked at the CpuPolicyUnit and the calls from there. Overall looks good to me :)

Comment on lines 253 to 256
if (allocatedCpus.isEmpty()) {
    // No valid allocation exists on this host; Integer.MAX_VALUE signals the
    // caller that everything is unavailable, so the host gets filtered out.
    return Integer.MAX_VALUE;
Member


why MAX_VALUE?? if it's empty it means the vm can't allocate on this host (maybe it's the first vm or the second). This means these VMs can't run together in this order of scheduling on this specific host.

Member


yeah so I think that's why we return max value here - so the caller will realize that everything is unavailable and the host would be filtered out. worth adding a comment though in that case

Contributor Author


@liranr23 Yes, you're right, the MAX_VALUE is just to notify the caller about that. We can throw an exception, but we would need to handle that. Saying that Integer.MAX_VALUE number of CPUs are unavailable allows the caller to use the same calculation for successful and unsuccessful allocations.

@ahadas adding a comment is a good idea
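The sentinel pattern discussed in this thread can be sketched as follows. This is a hypothetical illustration (names are invented, not the real oVirt helper methods): returning Integer.MAX_VALUE for a failed allocation lets the caller run the same "free CPUs left" arithmetic for both successful and unsuccessful allocations.

```java
import java.util.List;

// Illustrative sketch of the Integer.MAX_VALUE sentinel discussed above.
public class ExclusiveAllocationSketch {

    // Number of CPUs an exclusive allocation would take; Integer.MAX_VALUE
    // means no valid pinning exists on this host.
    public static int takenCpus(List<Integer> allocatedCpus) {
        if (allocatedCpus.isEmpty()) {
            // The VM cannot be pinned here; report an impossibly large cost
            // so the caller's subtraction goes negative and the host is
            // filtered out, with no separate error path needed.
            return Integer.MAX_VALUE;
        }
        return allocatedCpus.size();
    }

    // The caller uses one calculation for both outcomes; long arithmetic
    // avoids int overflow when the sentinel is subtracted.
    public static boolean hostFits(int hostCpuCount, List<Integer> allocatedCpus) {
        return (long) hostCpuCount - takenCpus(allocatedCpus) >= 0;
    }
}
```

The alternative raised in the thread, throwing an exception, would force every caller to handle the failure separately; the sentinel keeps the arithmetic uniform at the cost of needing a clarifying comment.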

Member

@ahadas ahadas left a comment


nice!


@@ -127,8 +138,9 @@ public List<VdsCpuUnit> updatePhysicalCpuAllocations(VM vm, Map<Guid, List<VdsCp
return cpusToBeAllocated;
}

filterSocketsWithInsufficientMemoryForNumaNode(cpuTopology, vm, hostId);
Member


we still need it, don't we?

Contributor Author


yes, but not for the calculation of the free cpus (because CpuPolicyUnit is not about numa memory). I've just moved it to line 113.

However, your question made me think about whether we need two separate cpu policies (cpu count, cpu pinning) now that we are actually doing the same thing in both of them (previewing the "pinning"). I think we could merge them into one.

Member


given the changes i did now in #380, we run this allocation: VMs * hosts * 2 (once for the pinning policy, once for the CPU policy) + VMs (schedule).

but: in terms of logical thinking - i think the current implementation makes sense, and it also saves us a big scheduler change.
in terms of code and performance - we should do all the filtering once to reduce the *2 part.

Contributor Author


I think it won't be possible to merge CpuPolicyUnit and CpuPinningPolicyUnit into one unit after all. It would be possible if we could rely on all of the hosts reporting cpuTopology, so that we could do manual pinning on the cpuTopology directly, but that might not be true for older hosts. For those, we still need to work with the currently reported online cpus, and it makes sense to keep that separate in CpuPinningPolicyUnit.

Contributor Author

@ljelinkova ljelinkova left a comment


I've updated the patch with your comments and I've also changed the preview pinning in VdsCpuUnitPinningHelper to be as close to the actual pinning as possible. This allowed getting rid of Integer.MAX_VALUE. Now the algorithm is not concerned with whether the pinning succeeds or not (that should be filtered out by the CpuPinningPolicyUnit).

Before the introduction of the exclusively pinned CPUs,
the logic of CPUPolicyUnit allowed checking the filtering
constraints for each VM in the VM group individually. The
constraint was that the number of a VM's CPUs had to be <=
host CPUs.

With the introduction of the exclusively pinned CPUs, this is
no longer possible: if the group contains VMs with both shared
and exclusively pinned CPUs, we need to calculate the CPU count
constraints for the whole group (similar to huge pages in
HugePagesFilterPolicyUnit).

Now the algorithm for calculating whether the VM group fits into
the host is as follows:
1. Calculate the host CPU count
2. Calculate the currently exclusively pinned CPUs (including pending)
3. Pin the VMs that are being scheduled and count how many CPUs will be taken
4. Calculate how many shared CPUs are required to be left on the host
   as the maximum of required shared CPUs for the vmGroup, pending VMs and running VMs.

The host can fit the VMs if:
hostCpuCount - exclusiveCpus - maxSharedCpuCount >= 0

Note that the calculation of the previous values may differ based on
the cluster setting of "Count threads as cores".
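The updated check from this commit message can be sketched as follows. Again a hypothetical simplification: names are invented, the exclusive-CPU pinning preview (steps 2 and 3) is assumed to have already produced a count, and the "Count threads as cores" handling is omitted.

```java
// Hypothetical sketch of the updated fit check from the commit message above.
public class UpdatedCpuFitSketch {

    // Implements: hostCpuCount - exclusiveCpus - maxSharedCpuCount >= 0,
    // where exclusiveCpus already includes currently pinned CPUs, pending
    // ones, and the CPUs taken by pinning the VMs being scheduled.
    public static boolean fits(int hostCpuCount,
                               int exclusiveCpus,
                               int vmGroupMaxShared,
                               int pendingMaxShared,
                               int runningMaxShared) {
        // Step 4: the shared CPUs that must stay free on the host is the
        // maximum over the scheduled group, pending VMs, and running VMs
        int maxSharedCpuCount = Math.max(vmGroupMaxShared,
                Math.max(pendingMaxShared, runningMaxShared));
        return hostCpuCount - exclusiveCpus - maxSharedCpuCount >= 0;
    }
}
```

Compared with the earlier version of the check, the shared requirement now also accounts for pending and already-running VMs rather than only the group being scheduled.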
@ljelinkova
Contributor Author

I've updated the patch with your comments and I've also changed the preview pinning in VdsCpuUnitPinningHelper to be as close to the actual pinning as possible. This allowed getting rid of Integer.MAX_VALUE. Now the algorithm is not concerned with whether the pinning succeeds or not (that should be filtered out by the CpuPinningPolicyUnit).

... and I put the Integer.MAX_VALUE back into the patch, as we need to know if the pinning of the exclusive cpus does not succeed; otherwise we can end up taking more shared CPUs than we should.

@ljelinkova
Contributor Author

/ost

@ahadas
Member

ahadas commented May 25, 2022

@liranr23 anything else from your side?

@ahadas ahadas merged commit a6c17bd into oVirt:master May 26, 2022
@ljelinkova ljelinkova deleted the fix-dedicated-migration branch June 15, 2022 11:12