
Reset NUMA configuration with resize and pin VM #504

Merged
1 commit merged into oVirt:master on Jul 21, 2022

Conversation

liranr23
Member

When running a Resize and Pin NUMA VM, we create the virtual NUMA
nodes, including the pinning to the physical NUMA nodes of the host.
This is a static configuration that is set on the VM.
Until now, this configuration broke many flows that update the VM
configuration, and a manual workaround was needed to overcome it. Now
the NUMA configuration is deleted once the VM shuts down, so when
updating the VM, the update sees the current configuration as having no
NUMA nodes for that VM.

Change-Id: I7d6e8200e7830dc6903a76ac219225fa359d2c53
Bug-Url: https://bugzilla.redhat.com/2074525
Signed-off-by: Liran Rotenberg lrotenbe@redhat.com
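
For illustration, a minimal Java sketch of the behavior the commit message describes follows; the names used here (CpuPinningPolicy, VmStub, onVmDown) are placeholders for this sketch, not the actual oVirt engine classes:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only: the auto-generated vNUMA topology of a
// resize-and-pin VM is cleared when the VM goes down, so a later update
// sees a VM with no NUMA configuration. These types are placeholders,
// not the real oVirt engine classes.
enum CpuPinningPolicy { NONE, MANUAL, RESIZE_AND_PIN_NUMA }

class VmStub {
    CpuPinningPolicy cpuPinningPolicy = CpuPinningPolicy.RESIZE_AND_PIN_NUMA;
    // Nodes created automatically when the VM started (virtual -> physical pinning).
    List<String> vNumaNodes = new ArrayList<>(List.of("vnode0->pnode0", "vnode1->pnode1"));
}

public class NumaResetOnShutdown {
    // Called when the VM reaches the Down state.
    static void onVmDown(VmStub vm) {
        // Only resize-and-pin VMs get their NUMA nodes generated at run time,
        // so only they need the reset.
        if (vm.cpuPinningPolicy == CpuPinningPolicy.RESIZE_AND_PIN_NUMA) {
            vm.vNumaNodes.clear();
        }
    }

    public static void main(String[] args) {
        VmStub vm = new VmStub();
        onVmDown(vm);
        System.out.println("vNUMA nodes after shutdown: " + vm.vNumaNodes); // prints []
    }
}
```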

@liranr23 liranr23 added the virt label Jun 28, 2022
Contributor

@ljelinkova ljelinkova left a comment


The patch looks good and will make some flows easier. However, it might still be missing some parts if we want to keep the invariant (resize and pin -> no NUMA nodes defined by the user).

For example, we should also set an empty list when creating a new VM with resize and pin NUMA (for the REST API) and on import, and we could drop some validations in VmHandler.validateCpuPinningPolicy().

@liranr23
Member Author

> The patch looks good and will make some flows easier. However, it might still be missing some parts if we want to keep the invariant (resize and pin -> no NUMA nodes defined by the user).
>
> For example, we should also set an empty list when creating a new VM with resize and pin NUMA (for the REST API) and on import, and we could drop some validations in VmHandler.validateCpuPinningPolicy().

AFAIK, for the REST API you need to execute other actions, not AddVm.
I thought of doing it for AddVm as well, but it doesn't make sense: in the API it requires additional API calls for the NUMA nodes, and in the UI it's blocked since we are in resize and pin. That was my thought.

@ljelinkova
Contributor

Could you please update to the latest master before we continue the review? Changes introduced in 2d8e98f will affect your PR.

@liranr23
Member Author

> Could you please update to the latest master before we continue the review? Changes introduced in 2d8e98f will affect your PR.

Is there a specific flow I should test with it? The annoying part now is that, for a running VM, changing something related to NUMA (selecting a host to run on, changing to high performance) indicates that the NUMA configuration changed, when it actually hasn't.

@liranr23
Member Author

/ost

@liranr23
Member Author

/ost

@ahadas
Member

ahadas commented Jul 20, 2022

> ... when updating the VM, the update sees the current configuration as having no NUMA nodes for that VM.

Do we still need this part, or is everything sorted out with the last change that skipped that validation?

@ahadas ahadas self-requested a review July 20, 2022 12:44
@liranr23
Member Author

> ... when updating the VM, the update sees the current configuration as having no NUMA nodes for that VM.
>
> Do we still need this part, or is everything sorted out with the last change that skipped that validation?

Well, now we reset the vNUMA count to 0 on update/shutdown, so it still looks correct to me.

@liranr23
Member Author

/ost

@ahadas
Member

ahadas commented Jul 20, 2022

> ... when updating the VM, the update sees the current configuration as having no NUMA nodes for that VM.
>
> Do we still need this part, or is everything sorted out with the last change that skipped that validation?
>
> Well, now we reset the vNUMA count to 0 on update/shutdown, so it still looks correct to me.

I meant whether we need it in the code itself: if we clear the topology when the VM reaches the Down state, and in update-vm we skip the problematic validation, is there another justification for setting the nodes to an empty list in update-vm?

@ljelinkova
Contributor

> ... when updating the VM, the update sees the current configuration as having no NUMA nodes for that VM.
>
> Do we still need this part, or is everything sorted out with the last change that skipped that validation?
>
> Well, now we reset the vNUMA count to 0 on update/shutdown, so it still looks correct to me.
>
> I meant whether we need it in the code itself: if we clear the topology when the VM reaches the Down state, and in update-vm we skip the problematic validation, is there another justification for setting the nodes to an empty list in update-vm?

I think that it is a reasonable default and it is more understandable to the user, compared to having some nodes while the VM is down, different ones during runtime, and then, when the VM is stopped, back to the static nodes... It is confusing.

@liranr23
Member Author

> ... when updating the VM, the update sees the current configuration as having no NUMA nodes for that VM.
>
> Do we still need this part, or is everything sorted out with the last change that skipped that validation?
>
> Well, now we reset the vNUMA count to 0 on update/shutdown, so it still looks correct to me.
>
> I meant whether we need it in the code itself: if we clear the topology when the VM reaches the Down state, and in update-vm we skip the problematic validation, is there another justification for setting the nodes to an empty list in update-vm?
>
> I think that it is a reasonable default and it is more understandable to the user, compared to having some nodes while the VM is down, different ones during runtime, and then, when the VM is stopped, back to the static nodes... It is confusing.

I agree, although I think another change is required in the UI (with resize and pin, don't show the popup field regarding the vNUMA change). But without this part in the backend it would be even more confusing; it would look as if there are next-run changes.

@liranr23
Member Author

/ost

@ahadas
Member

ahadas commented Jul 20, 2022

> I think that it is a reasonable default and it is more understandable to the user, compared to having some nodes while the VM is down, different ones during runtime, and then, when the VM is stopped, back to the static nodes... It is confusing.

I'm not sure we're talking about the same thing then, because I'm talking about something that should not be visible to users, only to developers who read the code:

- If we clear the NUMA settings when the VM goes down (which I believe we already do), then in the common scenario of updating a VM that is down, we don't need to do anything in update-vm (assuming the client got an empty NUMA node list).
- If the user updates the VM while it was running and provided us the automatically created NUMA settings, we should not persist them to the next-run snapshot, as we know they are transient for VMs with resize-and-pin.
- If the VM is down and the client provides us with NUMA settings, either because the user specified them or the client loaded the previously automatically created NUMA nodes, the backend should reject that rather than reset it to an empty list.

From the user's perspective it would be the same: the VM won't have NUMA nodes as long as the VM is down.
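
To make those three cases concrete, here is a hedged, self-contained sketch of the decision described above; the class, enum, and parameter names are invented for illustration and are not the actual engine code:

```java
import java.util.Collections;
import java.util.List;

// Rough sketch of how update-vm could treat client-supplied NUMA nodes for a
// resize-and-pin VM, per the three cases above. Not the real engine code.
public class NumaUpdatePolicy {
    enum Decision { NOTHING_TO_DO, DO_NOT_PERSIST, REJECT }

    static Decision decide(boolean resizeAndPin, boolean vmIsDown, List<?> requestedNumaNodes) {
        if (!resizeAndPin) {
            return Decision.NOTHING_TO_DO;  // regular NUMA handling applies
        }
        if (requestedNumaNodes == null || requestedNumaNodes.isEmpty()) {
            return Decision.NOTHING_TO_DO;  // common case: VM is down, topology already cleared
        }
        if (!vmIsDown) {
            return Decision.DO_NOT_PERSIST; // auto-created, transient nodes: keep them out of next-run
        }
        return Decision.REJECT;             // user-supplied nodes conflict with resize-and-pin
    }

    public static void main(String[] args) {
        System.out.println(decide(true, true, Collections.emptyList()));    // NOTHING_TO_DO
        System.out.println(decide(true, false, List.of("vnode0->pnode0"))); // DO_NOT_PERSIST
        System.out.println(decide(true, true, List.of("vnode0->pnode0")));  // REJECT
    }
}
```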

@liranr23
Member Author

> I think that it is a reasonable default and it is more understandable to the user, compared to having some nodes while the VM is down, different ones during runtime, and then, when the VM is stopped, back to the static nodes... It is confusing.
>
> I'm not sure we're talking about the same thing then, because I'm talking about something that should not be visible to users, only to developers who read the code:
>
> - If we clear the NUMA settings when the VM goes down (which I believe we already do), then in the common scenario of updating a VM that is down, we don't need to do anything in update-vm (assuming the client got an empty NUMA node list).
> - If the user updates the VM while it was running and provided us the automatically created NUMA settings, we should not persist them to the next-run snapshot, as we know they are transient for VMs with resize-and-pin.
> - If the VM is down and the client provides us with NUMA settings, either because the user specified them or the client loaded the previously automatically created NUMA nodes, the backend should reject that rather than reset it to an empty list.
>
> From the user's perspective it would be the same: the VM won't have NUMA nodes as long as the VM is down.

The current state hopefully answers this.

@liranr23
Member Author

/ost

Member

@ahadas ahadas left a comment


Looks better now, minor comment inside.
I'm OK with dropping the validation of NUMA nodes and ignoring it instead, if that simplifies the client side.

@ahadas
Member

ahadas commented Jul 21, 2022

/ost

@ahadas ahadas merged commit 18a2eb6 into oVirt:master Jul 21, 2022
@liranr23 liranr23 deleted the resize_reset_numa branch July 24, 2022 06:46