Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

volume: allow extend cow preallocated #379

Merged
merged 4 commits into from
Feb 21, 2023

Conversation

aesteve-rh
Copy link
Member

@aesteve-rh aesteve-rh commented Feb 20, 2023

Currently we assume in the code that cow
volumes do not need to be extended in any case.
However, all preallocated volumes, both
cow and raw, should be extended when requested.
Otherwise, cow preallocated volumes have bigger
capacity than their truesize. Thus, when the size
of the disk reaches the truesize, the VM will
pause, which is an undesired effect.

To avoid this, we relax this assumption so that
only cow sparse volumes are skipped on extend calls.

Bug-Url: https://bugzilla.redhat.com/2170689
Signed-off-by: Albert Esteve aesteve@redhat.com

@aesteve-rh aesteve-rh self-assigned this Feb 20, 2023
Add test to extend volume size. Preallocated volumes
(both raw and cow) should extend to prevent VMs from
pausing when reaching the volume truesize limit.

Test is used to reproduce the issue in
https://bugzilla.redhat.com/2170689

Signed-off-by: Albert Esteve <aesteve@redhat.com>
@aesteve-rh
Copy link
Member Author

/ost

Copy link
Member

@ahadas ahadas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm but I'd also rename _extendSizeRaw then

lib/vdsm/storage/volume.py Show resolved Hide resolved
tests/storage/blocksd_test.py Show resolved Hide resolved
Currently we assume in the code that cow
volumes do not need to be extended in any case.
However, all preallocated volumes, both
cow and raw, should be extended when requested.
Otherwise, cow preallocated volumes have bigger
capacity than their truesize. Thus, when the size
of the disk reaches the truesize, the VM will
pause, which is an undesired effect.

To avoid this, we relax this assumption so that
only cow sparse volumes are skipped on extend calls.

Bug-Url: https://bugzilla.redhat.com/2170689
Signed-off-by: Albert Esteve <aesteve@redhat.com>
Rename '_extendSizeRaw' to just '_extendSize',
as COW preallocated volumes can also reach
this call to extend LVs.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
@aesteve-rh aesteve-rh force-pushed the aesteve/fix-extend-prealloc-cow branch from 1f1f7bb to c9e61e4 Compare February 20, 2023 15:32
@aesteve-rh aesteve-rh added the verified Change was tested; please describe how it was tested in the PR label Feb 20, 2023
@aesteve-rh
Copy link
Member Author

Verified in my local setup doing:

  1. Create a new 10GiB Preallocated disk with Enable Incremental Backup
  2. Virtual Size and Actual Size are 10GiB
  3. Edit disk and extend by 5GiB
  4. Results:
    a) Without the fix: Actual Size is 15GiB, but Virtual Size remains at 10GiB
    b) With the fix: both Actual Size and Virtual Size are expanded to 15GiB

mz-pdm
mz-pdm previously approved these changes Feb 20, 2023
Copy link
Member

@mz-pdm mz-pdm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much better now, thanks!

Add a test to ensure that extendSize is correctly
skipped for cow sparse volumes.

Signed-off-by: Albert Esteve <aesteve@redhat.com>
@aesteve-rh
Copy link
Member Author

/ost

@aesteve-rh aesteve-rh merged commit a558d52 into oVirt:master Feb 21, 2023
@aesteve-rh aesteve-rh deleted the aesteve/fix-extend-prealloc-cow branch February 21, 2023 07:42
@nirs
Copy link
Member

nirs commented Feb 21, 2023

Verified in my local setup doing:

  1. Create a new 10GiB Preallocated disk with Enable Incremental Backup
  2. Virtual Size and Actual Size are 10GiB
  3. Edit disk and extend by 5GiB
  4. Results:
    a) Without the fix: Actual Size is 15GiB, but Virtual Size remains at 10GiB

This is wrong - extending cow disk must increase the virtual size, but
the actual size does not have to change, since cow volume grow as needed.
We can argue that when using preallocated volume extending must also
preallocated the disk size, looks like the existing ocde already does this.

b) With the fix: both Actual Size and Virtual Size are expanded to 15GiB

Sound correct.

But I don't see how this issue can cause a VM to pause. If the virtual size did not
change (a), the vm does not see the new virtual size so it wil not try to write after
the end of the actual lv, and will not pause.

@ahadas
Copy link
Member

ahadas commented Feb 21, 2023

@nirs the part you comnented on was supposed to be: Without the fix: Virtual Size is 15GiB, but Actual Size remains at 10GiB

Copy link
Member

@nirs nirs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change looks safe, but it cannot fix an issue of vm pausing, since this cannot be caused by smaller virtual size.

The fix also does not affect the code changing the virtual size of
cow volume, it only affects the code changing the actual lv.

lib/vdsm/storage/volume.py Show resolved Hide resolved
lib/vdsm/storage/volume.py Show resolved Hide resolved
lib/vdsm/storage/volume.py Show resolved Hide resolved
@nirs
Copy link
Member

nirs commented Feb 21, 2023

@nirs the part you comnented on was supposed to be: Without the fix: Virtual Size is 15GiB, but Actual Size remains at 10GiB

But this is fine, the disk will be extended as needed while writing.

@nirs
Copy link
Member

nirs commented Feb 21, 2023

Looking at the code path changed, this affects HSM.extendVolumeSize, which extend the
amount of space allocated (e.g. file size or lv size). This virtual size is changed
in HSM.updateVolumeSize.

Bad API on vdsm side, instead of one API that does the right thing we have 2 APIs
that engine must call in some cases, but too late to fix this now.

So the fix is just a nice to have allocation fix, keep preallocated volumes
preallocated after user extend the size on the engine side.

If this fix an issue of pausing vms, it means thin provisioning code is broken
and should be fixed. This change can hide the thin proviioning issue by preallocating,
but if the thin provioning is broken, the vm will still pause when the disk will
become full, becuase qcow2 metadata need extra space.

Testing extension of preallocated volumes without this fix will tell if more work is
needed for thin volumes.

@ahadas
Copy link
Member

ahadas commented Feb 21, 2023

So the fix ... keep preallocated volumes preallocated after user extend the size on the engine side.

right, without this fix our tools show customers that something is wrong in their storage domains after extending preallocated qcow disks on block storage, as they expect virtual size == actual size for preallocated disks

@nirs
Copy link
Member

nirs commented Feb 21, 2023

OK, so the issue is not pausing vms, but incorrect allocation reported by the tools.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
storage verified Change was tested; please describe how it was tested in the PR
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants