Live merge sometimes fails, "No space left on device" appears in the log #352

Closed
3 tasks done
ahadas opened this issue Nov 16, 2022 · 0 comments · Fixed by #357
ahadas commented Nov 16, 2022

Description of problem: snapshot removal sometimes fails, leaving the VM with disk(s) in an illegal state; subsequent attempts to remove the snapshot fail again

Version-Release number of selected component (if applicable): ovirt-engine-4.5.2.4-1.el8.noarch, oVirt Node 4.5.2 (vdsm-4.50.2.2-1.el8)

How reproducible: not always; the issue happens now and then when the backup system (Storware vProtect, interacting with oVirt via the API) tries to delete snapshots

Actual results: snapshot removals fail, and subsequent attempts fail again

Expected results: snapshot is correctly removed

Additional info: the issue doesn't happen every time; the affected VMs are almost always the same (5 or 6 out of about 40), with no apparent pattern (Linux or Windows guests, big or small disks, all hypervisors, ...)

Original bug: https://bugzilla.redhat.com/2122525

Tasks:

ahadas added the storage label Nov 16, 2022
aesteve-rh added a commit to aesteve-rh/vdsm that referenced this issue Nov 17, 2022
Measure also the base volume when preparing for
a merge. We need to consider base volume
bitmaps size, since measuring only top may
result in untracked bitmaps even with backing
enabled. This can happen if the bitmaps were
added to base after the top volume was created.

Considering both base and top volume bitmaps
may result in allocating more than needed, but
will avoid errors when committing the image due
to 'No space left on device'.

Fixes: oVirt#352
Signed-off-by: Albert Esteve <aesteve@redhat.com>
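
The sizing idea in this commit message can be sketched roughly as below. This is an illustration only, not vdsm's actual code: it assumes both inputs are the parsed JSON printed by `qemu-img measure --output json` for the top and base volumes, and that the `bitmaps` field is present when the qcow2 image has persistent bitmaps. The function name `merge_allocation_estimate` is hypothetical.

```python
def merge_allocation_estimate(top_measure, base_measure):
    """Return a conservative size estimate, in bytes, for committing top into base.

    Both arguments are dicts shaped like the JSON printed by
    `qemu-img measure --output json` (an assumption of this sketch).
    """
    required = top_measure["required"]
    # "bitmaps" is omitted when bitmaps are not supported, so default to 0.
    top_bitmaps = top_measure.get("bitmaps", 0)
    base_bitmaps = base_measure.get("bitmaps", 0)
    # Counting both sides may over-allocate, but it also covers bitmaps
    # that exist only in the base volume.
    return required + top_bitmaps + base_bitmaps
```

As the commit message notes, the result may be larger than strictly needed; the sketch trades some extra allocation for avoiding ENOSPC during the commit.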
aesteve-rh added a commit to aesteve-rh/vdsm that referenced this issue Nov 18, 2022
Measure also the base volume when preparing for
a merge. We need to consider base volume
bitmaps size, since measuring only top may
result in untracked bitmaps even with backing
enabled. This can happen if the bitmaps were
added to base after the top volume was created.

Considering both base and top volume bitmaps
may result in allocating more than needed, but
will avoid errors when committing the image due
to 'No space left on device'.

Fixes: oVirt#352
Signed-off-by: Albert Esteve <aesteve@redhat.com>
aesteve-rh added a commit to aesteve-rh/vdsm that referenced this issue Nov 23, 2022
Prune the stale base volume bitmaps before
a commit during a merge operation.

These stale bitmaps can cause the merge
operation to fail due to 'No space left on device'.
In this case, qemu does not end with error, so
the failure goes unnoticed.

As there is not a reliable way to measure
the size of these stale bitmaps, it is better
to prune them to avoid the error.

Fixes: oVirt#352
Signed-off-by: Albert Esteve <aesteve@redhat.com>
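
A minimal sketch of how stale bitmaps might be identified and pruned offline follows. It is not the code from the referenced commits. It assumes that "stale" means a bitmap present in the base volume but missing from the top volume (inferred from the earlier commit message about bitmaps added to base after top was created), and that the inputs are the parsed JSON from `qemu-img info --output json`. The helper names `find_stale_bitmaps` and `prune_commands` are hypothetical. The generated `qemu-img bitmap --remove` commands only apply when the image is not in use by a running VM.

```python
def find_stale_bitmaps(base_info, top_info):
    """Return names of bitmaps found in base but missing from top.

    Both arguments are dicts shaped like `qemu-img info --output json`
    output for qcow2 images (an assumption of this sketch).
    """
    def names(info):
        data = info.get("format-specific", {}).get("data", {})
        return [bitmap["name"] for bitmap in data.get("bitmaps", [])]

    top_names = set(names(top_info))
    return [name for name in names(base_info) if name not in top_names]


def prune_commands(base_path, stale):
    """Build one `qemu-img bitmap --remove` argv list per stale bitmap."""
    return [["qemu-img", "bitmap", "--remove", base_path, name]
            for name in stale]
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` before the commit step; the commands are only built here so the sketch has no side effects.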
aesteve-rh added a commit to aesteve-rh/vdsm that referenced this issue Nov 23, 2022
As we do with cold merge, live merge needs
to prune stale bitmaps from the base volume
before calling blockCommit, to avoid the
'No space left on device' errors.

Fixes: oVirt#352
Signed-off-by: Albert Esteve <aesteve@redhat.com>
aesteve-rh added a commit to aesteve-rh/vdsm that referenced this issue Nov 24, 2022
Prune the stale base volume bitmaps during
the prepare step on a merge operation.

These stale bitmaps can cause the merge
operation to fail due to 'No space left on device'.
In this case, qemu does not end with error, so
the failure goes unnoticed.

As there is not a reliable way to measure
the size of these stale bitmaps, and they are
invalid and can never be used for incremental
backup, it is better to prune them to avoid
the error.

Related: oVirt#352
Signed-off-by: Albert Esteve <aesteve@redhat.com>
aesteve-rh added a commit to aesteve-rh/vdsm that referenced this issue Nov 25, 2022
Prune the stale base volume bitmaps during
the prepare step on a merge operation.

These stale bitmaps can cause the merge
operation to fail due to 'No space left on device'.
In this case, qemu does not end with error, so
the failure goes unnoticed.

As there is not a reliable way to measure
the size of these stale bitmaps, and they are
invalid and can never be used for incremental
backup, it is better to prune them to avoid
the error.

Related: oVirt#352
Signed-off-by: Albert Esteve <aesteve@redhat.com>
nirs pushed a commit that referenced this issue Nov 28, 2022
Prune the stale base volume bitmaps during
the prepare step on a merge operation.

These stale bitmaps can cause the merge
operation to fail due to 'No space left on device'.
In this case, qemu does not end with error, so
the failure goes unnoticed.

As there is not a reliable way to measure
the size of these stale bitmaps, and they are
invalid and can never be used for incremental
backup, it is better to prune them to avoid
the error.

Related: #352
Signed-off-by: Albert Esteve <aesteve@redhat.com>
aesteve-rh added a commit to aesteve-rh/vdsm that referenced this issue Nov 30, 2022
Prune stale bitmaps before a live merge to avoid
failing with ENOSPC.

Fixes: oVirt#352
Signed-off-by: Albert Esteve <aesteve@redhat.com>
aesteve-rh added a commit to aesteve-rh/vdsm that referenced this issue Dec 1, 2022
Prune stale bitmaps before a live merge to avoid
failing with ENOSPC.

Fixes: oVirt#352
Signed-off-by: Albert Esteve <aesteve@redhat.com>
aesteve-rh added a commit to aesteve-rh/vdsm that referenced this issue Dec 7, 2022
Prune stale bitmaps before a live merge to avoid
failing with ENOSPC.

Fixes: oVirt#352
Signed-off-by: Albert Esteve <aesteve@redhat.com>
aesteve-rh added a commit to aesteve-rh/vdsm that referenced this issue Dec 12, 2022
Prune stale bitmaps before a live merge to avoid
failing with ENOSPC.

Fixes: oVirt#352
Signed-off-by: Albert Esteve <aesteve@redhat.com>
aesteve-rh added a commit to aesteve-rh/vdsm that referenced this issue Dec 13, 2022
Prune the stale base volume bitmaps during
the prepare step on a merge operation.

These stale bitmaps can cause the merge
operation to fail due to 'No space left on device'.
In this case, qemu does not end with error, so
the failure goes unnoticed.

As there is not a reliable way to measure
the size of these stale bitmaps, and they are
invalid and can never be used for incremental
backup, it is better to prune them to avoid
the error.

Related: oVirt#352
Signed-off-by: Albert Esteve <aesteve@redhat.com>
aesteve-rh added a commit to aesteve-rh/vdsm that referenced this issue Dec 13, 2022
Prune stale bitmaps before a live merge to avoid
failing with ENOSPC.

Fixes: oVirt#352
Signed-off-by: Albert Esteve <aesteve@redhat.com>
aesteve-rh added a commit that referenced this issue Dec 13, 2022
Prune the stale base volume bitmaps during
the prepare step on a merge operation.

These stale bitmaps can cause the merge
operation to fail due to 'No space left on device'.
In this case, qemu does not end with error, so
the failure goes unnoticed.

As there is not a reliable way to measure
the size of these stale bitmaps, and they are
invalid and can never be used for incremental
backup, it is better to prune them to avoid
the error.

Related: #352
Signed-off-by: Albert Esteve <aesteve@redhat.com>
aesteve-rh added a commit that referenced this issue Dec 13, 2022
Prune stale bitmaps before a live merge to avoid
failing with ENOSPC.

Fixes: #352
Signed-off-by: Albert Esteve <aesteve@redhat.com>