Commit
core: skip thresholds validation for remove snapshot
Try to remove the snapshot even when the Storage Domain is low on disk space. Otherwise, when remove snapshot is performed at the end of a Live Storage Migration (LSM) operation, "RemoveSnapshotCommand" fails validation and leaves the snapshot's 3 chunks (7.5 GiB) unreleased. Additionally, the LSM operation is reported as failed, though the "move" part actually succeeded.

Because of the 3 extra chunks that are temporarily used during the LSM flow, free space might temporarily fall below the disk space threshold. Currently the code not only leaves almost 8 GiB of "junk" unreleased, it also leaves the Storage Domain in an unhealthy low-space state that blocks other operations and requires manual intervention, instead of cleanly recovering from the temporary low disk space situation by performing the proper cleanup.

Before the fix:

2022-08-23 13:14:56,448+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-45) [] EVENT_ID: IRS_DISK_SPACE_LOW_ERROR(201), Critical, Low disk space. iSCSI_SD2 domain has 4 GB of free space.

2022-08-23 13:16:10,455+03 WARN [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-16) [03ef74cf-ddbe-4027-805f-02631bf96929] Validation of action 'RemoveSnapshot' failed for user admin@internal-authz. Reasons: VAR__TYPE__SNAPSHOT,VAR__ACTION__REMOVE,ACTION_TYPE_FAILED_DISK_SPACE_LOW_ON_STORAGE_DOMAIN,$storageName iSCSI_SD2

2022-08-23 13:16:15,555+03 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-17) [03ef74cf-ddbe-4027-805f-02631bf96929] Ending command 'org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand' with failure.

2022-08-23 13:16:15,581+03 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-17) [03ef74cf-ddbe-4027-805f-02631bf96929] EVENT_ID: USER_MOVE_IMAGE_GROUP_FAILED_TO_DELETE_SRC_IMAGE(2,025), Possible failure while deleting iSCSI_VM1_Disk1 from the source Storage Domain iSCSI_SD2 during the move operation. The Storage Domain may be manually cleaned-up from possible leftovers (User:admin@internal-authz).

At the end "iSCSI_SD2" has 4 GiB available, which is below the 5 GiB "Critical Space Action Blocker".

After the fix "RemoveSnapshotCommand" executes successfully:

2022-08-24 11:00:43,164+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-48) [] EVENT_ID: IRS_DISK_SPACE_LOW_ERROR(201), Critical, Low disk space. iSCSI_SD2 domain has 4 GB of free space.

We fall below the threshold and then automatically recover: 7.5 GiB is returned to the Storage Domain. At the end "iSCSI_SD2" correctly has 12 GiB available, which is above the 5 GiB "Critical Space Action Blocker".

Signed-off-by: Pavel Bar <pbar@redhat.com>
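
For illustration only, the reasoning in this message can be sketched as a small standalone Java program. This is not the engine code: the class `SpaceThresholdSketch`, its methods, and the `releasesSpace` flag are hypothetical names, and the 2.5 GiB chunk size is an assumption inferred from "3 chunks (7.5 GiB)". It only shows why skipping the threshold check for a space-releasing action lets the domain recover.

```java
public class SpaceThresholdSketch {
    // Values taken from the commit message.
    static final double CRITICAL_BLOCKER_GIB = 5.0; // "Critical Space Action Blocker"
    static final int LSM_EXTRA_CHUNKS = 3;          // temporary chunks used during LSM
    // Assumption: chunk size inferred from 7.5 GiB / 3 chunks.
    static final double CHUNK_GIB = 2.5;

    // Hypothetical validation: ordinary actions are blocked below the
    // threshold, while actions that release space skip the check.
    static boolean validate(double freeGib, boolean releasesSpace) {
        if (releasesSpace) {
            return true;
        }
        return freeGib >= CRITICAL_BLOCKER_GIB;
    }

    // Free space once the temporary snapshot chunks are returned.
    static double freeAfterSnapshotRemoval(double freeGib) {
        return freeGib + LSM_EXTRA_CHUNKS * CHUNK_GIB;
    }

    public static void main(String[] args) {
        double free = 4.0; // free space reported in the logs, in GiB

        // Before the fix: remove snapshot is validated like any other action.
        System.out.println("blocked before fix: " + !validate(free, false));

        // After the fix: remove snapshot skips the threshold check.
        System.out.println("allowed after fix: " + validate(free, true));

        double recovered = freeAfterSnapshotRemoval(free);
        System.out.println("free after removal (GiB): " + recovered);
        System.out.println("above blocker: " + (recovered >= CRITICAL_BLOCKER_GIB));
    }
}
```

With these inputs the sketch yields 11.5 GiB after removal, which is consistent with the roughly 12 GiB reported above if the displayed values are rounded.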