Commit
core: skip thresholds check for LSM remove snapshot
When a snapshot is removed at the end of a Live Storage Migration (LSM) operation, try to remove it even if disk space is low. Otherwise "RemoveSnapshotCommand" fails on validation, leaves the 3-chunk snapshot (7.5 GiB) unreleased, and reports the LSM operation as failed (though the "move" part actually succeeded).

Because of the 3 extra chunks that are temporarily used during the LSM flow, we might temporarily fall below the disk space threshold. Currently the code not only leaves almost 8 GiB of "junk" unreleased, but also leaves the Storage Domain in an unhealthy low-space state that blocks other operations and requires manual intervention, instead of cleanly recovering from this temporary low disk space situation by performing the proper cleanup.

Before the fix:

2022-08-23 13:14:56,448+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-45) [] EVENT_ID: IRS_DISK_SPACE_LOW_ERROR(201), Critical, Low disk space. iSCSI_SD2 domain has 4 GB of free space.

2022-08-23 13:16:10,455+03 WARN [org.ovirt.engine.core.bll.snapshots.RemoveSnapshotCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-16) [03ef74cf-ddbe-4027-805f-02631bf96929] Validation of action 'RemoveSnapshot' failed for user admin@internal-authz. Reasons: VAR__TYPE__SNAPSHOT,VAR__ACTION__REMOVE,ACTION_TYPE_FAILED_DISK_SPACE_LOW_ON_STORAGE_DOMAIN,$storageName iSCSI_SD2

2022-08-23 13:16:15,555+03 ERROR [org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-17) [03ef74cf-ddbe-4027-805f-02631bf96929] Ending command 'org.ovirt.engine.core.bll.storage.lsm.LiveMigrateDiskCommand' with failure.

2022-08-23 13:16:15,581+03 WARN [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-17) [03ef74cf-ddbe-4027-805f-02631bf96929] EVENT_ID: USER_MOVE_IMAGE_GROUP_FAILED_TO_DELETE_SRC_IMAGE(2,025), Possible failure while deleting iSCSI_VM1_Disk1 from the source Storage Domain iSCSI_SD2 during the move operation. The Storage Domain may be manually cleaned-up from possible leftovers (User:admin@internal-authz).

At the end "iSCSI_SD2" has 4 GiB available, while the "Critical Space Action Blocker" is 5 GiB.

After the fix:

2022-08-24 11:00:43,164+03 ERROR [org.ovirt.engine.core.dal.dbbroker.auditloghandling.AuditLogDirector] (EE-ManagedScheduledExecutorService-engineScheduledThreadPool-Thread-48) [] EVENT_ID: IRS_DISK_SPACE_LOW_ERROR(201), Critical, Low disk space. iSCSI_SD2 domain has 4 GB of free space.

We fall below the threshold and then automatically recover: 7.5 GiB are returned to the Storage Domain. At the end "iSCSI_SD2" correctly has 12 GiB available, which is above the 5 GiB "Critical Space Action Blocker".

Signed-off-by: Pavel Bar <pbar@redhat.com>
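
The behaviour described in this message can be illustrated with a small, simplified model. This is only a sketch and not the engine's actual Java code: `StorageDomain`, `validate_remove_snapshot`, `remove_snapshot`, and the `part_of_lsm` flag are hypothetical names, and the strict "free < blocker" comparison is an assumption. Only the failure reason string and the 4 GiB / 5 GiB / 7.5 GiB figures come from the message itself.

```python
from dataclasses import dataclass

GIB = 1024 ** 3


@dataclass
class StorageDomain:
    """Hypothetical stand-in for a storage domain (not an engine class)."""
    name: str
    free_bytes: int
    critical_blocker_bytes: int  # models the "Critical Space Action Blocker"

    def is_below_threshold(self) -> bool:
        # Assumption: the check is a strict "free < blocker" comparison.
        return self.free_bytes < self.critical_blocker_bytes


def validate_remove_snapshot(domain: StorageDomain, part_of_lsm: bool):
    """Return (ok, reason). Skips the space check for LSM cleanup."""
    if part_of_lsm:
        # Removing the temporary LSM snapshot frees space, so do not block it.
        return True, None
    if domain.is_below_threshold():
        return False, "ACTION_TYPE_FAILED_DISK_SPACE_LOW_ON_STORAGE_DOMAIN"
    return True, None


def remove_snapshot(domain: StorageDomain, snapshot_bytes: int,
                    part_of_lsm: bool) -> bool:
    """Release the snapshot's space if validation passes."""
    ok, _reason = validate_remove_snapshot(domain, part_of_lsm)
    if ok:
        domain.free_bytes += snapshot_bytes
    return ok


if __name__ == "__main__":
    sd = StorageDomain("iSCSI_SD2", 4 * GIB, 5 * GIB)
    snapshot = int(7.5 * GIB)  # 3 chunks, per the commit message

    # Before the fix: validation fails and the space stays unreleased.
    print(remove_snapshot(sd, snapshot, part_of_lsm=False), sd.free_bytes / GIB)

    # After the fix: the check is skipped and the domain recovers.
    print(remove_snapshot(sd, snapshot, part_of_lsm=True), sd.free_bytes / GIB)
```

In this model the domain ends at 11.5 GiB free, which matches the roughly 12 GiB reported above if the logged figures are rounded (an assumption).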