Distinguish between different VDDK validation errors #969
Commits on Sep 19, 2024
-
Don't pass around the vddk image url unless necessary
Several functions accept a vddk image argument even though the vddk image can be retrieved directly from the plan.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Commit 8962235
-
factor out the code for validating the vddk image validation Job
Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Commit e0d014d
-
vddk validation: return errors if providers aren't set
This code previously returned nil if the source and destination providers were not set for the plan when validating the vddk image, but it seems to make more sense to return an error instead.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
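The behavior change described in this commit can be illustrated with a minimal sketch. The `Plan` and `Provider` types, the `validateProviders` function, and the error text below are hypothetical stand-ins for illustration only; they are not taken from the patch itself.

```go
package main

import (
	"errors"
	"fmt"
)

// Provider and Plan are hypothetical, simplified stand-ins for the
// real plan and provider types; only the fields needed here exist.
type Provider struct{ Name string }

type Plan struct {
	Source      *Provider
	Destination *Provider
}

// ErrProviderNotSet is an assumed sentinel error for this sketch.
var ErrProviderNotSet = errors.New("provider not set")

// validateProviders returns an error, rather than nil, when either
// provider is missing, so the caller cannot mistake an unconfigured
// plan for a successfully validated one.
func validateProviders(p *Plan) error {
	if p.Source == nil {
		return fmt.Errorf("source %w", ErrProviderNotSet)
	}
	if p.Destination == nil {
		return fmt.Errorf("destination %w", ErrProviderNotSet)
	}
	return nil
}

func main() {
	// A plan with no destination provider is reported as an error.
	plan := &Plan{Source: &Provider{Name: "vsphere"}}
	fmt.Println(validateProviders(plan))
}
```

Returning an error makes the missing-provider case visible to the caller instead of silently skipping validation.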
Commit 4800b77
-
Don't pass labels to createVddkCheckJob()
Rather than passing the labels to the function, just query them using the utility function.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Commit 9dff5ca
-
vddk validation: don't restart validator pod on failure
If the vddk validator pod fails, we don't need to keep re-trying. The container simply checks for the existence of a file, so restarting the pod is unlikely to change anything. In addition, by specifying `Never` for the restart policy, the completed pod should be retained for examination after the job fails, which can be helpful for determining the cause of failure.

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
Commit f2a5843
Commits on Sep 24, 2024
-
Distinguish between different VDDK validation errors
There are multiple cases that can lead to a "VDDK Init image is invalid" error message for a migration plan. They are currently handled with a single VDDKInvalid condition.

One of the most common is when the vddk image cannot be pulled (either due to network issues or due to the user typing an incorrect image URL). Categorizing this type of error as an "invalid VDDK image" is confusing to the user.

When the initContainer cannot pull the VDDK init image, the vddk-validator-* pod has something like the following status:

    initContainerStatuses:
      - name: vddk-side-car
        state:
          waiting:
            reason: ErrImagePull
            message: 'reading manifest 8.0.3.14 in default-route-openshift-image-registry.apps-crc.testing/openshift/vddk: manifest unknown'
        lastState: {}
        ready: false
        restartCount: 0
        image: 'default-route-openshift-image-registry.apps-crc.testing/openshift/vddk:8.0.3.14'
        imageID: ''
        started: false

We can use the existence of the 'waiting' state on the pod to indicate that the image cannot be pulled.

Unfortunately, the validation job's pods are deleted when the job fails due to a failure to pull the image. Because of this, there's no way to examine the pod status to see why the failure occurred after the deadline.

So this patch removes the deadline from the validation job, which requires overhauling the validation logic slightly. We add a new advisory condition `VDDKInitImageNotReady` to indicate that we are still waiting to pull the VDDK init image, and a new critical condition `VDDKInitImageUnavailable` to indicate that the condition has persisted for longer than the active deadline setting.

Since the job will now retry pulling the vddk image indefinitely (due to the removal of the job deadline), we need to make sure that orphaned jobs don't run forever. So when the vddk image for a plan changes, we need to cancel all active validation jobs that are still running for the old vddk image.

This overall approach has several advantages:
- The user gets an early indication (via `VDDKInitImageNotReady`) that the image can't be pulled
- The validation will automatically complete when any network interruption is resolved, without needing to delete and re-create the plan to start a new validation
- The validation will no longer report a VDDKInvalid error when the image pull is very slow due to network issues because there is no longer a deadline for the job.

Resolves: https://issues.redhat.com/browse/MTV-1150

Signed-off-by: Jonathon Jongsma <jjongsma@redhat.com>
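As a rough illustration of the two new conditions, the sketch below maps an init container's 'waiting' state and the time spent waiting onto a condition name. The struct types are simplified, hypothetical mirrors of the pod status fields shown in the commit message, and `classify`, `waitingFor`, and `deadline` are assumed names; the actual patch may structure this logic differently.

```go
package main

import (
	"fmt"
	"time"
)

// WaitingState and InitContainerStatus are hypothetical, trimmed-down
// mirrors of the pod status fields quoted in the commit message.
type WaitingState struct {
	Reason  string
	Message string
}

type InitContainerStatus struct {
	Name    string
	Waiting *WaitingState // non-nil while the container is in the 'waiting' state
}

// Condition names taken from the commit message.
const (
	VDDKInitImageNotReady    = "VDDKInitImageNotReady"    // advisory
	VDDKInitImageUnavailable = "VDDKInitImageUnavailable" // critical
)

// classify returns the condition to report, or "" when no init
// container is waiting. waitingFor is how long the waiting state has
// persisted; deadline stands in for the active deadline setting.
func classify(statuses []InitContainerStatus, waitingFor, deadline time.Duration) string {
	for _, s := range statuses {
		if s.Waiting == nil {
			continue
		}
		if waitingFor > deadline {
			return VDDKInitImageUnavailable
		}
		return VDDKInitImageNotReady
	}
	return ""
}

func main() {
	statuses := []InitContainerStatus{{
		Name:    "vddk-side-car",
		Waiting: &WaitingState{Reason: "ErrImagePull", Message: "manifest unknown"},
	}}
	// Early in the wait the condition is advisory; past the deadline it is critical.
	fmt.Println(classify(statuses, 30*time.Second, 5*time.Minute))
	fmt.Println(classify(statuses, 10*time.Minute, 5*time.Minute))
}
```

Because the job itself no longer has a deadline, the escalation from advisory to critical happens only in the reported condition; the pull keeps retrying and validation can still complete once the image becomes reachable.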
Commit 56de205