Check for init container failures and InvalidImageName #441
+209
−50
What
Check for init containers that fail.
Check for containers that can't pull because the image name is invalid.
If either of these things happens, acquire and fail the job.
Why
If an init container fails, "app" (post-init) containers do not run. This includes the agent, so nothing reports the failure to BK and the job remains unstarted.
Failing the job instead exposes the failure in the job log and prevents the controller from eventually trying to re-create a potentially doomed job. It also makes init containers more useful: one can be added to the podSpec to check a precondition needed to run a job, and its failure now fails the job.
InvalidImageName does not cause init containers to fail in the same way, but is somewhat similar to ImagePullBackOff. The container tries to start, and k8s continues trying forever, without resulting in an exit. Unlike ImagePullBackOff, we can reasonably assume InvalidImageName has no chance to resolve itself, so failing the BK job ignores the grace period in that case.