Check for init container failures and InvalidImageName #441

Merged
DrJosh9000 merged 1 commit into main from check-init-container-failure on Dec 4, 2024

DrJosh9000
Contributor

@DrJosh9000 DrJosh9000 commented Dec 3, 2024

What

Check for init containers that fail.
Check for containers that can't pull because the image name is invalid.
If either of these things happens, acquire and fail the job.

Why

If an init container fails, "app" (post-init) containers do not run. This includes the agent, so nothing reports the failure to BK and the job remains unstarted.

Failing the job instead exposes the failure in the job log, and prevents the controller from eventually trying to re-create a potentially doomed job. It also makes init containers more useful: adding one to the podSpec becomes a practical way to check a precondition that a job needs before it runs.

InvalidImageName does not cause init containers to fail in the same way, but it is somewhat similar to ImagePullBackOff: the container tries to start, and k8s keeps retrying forever without the container ever exiting. Unlike ImagePullBackOff, we can reasonably assume InvalidImageName has no chance of resolving itself, so in that case the BK job is failed without waiting for the grace period.
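As a rough illustration of the two checks described above, here is a minimal sketch in Go. It is not the code from this PR: `ContainerStatus`, `PodStatus`, and `shouldFailJob` are hypothetical, trimmed-down stand-ins for the Kubernetes pod status types, and the sketch assumes InvalidImageName is checked on both init and regular containers.

```go
package main

import "fmt"

// ContainerStatus is a hypothetical, simplified stand-in for the
// per-container status that Kubernetes reports on a pod.
type ContainerStatus struct {
	Name          string
	WaitingReason string // e.g. "ImagePullBackOff" or "InvalidImageName"; empty if not waiting
	Terminated    bool
	ExitCode      int32
}

// PodStatus is a hypothetical, simplified stand-in for a pod's status.
type PodStatus struct {
	InitContainerStatuses []ContainerStatus
	ContainerStatuses     []ContainerStatus
}

// shouldFailJob reports whether the job should be failed immediately,
// along with a message suitable for the job log.
// ImagePullBackOff is deliberately not handled here: it may resolve
// itself, so it would be subject to a grace period elsewhere.
func shouldFailJob(pod PodStatus) (bool, string) {
	// A failed init container means the app containers (including the
	// agent) never run, so nothing else would report the failure.
	for _, c := range pod.InitContainerStatuses {
		if c.Terminated && c.ExitCode != 0 {
			return true, fmt.Sprintf("init container %q failed with exit code %d", c.Name, c.ExitCode)
		}
	}

	// InvalidImageName cannot resolve itself, so no grace period applies.
	// Assumption: both init and regular containers are checked.
	all := append(append([]ContainerStatus{}, pod.InitContainerStatuses...), pod.ContainerStatuses...)
	for _, c := range all {
		if c.WaitingReason == "InvalidImageName" {
			return true, fmt.Sprintf("container %q has an invalid image name", c.Name)
		}
	}
	return false, ""
}
```

In this sketch, a pod whose init container exited non-zero, or whose container is waiting with reason InvalidImageName, yields `true`; a pod stuck in ImagePullBackOff yields `false` and is left to whatever grace-period handling already exists.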

@DrJosh9000 DrJosh9000 force-pushed the check-init-container-failure branch 5 times, most recently from db9e7f1 to 3b007fd Compare December 3, 2024 02:00
@DrJosh9000 DrJosh9000 marked this pull request as ready for review December 3, 2024 02:03
@DrJosh9000 DrJosh9000 force-pushed the check-init-container-failure branch 2 times, most recently from 8a691d1 to c7f7898 Compare December 3, 2024 07:20
@DrJosh9000 DrJosh9000 changed the title Check for init container failures Check for init container failures and InvalidImageName Dec 3, 2024
@DrJosh9000 DrJosh9000 force-pushed the check-init-container-failure branch 2 times, most recently from f6a025c to 7cdeeb7 Compare December 3, 2024 22:36
@DrJosh9000 DrJosh9000 merged commit 94ae79d into main Dec 4, 2024
1 check passed
@DrJosh9000 DrJosh9000 deleted the check-init-container-failure branch December 4, 2024 02:00