Zarf injector should filter on "running" pods when initializing #2356

Closed
dmiller-boeing opened this issue Mar 4, 2024 · 0 comments · Fixed by #2415

Labels
enhancement ✨ New feature or request

Comments

@dmiller-boeing

Is your feature request related to a problem? Please describe.

Recently, I had a k3s cluster get corrupted by the master of master nodes being deleted. Even the zarf-docker-registry was corrupt, and I lost it and the images it contained. I tried a zarf init, and the injector kept trying to clone pods that were in ImagePullBackOff or similar states. The timeout before it moved on to another pod to clone was pretty long, so it took an enormous amount of time to start the injector pod correctly, plus some manual finagling with taints to land on the node that had the fewest pods in error states.

Describe the solution you'd like

  • Given an existing cluster with many pods running, but also many in error states (like ImagePullBackOff)
  • When running zarf init
  • Then the injector only considers healthy pods in the "running" state as candidates to clone, and filters out all others
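
A minimal sketch of the requested filter, for illustration only. This is not Zarf's actual injector code: the `Pod` struct, the `runningPods` helper, and the sample pod names and images are all hypothetical stand-ins. The phase strings match the values Kubernetes reports in `status.phase`. Note that a pod stuck in ImagePullBackOff normally still reports the `Pending` phase, so a filter on `Running` would skip it.

```go
package main

import "fmt"

// PodPhase mirrors the string values Kubernetes reports in pod.status.phase.
type PodPhase string

const (
	PodPending   PodPhase = "Pending"
	PodRunning   PodPhase = "Running"
	PodSucceeded PodPhase = "Succeeded"
	PodFailed    PodPhase = "Failed"
	PodUnknown   PodPhase = "Unknown"
)

// Pod is a hypothetical, trimmed-down stand-in for a Kubernetes pod object.
type Pod struct {
	Name   string
	Node   string
	Phase  PodPhase
	Images []string
}

// runningPods (hypothetical helper) returns only the pods whose phase is
// Running, preserving their original order.
func runningPods(pods []Pod) []Pod {
	var out []Pod
	for _, p := range pods {
		if p.Phase == PodRunning {
			out = append(out, p)
		}
	}
	return out
}

func main() {
	// Made-up sample data: one healthy pod and one that cannot pull its image.
	pods := []Pod{
		{Name: "healthy-pod", Node: "node-a", Phase: PodRunning, Images: []string{"example/image-a:1"}},
		{Name: "broken-pod", Node: "node-b", Phase: PodPending, Images: []string{"example/image-b:1"}},
	}
	for _, p := range runningPods(pods) {
		fmt.Println(p.Name, p.Node, p.Images)
	}
}
```

With a filter like this applied before choosing an image, the injector would never wait out the timeout on a pod whose image is unavailable on its node.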

Describe alternatives you've considered

An alternative might be to let the user specify which pod and/or node to clone from via an environment variable or --set.

Additional context

The timeout for an injector pod to reach the "running" state could also be lowered or made overridable via an env var or --set.

@dmiller-boeing dmiller-boeing added the enhancement ✨ New feature or request label Mar 4, 2024
Noxsios added a commit that referenced this issue Apr 15, 2024
…2415)

## Description
filter on running pods when finding an image for injector pod


https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#pod-phase

Description of the `Running` pod phase:

> The Pod has been bound to a node, and all of the containers have been
created. At least one container is still running, or is in the process
of starting or restarting.

## Related Issue
Fixes #2356
Fixes #2410

## Type of change

- [x] Bug fix (non-breaking change which fixes an issue)

## Checklist before merging

- [x] Test, docs, adr added or updated as needed
- [x] [Contributor Guide
Steps](https://github.com/defenseunicorns/zarf/blob/main/CONTRIBUTING.md#developer-workflow)
followed

---------

Co-authored-by: Austin Abro <37223396+AustinAbro321@users.noreply.github.com>
Co-authored-by: razzle <razzle@defenseunicorns.com>