"Faults" view should show all Terminating pods #2738
Comments
The reason is that a pod being in Terminating does not necessarily mean its containers are not ready, and container readiness is what k9s uses to classify faulty pods. You can see that here: Lines 168 to 177 in be1ec87
When an error is returned, it is propagated to k9s/internal/model1/table_data.go, Line 211 in be1ec87
In my experience, we had an EC2 instance fail, causing services to go down. The pods were Terminating, but they did not show in the Faults view because the container statuses were still Ready. So I agree with @akatch. The majority of the time the container-ready/container-total metric works, but there are cases where it doesn't apply. In addition, if we could just see all terminating pods, it would help reveal pods that are stuck in Terminating. We've dealt with that problem extensively and had to search for them manually because they are filtered out of the Faults view.
Referenced by a Renovate Bot MR that updates derailed/k9s from v0.32.5 to v0.32.7. The v0.32.6 release notes included in that MR list this issue, #2738 ("Faults" view should show all Terminating pods), under Resolved Issues, and list #2935 (fix: show all terminating pods in Faults view) under Contributed MRs.
Describe the bug
Enabling the "Toggle Faults" view shows some Terminating pods, but not all. Enabling this view should display all Terminating pods (and indeed all pods not in a Running and Ready state). However, it is unclear why some pods show up as Terminating in this view, but others do not. I did some brief digging in the code and it is not entirely clear how k9s determines which pods are considered faulty - it's possible that some Terminating pods meet these criteria but not all.
Further investigation shows that some Terminating pods with Events such as Node Not Ready (which I would absolutely 100% expect to show up in Faults) do not show up in the Faults view. This is the case in the screenshots below.
To Reproduce
Steps to reproduce the behavior:
1. Open :pods [namespace] where many pods are Terminating
2. Toggle the Faults view (ctrl+z by default)
Expected behavior
All pods not in a Running/Ready state should appear when Faults view is enabled.
Screenshots
I have had to heavily sanitize these but hopefully they help demonstrate the issue.
A view of all pods, in particular many that are Terminating
The same namespace captured moments later in the Faults view. No Terminating pods are seen.
Versions