CI install/upgrade timing out waiting on pending pods #253

Closed
mjnagel opened this issue Mar 13, 2024 · 0 comments · Fixed by #426
Labels
ci Issues pertaining to CI / Pipelines / Testing

mjnagel commented Mar 13, 2024

Describe what should be investigated or refactored

Periodically in CI, in the job that runs all tests, the run hangs or times out while deploying a package. The timeout most commonly occurs on Grafana or the Prometheus stack. Pending pods can generally be observed in the debug output, e.g.: https://github.com/defenseunicorns/uds-core/actions/runs/8268437284/job/22621379639?pr=210#step:9:55

We should identify where/why this is happening to make our CI more robust and less prone to flaky failures.

Additional context

Several possible causes have been proposed, but so far debugging hasn't yielded anything concrete:

  • k8s resource issues (typically would show in events, haven't seen this)
  • pvc binding issues (typically would show in events, haven't seen this)
  • "host system" resource issues (the runner actually hitting max cpu/memory)
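
One way to narrow these down is to pull the scheduler's own explanation for each Pending pod instead of scanning events by hand. The sketch below is only an illustration, not part of the existing CI: it assumes the debug step can capture the output of `kubectl get pods -A -o json`, and the helper name `pending_pods` and the sample pod are hypothetical. It reads the `PodScheduled` condition from each pod's status, which is where Kubernetes records an `Unschedulable` reason and message.

```python
import json


def pending_pods(pod_list):
    """Return (namespace, name, reason, message) for every Pending pod.

    `pod_list` is the parsed JSON of `kubectl get pods -A -o json`.
    Reason and message come from the PodScheduled condition when the
    scheduler has marked the pod as not scheduled; otherwise they are
    empty strings.
    """
    rows = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        if status.get("phase") != "Pending":
            continue
        meta = pod.get("metadata", {})
        reason, message = "", ""
        for cond in status.get("conditions", []):
            if cond.get("type") == "PodScheduled" and cond.get("status") == "False":
                reason = cond.get("reason", "")
                message = cond.get("message", "")
        rows.append((meta.get("namespace", ""), meta.get("name", ""), reason, message))
    return rows


# Made-up sample input for illustration only (not taken from a real CI run).
sample = json.loads("""
{"items": [
  {"metadata": {"namespace": "grafana", "name": "grafana-0"},
   "status": {"phase": "Pending",
              "conditions": [{"type": "PodScheduled", "status": "False",
                              "reason": "Unschedulable",
                              "message": "0/1 nodes are available: 1 Insufficient memory."}]}},
  {"metadata": {"namespace": "istio-system", "name": "istiod-0"},
   "status": {"phase": "Running"}}
]}
""")

for row in pending_pods(sample):
    print("\t".join(row))
```

In a real run the input would come from the cluster rather than the inline sample, for example by passing `json.load(sys.stdin)` to the function and piping the `kubectl` output into the script. A Pending pod with an empty reason has already been scheduled, which would point away from cluster-level resource limits and toward something on the node or runner.
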
@rjferguson21 rjferguson21 self-assigned this Mar 14, 2024
@mjnagel mjnagel added the ci Issues pertaining to CI / Pipelines / Testing label Mar 28, 2024
@rjferguson21 rjferguson21 mentioned this issue May 23, 2024
5 tasks
rjferguson21 added a commit that referenced this issue May 23, 2024
## Description
Testing CI with upgraded k3s

## Related Issue

Fixes #253

## Type of change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Other (security config, docs update, etc)

## Checklist before merging

- [ ] Test, docs, adr added or updated as needed
- [ ] [Contributor Guide Steps](https://github.com/defenseunicorns/uds-template-capability/blob/main/CONTRIBUTING.md#submitting-a-pull-request) followed
rjferguson21 added a commit that referenced this issue Jul 11, 2024
## Description
Testing CI with upgraded k3s

## Related Issue

Fixes #253

## Type of change

- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Other (security config, docs update, etc)

## Checklist before merging

- [ ] Test, docs, adr added or updated as needed
- [ ] [Contributor Guide Steps](https://github.com/defenseunicorns/uds-template-capability/blob/main/CONTRIBUTING.md#submitting-a-pull-request) followed