
Monitor runners in case of insufficient resources #251

Open
yorugac opened this issue Jul 20, 2023 · 5 comments

@yorugac
Collaborator

yorugac commented Jul 20, 2023

Feature Description

When one of the runners does not have sufficient resources allocated for the test, it goes into an OOM state (insufficient memory for the VUs; other types of errors can occur in the same situation as well). This condition is not monitored by the operator in any way, resulting in an infinite wait loop for the pods to bootstrap.

This case should be monitored by the operator, and the test run should then be aborted.
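For illustration, a minimal sketch (not the operator's actual code) of how such a check could look with client-go; the `k6_cr`/`runner` label selector and the `runnerPodsFailed` helper are assumptions:

```go
// Hypothetical sketch: detect runner pods that were OOM-killed or otherwise
// failed, so the operator can abort instead of waiting forever.
package monitor

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// runnerPodsFailed reports whether any runner pod of the given test run has
// failed outright or has a container terminated with OOMKilled.
func runnerPodsFailed(ctx context.Context, c kubernetes.Interface, ns, testName string) (bool, string, error) {
	pods, err := c.CoreV1().Pods(ns).List(ctx, metav1.ListOptions{
		// Assumed labels on runner pods; adjust to however the pods are labeled.
		LabelSelector: fmt.Sprintf("k6_cr=%s,runner=true", testName),
	})
	if err != nil {
		return false, "", err
	}
	for _, pod := range pods.Items {
		if pod.Status.Phase == corev1.PodFailed {
			return true, fmt.Sprintf("pod %s failed: %s", pod.Name, pod.Status.Reason), nil
		}
		for _, cs := range pod.Status.ContainerStatuses {
			if t := cs.State.Terminated; t != nil && t.Reason == "OOMKilled" {
				return true, fmt.Sprintf("pod %s: container %s was OOMKilled", pod.Name, cs.Name), nil
			}
		}
	}
	return false, "", nil
}
```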

Suggested Solution (optional)

Initial experiments show that there are two loops that can become infinite in such cases: at stage = "created" and at stage = "started".

Note that test runs in different modes need to handle this case differently; see the sketch below.
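As one possible shape of a fix, a sketch of bounding those wait loops with a deadline; the timeout constant and the `checkStageDeadline` helper are assumptions, and the real reconciler's stage handling differs:

```go
// Hypothetical sketch: bound the "created"/"started" wait loops with a
// deadline so the reconciler aborts the test run instead of requeueing forever.
package monitor

import (
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

// bootstrapTimeout is an assumed default; it would likely need to be
// user-configurable, since different modes and cluster sizes bootstrap differently.
const bootstrapTimeout = 5 * time.Minute

// checkStageDeadline returns the reconcile result and whether to keep waiting,
// given the current stage and the time the test run entered it.
func checkStageDeadline(stage string, enteredStageAt time.Time) (ctrl.Result, bool) {
	if (stage == "created" || stage == "started") && time.Since(enteredStageAt) > bootstrapTimeout {
		// Deadline exceeded: the caller should move the test run to an
		// error/aborted stage and clean up, rather than requeue again.
		return ctrl.Result{}, false
	}
	// Still within the deadline: poll again shortly.
	return ctrl.Result{RequeueAfter: 5 * time.Second}, true
}
```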

Already existing or connected issues / PRs (optional)

Potentially connected issue: #222

yorugac added the enhancement (New feature or request) and PLZ labels Jul 20, 2023
yorugac self-assigned this Jul 20, 2023
@na--
Member

na-- commented Jul 20, 2023

Some of the same considerations I mentioned about setup() and teardown() in #223 (comment) may also apply here 🤔 Though maybe not entirely, since for the best UX, I imagine it would be best to rely on both k6 and k8s for error handling 🤔

@freevatar

freevatar commented Jul 27, 2023

I believe we're encountering this infinite loop issue in version v0.0.10rc3.

We have hard limits set for K8S namespaces (CPU/memory/max number of pods). If a test setup violates these limits, it results in an infinite loop. For instance, if someone sets parallelism to a number that exceeds the max-number-of-pods policy, the scheduled runner pods end up stuck in an infinite "running" state loop.
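(For illustration: quota rejections like this surface as FailedCreate events in the namespace, which the operator could watch for instead of waiting. A hypothetical sketch; the event reason and "exceeded quota" message matching rely on standard Kubernetes quota-admission behavior, not on anything the operator does today:)

```go
// Hypothetical sketch: scan namespace events for FailedCreate caused by a
// ResourceQuota, so the operator can abort rather than wait on pods that
// will never be scheduled.
package monitor

import (
	"context"
	"strings"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

// quotaRejections collects event messages showing that pod creation was
// forbidden by a quota (e.g. max pod count, CPU, or memory limits).
func quotaRejections(ctx context.Context, c kubernetes.Interface, ns string) ([]string, error) {
	events, err := c.CoreV1().Events(ns).List(ctx, metav1.ListOptions{
		FieldSelector: "reason=FailedCreate", // emitted by the Job controller on rejection
	})
	if err != nil {
		return nil, err
	}
	var msgs []string
	for _, ev := range events.Items {
		// Standard quota admission errors contain "exceeded quota".
		if strings.Contains(ev.Message, "exceeded quota") {
			msgs = append(msgs, ev.Message)
		}
	}
	return msgs, nil
}
```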

@yorugac
Collaborator Author

yorugac commented Jul 28, 2023

@freevatar thanks! Your case is a "perfect" example of this problem. One thing I'd like to clarify: since you pointed out the version, did you not encounter this problem in previous versions, like v0.0.10rc2, etc.?

@yorugac
Collaborator Author

yorugac commented Jul 28, 2023

@na-- I've missed your comments 🤦 thank you! But yes, this particular case is more about "Kubernetes-level" UX than about k6. Either way, it's on my TODO list to go through your distributed-execution updates in the k6 repo - I'll comment then 👍

@freevatar

freevatar commented Jul 28, 2023

@yorugac

did you not encounter this problem in previous versions, like v0.0.10rc2, etc.?

Sorry for the confusion; what I meant is that we tested the latest available version as well.
