[BUG] Operator doesn't manage the metric exporter sidecar #72
Comments
Hello @eduardchernomaz, Problem explanation: In any case, until I push the fix, I have provided two simple workarounds to use in the meantime. Workaround:
More info on the proposed fix (still under investigation): the idea is to extend the Operator to manage containers as a secondary resource of the custom resource. That is, after the Operator creates the main cluster resources, it needs to register a reconciler to manage the secondary resources created by k8s itself. With that in place, the Operator can start reacting to events coming from specific containers within specific pods. From a security perspective, I also need to investigate whether such a solution requires any additional cluster privileges.
I wonder if we can also just add a |
It is an interesting idea, and one that makes a lot of sense. I will dig into it and see what is needed to put it in place.
After some investigation, I am now assessing whether the liveness probe can be used to achieve the desired effect without marking the job as "error".
Possible implementation approach: change the metrics_exporter liveness probe to ping the locust container every 10 seconds and, on failure, send a curl request to quitequitequite.
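The control flow of that approach can be sketched roughly as follows. This is only an illustration of the logic, not the Operator's code: in a real pod the periodic check would be configured through the probe's `periodSeconds` field rather than a Python loop, and `watch_and_shutdown`, `is_alive`, and `request_shutdown` are hypothetical names standing in for the ping to the locust container and the curl call mentioned above.

```python
import time
from typing import Callable


def watch_and_shutdown(
    is_alive: Callable[[], bool],
    request_shutdown: Callable[[], None],
    period_seconds: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll `is_alive` every `period_seconds`; on the first failure,
    call `request_shutdown` once and stop.

    Both callables are hypothetical placeholders: `is_alive` stands in for
    the ping to the locust container, `request_shutdown` for the curl call
    that asks the exporter to exit.

    Returns the number of successful checks seen before the failure.
    """
    successful_checks = 0
    while is_alive():
        successful_checks += 1
        sleep(period_seconds)
    # The locust container no longer answers: ask the exporter to stop
    # so the pod can terminate and the job can complete.
    request_shutdown()
    return successful_checks
```

Passing the check and the shutdown action in as callables keeps the sketch independent of any particular endpoint or HTTP client, which are still open questions in this thread.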
This will be solved with the fix for #50.
Describe the bug
After the CRD has been applied and the test has run for the specified duration, the process does not finish cleanly: the worker job completes, while the master job continues to run.
To Reproduce
Steps to reproduce the behavior:
1. Apply a `LocustTest` manifest to start the test.
Expected behavior
Once the test has completed, both the worker and the master pods should be in a `Completed` state and eventually removed.
Screenshots
Pods status after the test has completed.
Jobs status after the test has completed.
Additional context
I suspect that the problem is that on the master node, the `locust-metrics-exporter` container never stops and continues to run, so job completion is never signaled.
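The suspicion is consistent with how Kubernetes treats pods: a pod only reaches the Succeeded phase once all of its containers have terminated successfully, so one sidecar that keeps running holds the whole job open. The toy model below illustrates this rule. It is a simplified sketch, not Kubernetes code; the `Container` class, the `pod_succeeded` function, and the container name `locust` are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Container:
    name: str
    # None means the container is still running (hypothetical simplification
    # of the real container status object).
    exit_code: Optional[int]


def pod_succeeded(containers: List[Container]) -> bool:
    """A pod counts as succeeded only if every container exited with code 0."""
    return all(c.exit_code == 0 for c in containers)


# Master pod: the load generator has exited, the exporter sidecar has not.
master = [
    Container("locust", exit_code=0),
    Container("locust-metrics-exporter", exit_code=None),
]
# Worker pod: a single container that has exited cleanly.
worker = [Container("locust", exit_code=0)]

print(pod_succeeded(master))  # False: the sidecar is still running
print(pod_succeeded(worker))  # True
```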