Remove starter job in favor of built-in implementation #87

Open
yorugac opened this issue Dec 8, 2021 · 8 comments

Comments

@yorugac
Collaborator

yorugac commented Dec 8, 2021

Problem

The starter currently relies on the REST API of k6 to start the runners. It is a workable solution that allows synchronizing the start of the test, but it requires scheduling a separate pod, which can be time-consuming and/or inefficient depending on the setup and the test run. Additionally, it is a limiting approach for very large tests.

Suggested solution

The starting functionality is built into k6 itself, even if just for a one-instance test: it is used by the `k6 resume` CLI command of the externally-controlled executor. So k6-operator could reuse the existing k6 code base and implement a synchronized start of the test directly in the TestRun controller. To achieve synchronization, the implementation should rely on goroutines.
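
To illustrate, here is a minimal sketch of what such a built-in start could look like. This is not actual operator code: `startRunners` and `resumeRunner` are hypothetical names, the hostnames are assumed to be gathered the same way they are passed to the starter job today, and the JSON body follows the k6 REST API status format.

```go
package controllers

import (
	"bytes"
	"context"
	"fmt"
	"net/http"

	"golang.org/x/sync/errgroup"
)

// resumeRunner sends the same PATCH /v1/status request that the starter
// job's curl command sends today, unpausing a single runner.
func resumeRunner(ctx context.Context, hostname string) error {
	body := []byte(`{"data":{"type":"status","id":"default","attributes":{"paused":false}}}`)
	url := fmt.Sprintf("http://%s:6565/v1/status", hostname)
	req, err := http.NewRequestWithContext(ctx, http.MethodPatch, url, bytes.NewReader(body))
	if err != nil {
		return err
	}
	req.Header.Set("Content-Type", "application/json")
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected response from %s: %s", hostname, resp.Status)
	}
	return nil
}

// startRunners unpauses all runners near-simultaneously, one goroutine per
// runner; the first error cancels the remaining requests.
func startRunners(ctx context.Context, hostnames []string) error {
	g, ctx := errgroup.WithContext(ctx)
	for _, hostname := range hostnames {
		hostname := hostname // capture the loop variable (pre-Go 1.22)
		g.Go(func() error {
			return resumeRunner(ctx, hostname)
		})
	}
	return g.Wait()
}
```

An errgroup keeps the start-or-fail semantics simple; a plain sync.WaitGroup would work too if partial failures should be collected instead of aborting.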

This solution resolves all the issues described above. But there are cons as well:

  • This solution implies deprecating the `.spec.starter` option as unnecessary, so it's potentially breaking. Request: if you rely on `.spec.starter` in your TestRuns, please let us know!
  • This solution puts more pressure on the operator itself than there is now. It'd be good to consider how it will scale.

Additional context

Additional pros:

k6 issue about REST API to keep an eye on:
grafana/k6#3497

@simskij
Contributor

simskij commented Dec 26, 2021

Just thought I'd offer some background, as we (read: I) deliberately opted for the HTTP REST API over direct pod invocation.

While using `k6 resume` would work, it would require higher privileges for the starter pod, which I'm not really sure is a great idea. Allowing the starter pod to execute HTTP requests is far less likely to cause issues than giving a pod the privileges to execute arbitrary shell commands in the load generators.

@yorugac
Collaborator Author

yorugac commented Jan 5, 2022

Hi @simskij, thank you for providing additional background and input 🙂

I don't think I'm following here: what do you mean by higher privileges? Internally, `k6 resume` is nothing but a wrapper around a REST call.
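
Roughly, the command amounts to something like the following (a simplified sketch of the CLI's internals from memory; the exact client API in go.k6.io/k6 may differ):

```go
package main

import (
	"context"
	"log"

	v1 "go.k6.io/k6/api/v1"
	"go.k6.io/k6/api/v1/client"
	"gopkg.in/guregu/null.v3"
)

func main() {
	// Connect to the REST API of a running (paused) k6 instance...
	c, err := client.New("localhost:6565")
	if err != nil {
		log.Fatal(err)
	}
	// ...and flip `paused` to false, i.e. PATCH /v1/status.
	if _, err := c.SetStatus(context.Background(), v1.Status{
		Paused: null.BoolFrom(false),
	}); err != nil {
		log.Fatal(err)
	}
}
```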

@simskij
Contributor

simskij commented Jan 5, 2022

My bad! I was interpreting your intention as calling `k6 resume` inside the runners using the programmatic equivalent of `kubectl exec`, which would have required the starter to have execution privileges in the runner pods.

@yorugac
Collaborator Author

yorugac commented Jan 5, 2022

Yes, that'd be quite a different scenario 😄 Thanks for clarifying 👍 I'll update the description a bit to be more precise.

@yorugac
Collaborator Author

yorugac commented Jan 6, 2022

This might not be fully viable after all 😞 given cases like #89

@frittentheke
Contributor

frittentheke commented Aug 12, 2024

@yorugac @simskij

While I totally understand and appreciate the synchronization to start all k6 instances (parallel > 1), I am wondering why the operator does not simply issue those HTTP calls to the REST API endpoints natively. Currently, https://github.com/grafana/k6-operator/blob/40dce3ffc13c55fb665c41b891c1459d3ad66068/controllers/k6_start.go#L93 creates a starter job:

```go
starter := jobs.NewStarterJob(k6, hostnames)
```

which in the end just runs:

```go
parts = append(parts, fmt.Sprintf("curl --retry 3 -X PATCH -H 'Content-Type: application/json' http://%s:6565/v1/status -d '%s'", hostname, req))
```

It just seems rather complex to spin up another job / pod to then just issue a few HTTP calls via curl.

Especially if you compare this to the setup phase:

```go
if err := runSetup(ctx, hostnames, log); err != nil {
```

which goes through https://github.com/grafana/k6-operator/blob/main/controllers/common.go#L231:

```go
func RunSetup(ctx context.Context, hostname string) (_ json.RawMessage, err error) {
```

and is already using the golang k6 client. Aligning this would also follow the idea of this issue: to make the communication with k6 a little less freestyle.

Also, closer tracking of the health and test progress of k6 pods coming up would likely be something that could be added once the k6 REST API is used.
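
For instance, a hypothetical `runStart` counterpart to `RunSetup` might look roughly like this (a sketch under the assumption that the same k6 Go API client can be reused here; the function name and error handling are illustrative, not actual operator code):

```go
package controllers

import (
	"context"
	"fmt"

	v1 "go.k6.io/k6/api/v1"
	k6client "go.k6.io/k6/api/v1/client"
	"gopkg.in/guregu/null.v3"
)

// runStart unpauses one runner via the k6 Go API client, replacing the
// starter job's curl call with a native client call, similar to what
// RunSetup already does for the setup phase.
func runStart(ctx context.Context, hostname string) error {
	c, err := k6client.New(fmt.Sprintf("%s:6565", hostname))
	if err != nil {
		return fmt.Errorf("could not create k6 client for %s: %w", hostname, err)
	}
	if _, err := c.SetStatus(ctx, v1.Status{Paused: null.BoolFrom(false)}); err != nil {
		return fmt.Errorf("could not resume %s: %w", hostname, err)
	}
	return nil
}
```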

@yorugac
Collaborator Author

yorugac commented Aug 20, 2024

> I am wondering why the operator does not simply issue those HTTP calls to the REST API endpoints natively? It just seems rather complex to spin up another job / pod to then just issue a few HTTP calls via curl.

Given the discussion above, I believe this is what this issue is basically about at this point 😄 I should update the description to reflect that.

This issue wasn't prioritized so far simply because nobody seemed to care about it, until recently. So thank you very much for sharing your opinion, @frittentheke!

But also, do any existing users actually depend on the starter pod? One can specify a starter in TestRun, so if anyone does that, changing this behaviour now would mean breaking existing workflows.
If anyone uses `.spec.starter` in a TestRun, please comment here!

For completeness' sake, on this comment above: currently I don't think this will be an issue, as no other such cases came up, AFAIR... And the "waiting for services to be up" part does not need to change; only the logic after that will be impacted.

yorugac changed the title from "Switch from REST API to k6 CLI in starter job" to "Remove starter job in favor of built-in implementation" on Aug 20, 2024
@yorugac
Collaborator Author

yorugac commented Aug 20, 2024

Updated the issue description to bring it up to date.

cc @ivanape I believe this issue is related to what we discussed before 🙂
