Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

k6runner: use check context for http request #715

Merged
merged 2 commits into from
Jun 11, 2024
Merged

k6runner: use check context for http request #715

merged 2 commits into from
Jun 11, 2024

Conversation

roobre
Copy link
Collaborator

@roobre roobre commented May 30, 2024

This PR instructs the HTTP-backed k6 runner (Grafana Cloud) to use the context carrying the script timeout as the request context for HTTP requests made to that k6 runner. An extra second of grace time is added to account for network latency is added to the check timeout.

This fixes a potential deadlock where the agent would wait for the HTTP request to complete forever, even longer than the maximum time the script is allowed to run. This does not occur frequently, but could happen on certain scenarios such as when GC runners are abruptly terminated.

Note that this is a stopgap measure to prevent deadlocks in the agent. Ideally, we should have a separate settings for the time we wait for the runner to respond, and the time that the runner allows k6 to wait for. See details about the long-term proposal here: https://docs.google.com/document/d/1kq5xC4izHyJ9WOonS308J8sE3uDG48c6gj4ZPHsWIx4/edit.

@roobre roobre requested a review from a team as a code owner May 30, 2024 14:32
// The context above carries the check timeout, which will be eventually passed to k6 by the runner at the other end
// of this request. To account for network overhead, we add a second of grace time to the script timeout to form
// the timeout of this request.
reqCtx, cancel := context.WithTimeout(ctx, k6Timeout+time.Second)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does this actually work? The parent context ctx will timeout first, causing the child context to timeout, too, and ignore the extra second...

Copy link
Collaborator Author

@roobre roobre May 31, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you're right, I was thinking "if this is not timing out before it won't now" but that does not make any sense... 🤦🏻‍♀️

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've worked around this by creating a different context entirely: 409666a

@roobre roobre requested a review from mem May 31, 2024 09:00
@peterschretlen peterschretlen added bug Something isn't working GA-blocker This issue needs to be fixed prior to feature becoming GA and removed bug Something isn't working labels Jun 6, 2024
@mem mem merged commit 18992df into main Jun 11, 2024
4 checks passed
@mem mem deleted the k6-http-timeout branch June 11, 2024 00:22
The-9880 added a commit that referenced this pull request Jun 11, 2024
* Chore(deps): Bump golang.org/x/net from 0.24.0 to 0.25.0
* Chore(deps): Bump the prometheus-go group across 1 directory with 2 updates
* Chore(deps): Bump github.com/mccutchen/go-httpbin/v2
* Update to grafana-build-tools v0.11.0 (#705)
* Remove adhoc and traceroute feature flags. (#707)
* Chore(deps): Bump github.com/KimMachineGun/automemlimit
* checks/test: make timer big enough for context cancel to be picked up
* Fix: Interpolate variables into MultiHTTP request bodies (#713)
* Chore(deps): Bump github.com/prometheus/prometheus (#716)
* Chore(deps): Bump github.com/rs/zerolog from 1.32.0 to 1.33.0
* --- updated-dependencies: - dependency-name: kernel.org/pub/linux/libs/security/libcap/cap   dependency-type: direct:production   update-type: version-update:semver-patch ...
* MultiHttp script should decode the payload as a string and replace the variable placeholders with the values. The result should be assigned to the request body. (#717)
* Build(deps): Bump golang.org/x/net from 0.25.0 to 0.26.0
* Enable K6 by default in agent deployments (#722)
* Fix: deprecate --features and warn user (#726)
* k6runner: use check context for http request (#715)

Signed-off-by: Anant Sharma <anant.sharma@grafana.com>
@The-9880 The-9880 mentioned this pull request Jun 11, 2024
The-9880 added a commit that referenced this pull request Jun 13, 2024
* Chore(deps): Bump golang.org/x/net from 0.24.0 to 0.25.0
* Chore(deps): Bump the prometheus-go group across 1 directory with 2 updates
* Chore(deps): Bump github.com/mccutchen/go-httpbin/v2
* Update to grafana-build-tools v0.11.0 (#705)
* Remove adhoc and traceroute feature flags. (#707)
* Chore(deps): Bump github.com/KimMachineGun/automemlimit
* checks/test: make timer big enough for context cancel to be picked up
* Fix: Interpolate variables into MultiHTTP request bodies (#713)
* Chore(deps): Bump github.com/prometheus/prometheus (#716)
* Chore(deps): Bump github.com/rs/zerolog from 1.32.0 to 1.33.0
* --- updated-dependencies: - dependency-name: kernel.org/pub/linux/libs/security/libcap/cap   dependency-type: direct:production   update-type: version-update:semver-patch ...
* MultiHttp script should decode the payload as a string and replace the variable placeholders with the values. The result should be assigned to the request body. (#717)
* Build(deps): Bump golang.org/x/net from 0.25.0 to 0.26.0
* Enable K6 by default in agent deployments (#722)
* Fix: deprecate --features and warn user (#726)
* k6runner: use check context for http request (#715)

Signed-off-by: Anant Sharma <anant.sharma@grafana.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
GA-blocker This issue needs to be fixed prior to feature becoming GA
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants