HTTP2 error under load. #1956

vishvajeet-patil · 2022-10-14T16:29:28Z

Describe the bug
When the router is put under some load, we get the error {fetch_error:hyper::Error(Http2, Error { kind: GoAway(b"", NO_ERROR, Remote) })}target:apollo_router::services::subgraph_service}. This implies that the connection was closed by the subgraph. The router does not retry the request, so the request fails with an error.

To Reproduce
Steps to reproduce the behavior:

1. Set up any subgraph service with HTTP/2 support.
2. Put some load on the service.
3. The error above should eventually be logged.

Expected behavior
The error should not happen in the first place, and if the HTTP/2 connection is closed, the router should retry the request automatically.

Output
Error log -
{fetch_error:hyper::Error(Http2, Error { kind: GoAway(b"", NO_ERROR, Remote) })}target:apollo_router::services::subgraph_service}

Desktop (please complete the following information):

OS: Docker container (Base image Debian:bullseye-slim)

Additional context
The GO_AWAY frame error happens when the subgraph closes the HTTP/2 connection gracefully (why would that happen?). Even when it does happen, the router should retry the requests that were active on that connection.
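To make the requested behavior concrete, here is a minimal sketch of "retry when the connection was closed gracefully". It uses only the standard library. `FetchError`, `is_graceful_go_away` and `send_with_retry` are hypothetical names invented for illustration; they are not the router's or hyper's real types, and this is not necessarily how the router ended up implementing retries. It also assumes the request is safe to send twice (for example a query rather than a mutation).

```rust
/// Hypothetical stand-in for the error surfaced by a subgraph fetch.
/// Assumption: the real error exposes whether the peer sent GOAWAY
/// and whether the error code was NO_ERROR.
#[derive(Debug, Clone, PartialEq)]
enum FetchError {
    GoAway { no_error: bool, remote: bool },
    Other(String),
}

/// True only for a GOAWAY sent by the remote peer with code NO_ERROR,
/// which is the case reported in this issue.
fn is_graceful_go_away(err: &FetchError) -> bool {
    matches!(err, FetchError::GoAway { no_error: true, remote: true })
}

/// Calls `send`, retrying at most `max_retries` times, and only when the
/// failure was a graceful GOAWAY. Any other result is returned unchanged.
fn send_with_retry<T>(
    max_retries: u32,
    mut send: impl FnMut() -> Result<T, FetchError>,
) -> Result<T, FetchError> {
    let mut retries = 0;
    loop {
        match send() {
            Err(err) if is_graceful_go_away(&err) && retries < max_retries => {
                // The peer closed the connection cleanly; try again,
                // presumably on a fresh connection.
                retries += 1;
            }
            other => return other,
        }
    }
}
```

Bounding the number of retries keeps a subgraph that keeps closing connections from causing an endless loop.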


Geal · 2022-10-19T13:43:48Z

Could you describe in a bit more detail what the expected behavior would be here? Request retries are a tricky feature, especially under load.
Here, what's probably happening is that the subgraph is sending the GO_AWAY frame because it is overloaded, so retrying requests against that same subgraph while new requests keep arriving would make things worse.

Until now we have avoided implementing infrastructure-related features such as load balancing, circuit breaking or retries in the router, because they are better addressed at the service mesh layer. They could still be done in the router, but they require some thought to be done properly.
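One common way to keep retries from amplifying load is a retry budget: retries are only allowed as a bounded fraction of regular traffic. The sketch below is a hedged illustration of that idea using only the standard library. `RetryBudget` and its methods are hypothetical names, and the numbers are arbitrary; nothing here is taken from the router's code.

```rust
/// Hypothetical retry budget. Every regular request earns one credit,
/// and a retry costs `requests_per_retry` credits, so retries can never
/// exceed roughly 1/requests_per_retry of the regular traffic.
struct RetryBudget {
    credits: u32,
    requests_per_retry: u32,
    max_credits: u32,
}

impl RetryBudget {
    fn new(requests_per_retry: u32, max_credits: u32) -> Self {
        RetryBudget {
            credits: 0,
            // Guard against a zero cost, which would allow unlimited retries.
            requests_per_retry: requests_per_retry.max(1),
            max_credits,
        }
    }

    /// Record one regular (non-retry) request. Credits are capped so a
    /// long quiet period cannot bank an unbounded burst of retries.
    fn record_request(&mut self) {
        self.credits = self.credits.saturating_add(1).min(self.max_credits);
    }

    /// Returns true and spends credits if a retry is currently allowed.
    fn try_retry(&mut self) -> bool {
        if self.credits >= self.requests_per_retry {
            self.credits -= self.requests_per_retry;
            true
        } else {
            false
        }
    }
}
```

With `RetryBudget::new(10, 100)`, at most one retry is permitted per ten regular requests, and no more than ten retries can be banked. When a subgraph is overloaded and most requests fail, the budget runs dry and further failures are returned to the caller instead of being retried.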

vishvajeet-patil · 2022-10-19T19:56:48Z

@Geal But GO_AWAY with NO_ERROR implies a graceful shutdown, doesn't it? Will the subgraph send this error when it is under load?

oliverjumpertz · 2022-10-20T06:32:49Z

To add to this:

This currently also happens to us.

Especially in a Kubernetes context, where nginx is the primary choice of ingress, this happens regularly.

nginx shuts its http2 connections down every 1000 streams (default) to free memory allocations it couldn't free earlier.

This basically means that for every 1000 requests (Router and also other callers), a connection drops.

In this case, hyper cannot gracefully handle the drop of a connection while a request is in flight, and this error happens.

See here: hyperium/hyper#2500

Additionally see this section of the nginx documentation: https://nginx.org/en/docs/http/ngx_http_core_module.html#keepalive_requests
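For context on why an in-flight request can be caught by such a shutdown: in the HTTP/2 specification, a GOAWAY frame carries a last stream identifier, and streams with a higher identifier were not processed by the sender, so they can be retried on a new connection. The sketch below only illustrates that rule. `partition_on_go_away` is a hypothetical helper, and whether hyper exposes enough information to apply the rule is not something this thread establishes.

```rust
/// Splits the client's in-flight stream ids after a GOAWAY into
/// (possibly processed by the peer, safe to replay on a new connection).
/// Streams with an id above `last_stream_id` were not acted on by the peer.
fn partition_on_go_away(in_flight: &[u32], last_stream_id: u32) -> (Vec<u32>, Vec<u32>) {
    in_flight
        .iter()
        .copied()
        .partition(|&id| id <= last_stream_id)
}
```

For example, with in-flight streams 1995, 1997, 1999 and 2001 and a GOAWAY whose last stream id is 1999, only stream 2001 is known to be untouched and can be replayed without risk of double processing.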

vishvajeet-patil added raised by user triage labels Oct 14, 2022

vishvajeet-patil changed the title ~~HTTP2 error in rust gateway~~ HTTP2 error in rust router Oct 14, 2022

vishvajeet-patil changed the title ~~HTTP2 error in rust router~~ HTTP2 error in rust router under load. Oct 15, 2022

abernix changed the title ~~HTTP2 error in rust router under load.~~ HTTP2 error under load. Oct 17, 2022

Geal mentioned this issue Oct 26, 2022

subgraph retry policy and circuit breaking #338

Closed

Geal mentioned this issue Nov 18, 2022

Request retries for subgraph queries #2006

Merged

abernix removed the triage label Nov 18, 2022

Geal closed this as completed in #2006 Nov 24, 2022

BrynCooke modified the milestones: v1-NEXT, v1.5.0 Dec 2, 2022

BrynCooke assigned vishvajeet-patil and Geal and unassigned vishvajeet-patil Dec 3, 2022

garypen added this to the v1.5.0 milestone Dec 5, 2022

BrynCooke modified the milestone: v1.5.0 Dec 5, 2022

BrynCooke mentioned this issue Dec 5, 2022

Release 1.5.0 #2208

Merged

This was referenced Jan 18, 2023

Router blocks and doesn't handle any requests after being hit with load beyond a threshold #2377

Closed

hyper's connection pool only opens one TCP connection for HTTP/2 #2063

Open
