DEADLINE_EXCEEDED not throwing an error and hangs indefinitely #576

Closed
RichieAHB opened this issue Apr 5, 2023 · 7 comments
Labels
bug Something isn't working

Comments

@RichieAHB

Describe the bug

I'm calling a gRPC service written in Java, and when passing an invalid argument to the service, the service will hang until the deadline. On the backend for the client I'm using connect-node. The channel is configured like so:

const DYNAMO_CLIENT = createPromiseClient(
  DynamoService,
  createGrpcTransport({
    baseUrl: process.env.BACKEND_URL,
    httpVersion: '2',
    useBinaryFormat: true,
  }),
);

I'm then calling a unary method on the service like so:

try {
  const response = await DYNAMO_CLIENT.listTargetServices(
    {
      targetServiceDiscoveryName: decodeURIComponent(target as string),
    },
    { timeoutMs: 10000 },
  );
  console.log(response);
} catch (e) {
  console.error('Error listing target services', e);
}

With this setup the call above just hangs indefinitely. I can see the server is responding with DEADLINE_EXCEEDED, because when I call the same method with grpcurl like so:

grpcurl \
  --plaintext \
  -d '{ "target_service_discovery_name": "shasdfasdf" }' \
  -max-time 10 \
  -H 'service: dynamo' \
  localhost:5990 \
  richieahb.dynamo.v1.DynamoService/ListTargetServices

I get a response after 10 seconds:

ERROR:
  Code: DeadlineExceeded
  Message: context deadline exceeded

I've also added breakpoints into the Java and tested locally and can see it's calling onError in the StreamObserver used for sending the responses to the client.

To Reproduce

If you encountered an error message, please copy and paste it verbatim.
If the bug is specific to an RPC or payload, please provide a reduced
example.

Environment (please complete the following information):

  • @bufbuild/connect-web version: (for example, 0.1.0)
  • @bufbuild/connect-node version: (for example, 0.8.5)
  • Frontend framework and version: (for example, next@12.1.0) (although this call is from the server)
  • Node.js version: (for example, 18.15.0)
  • Browser and version: (for example, Google Chrome 111.0.5563.146)

If your problem is specific to bundling, please also provide the following information:

  • Bundler and version: (for example, webpack@5.74.0)
  • Bundler plugins and version: (for example compression-webpack-plugin@10.0.0)
  • ~Bundler configuration file: ~

Additional context
N/A

@RichieAHB added the bug label on Apr 5, 2023
@RichieAHB
Author

I have a suspicion it may be the same as #463 but not sure. It looks like the stream is closing on the server but this isn't getting picked up for whatever reason.

@timostamm
Member

Thank you for the report, Richard. Yes, this looks very similar to #463, and may have the same root cause. We understand that this is a blocking issue and have it very high on our list.

@RichieAHB
Author

Thanks for the response. We've got around this for now by fixing the issue that was causing the timeout on the server, but that's not the most sustainable solution! Happy to try to help with a reproducible example next week if needs be, but I think all the info is there to reproduce it.

@timostamm
Member

@RichieAHB, a reproducible example would actually be very helpful. I have not been able to reproduce this against connect-go and grpc-go so far.

@RichieAHB
Author

Apologies, I've not had time to make a repro case for this, but I thought I'd share some more details about the specific call this seems to happen on.

Specifically, the case where I've noticed this failing is when the Node promise client has called the server (say server A), which has then called another server (B). In my specific case, server A was unable to reach server B, and the call timed out before A ever managed to reach B (although I'm not sure whether the same would have happened if A had opened a connection with B, but B just failed to respond within the deadline).

I still plan to try and make a repro but not sure when I'll be able to find time so thought I'd share.

@timostamm
Member

Thank you for the details @RichieAHB. I am not sure that server B is relevant, as the Node.js client is only directly talking to server A.

The important question is what exactly this server is doing on the wire. It could respond with a gRPC error status in HTTP trailers, HTTP headers, or close the HTTP/2 stream with one of the error codes of the RST_STREAM frame. In the latest release v0.9.0, we make sure that all three situations are handled properly, and we fixed an issue in #619 that may be related.
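To make the three cases above concrete, here is a minimal sketch of how a client-side observation could be sorted into them. `classifyGrpcOutcome`, `Observed`, and `Outcome` are hypothetical names made up for illustration; this is not connect-node's code. It assumes you have already captured the response headers, the trailers, and the stream's reset code yourself (for example from the `response` and `trailers` events and the `rstCode` property of a Node.js `http2` client stream).

```typescript
// Hypothetical diagnostic helper, not part of connect-node.
type HeaderMap = Record<string, string | string[] | undefined>;

interface Observed {
  headers?: HeaderMap; // HTTP/2 response headers, if any arrived
  trailers?: HeaderMap; // HTTP/2 trailers, if any arrived
  rstCode?: number; // RST_STREAM error code, if the stream was closed
}

type Outcome =
  | { kind: "status-in-trailers"; grpcStatus: number }
  | { kind: "status-in-headers"; grpcStatus: number }
  | { kind: "rst-stream"; rstCode: number }
  | { kind: "no-status" };

// Reads the numeric grpc-status field from a header block, if present.
function readStatus(map: HeaderMap | undefined): number | undefined {
  const raw = map?.["grpc-status"];
  const value = Array.isArray(raw) ? raw[0] : raw;
  if (value === undefined) return undefined;
  const parsed = Number.parseInt(value, 10);
  return Number.isNaN(parsed) ? undefined : parsed;
}

function classifyGrpcOutcome(observed: Observed): Outcome {
  // Case 1: the usual gRPC response, with the status in HTTP trailers.
  const inTrailers = readStatus(observed.trailers);
  if (inTrailers !== undefined) {
    return { kind: "status-in-trailers", grpcStatus: inTrailers };
  }
  // Case 2: a "trailers-only" response, with the status in HTTP headers.
  const inHeaders = readStatus(observed.headers);
  if (inHeaders !== undefined) {
    return { kind: "status-in-headers", grpcStatus: inHeaders };
  }
  // Case 3: the stream was reset. Code 0 is NO_ERROR, so it is not
  // treated as an error signal here.
  if (observed.rstCode !== undefined && observed.rstCode !== 0) {
    return { kind: "rst-stream", rstCode: observed.rstCode };
  }
  // None of the three: the client has nothing to turn into an error.
  return { kind: "no-status" };
}
```

For reference, gRPC status 4 is DEADLINE_EXCEEDED and RST_STREAM code 8 is CANCEL. A `no-status` result is the situation where a client that does not enforce the deadline itself has nothing to react to.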

Could you give it a try with the latest release?

@timostamm
Member

@RichieAHB, we'll need a reproducible case to continue here. v0.10.0 adds support for basic keep-alive, which would help in case the connection dies before the deadline is reached.

Regardless of keep-alive, v0.9.0 also added client-side support for deadlines - the call is canceled on the client when the deadline is reached, regardless of what the server does, so you should never see the request hanging past the deadline in any case.
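As an illustration of what a client-side deadline means, here is a minimal sketch of the general pattern: race the call against a timer and abort the call when the timer fires. `withDeadline` and `DeadlineExceededError` are hypothetical names, and this is an assumption about the general technique, not connect-node's actual implementation.

```typescript
// Hypothetical helper for illustration; not connect-node's implementation.
class DeadlineExceededError extends Error {
  constructor(timeoutMs: number) {
    super(`deadline of ${timeoutMs}ms exceeded`);
    this.name = "DeadlineExceededError";
  }
}

async function withDeadline<T>(
  timeoutMs: number,
  call: (signal: AbortSignal) => Promise<T>,
): Promise<T> {
  const controller = new AbortController();
  let timer: ReturnType<typeof setTimeout> | undefined;
  const deadline = new Promise<never>((_, reject) => {
    timer = setTimeout(() => {
      // Reject first so the caller sees the deadline error, then ask the
      // underlying call to stop.
      reject(new DeadlineExceededError(timeoutMs));
      controller.abort();
    }, timeoutMs);
  });
  try {
    // Whichever settles first wins, so the caller never waits longer than
    // timeoutMs, regardless of what the server does.
    return await Promise.race([call(controller.signal), deadline]);
  } finally {
    // Stop the timer so it does not keep the process alive.
    clearTimeout(timer);
  }
}
```

On a release without client-side deadlines, a wrapper like this could be used as a stopgap by passing the signal through the call options, for example `withDeadline(10000, (signal) => DYNAMO_CLIENT.listTargetServices(request, { signal }))`, where `request` stands in for the message from the original report.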

Closing this, but please let us know if you have any issues with the latest release.
