-
Notifications
You must be signed in to change notification settings - Fork 204
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Return transport errors to the caller #399
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome, thanks so much! Just a couple of comments :)
@tikue Rebased and extended the test to all cases (I tried to compress it into one tabular test not to cause bloat). Please take anoter look. |
tarpc/src/client.rs
Outdated
let (tx, mut rx) = oneshot::channel(); | ||
let resp = send_request(&mut channel, "hi", tx, &mut rx).await; | ||
assert!(dispatch.as_mut().poll(cx).is_pending()); | ||
for (error, cause) in vec![ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ugh, I'm sorry about this, I just realized I had left my comment about test coverage on the wrong enum. This test, which is testing RpcError::Send, was totally fine as you had it before.
What I meant to say, but failed to, was that RpcError had some other variants added in this PR, and those new variants could use test coverage—are they returned in the proper circumstances? For example, Disconnected was replaced by Shutdown and Receive errors.
The additional test cases I had in mind:
- RpcError::Shutdown is returned when the dispatch task has shut down, preventing the client from:
- initiating an RPC
- completing an RPC
- An RPC should be completed with RpcError::Receive if the dispatch task receives an error when attempting to receive a message from the transport. Failing to receive is a permanent error, which shouldn't change, but the dispatch task should still send the more specific Receive error to all pending RPCs (of which there could be a few).
I've added comments to the specific places in the code that correspond to these test cases. I actually think RpcError::Receive is unused in this PR, so you can remove it if you'd like, and I can file an issue to add it back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No worries @tikue , it happens. I think now I added what you had in mind. I don't think it makes sense to delete what I already added though? It seems useful for the future and more tests usually don't hurt, WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good to me, though I think these are really 5 different test cases (should be different #[test]
fns). It's the match expression that leads me to think that — each error variant has different test logic.
I like the test coverage, but would prefer to put the test cases in different fns.
Previously, InFlightRequests required the client response type to be a server response. However, this prevented injection of non-server responses: for example, if the client fails to send a request, it should complete the request with an IO error rather than a server error.
Previously, a client channel would immediately disconnect when encountering an error in Transport::try_send. One kind of error that can occur in try_send is message validation, e.g. validating a message is not larger than a configured frame size. The problem with shutting down the client immediately is that debuggability suffers: it can be hard to understand what caused the client to fail. Also, these errors are not always fatal, as with frame size limits, so complete shutdown was extreme. By bubbling up errors, it's now possible for the caller to programmatically handle them. For example, the error could be walked via anyhow::Error: 2023-01-10T02:49:32.528939Z WARN client: the client failed to send the request Caused by: 0: could not write to the transport 1: frame size too big
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just 2 minor things and then I think we're good to merge :)
tarpc/src/client.rs
Outdated
let (tx, mut rx) = oneshot::channel(); | ||
let resp = send_request(&mut channel, "hi", tx, &mut rx).await; | ||
assert!(dispatch.as_mut().poll(cx).is_pending()); | ||
for (error, cause) in vec![ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That sounds good to me, though I think these are really 5 different test cases (should be different #[test]
fns). It's the match expression that leads me to think that — each error variant has different test logic.
I like the test coverage, but would prefer to put the test cases in different fns.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for making this great improvement!
@tikue thanks you for very helpful guidance. May I ask for a release with this change so we can include it in our project? |
I published v0.32.0 immediately after merging this :) |
* Make client::InFlightRequests generic over result. Previously, InFlightRequests required the client response type to be a server response. However, this prevented injection of non-server responses: for example, if the client fails to send a request, it should complete the request with an IO error rather than a server error. * Gracefully handle client-side send errors. Previously, a client channel would immediately disconnect when encountering an error in Transport::try_send. One kind of error that can occur in try_send is message validation, e.g. validating a message is not larger than a configured frame size. The problem with shutting down the client immediately is that debuggability suffers: it can be hard to understand what caused the client to fail. Also, these errors are not always fatal, as with frame size limits, so complete shutdown was extreme. By bubbling up errors, it's now possible for the caller to programmatically handle them. For example, the error could be walked via anyhow::Error: ``` 2023-01-10T02:49:32.528939Z WARN client: the client failed to send the request Caused by: 0: could not write to the transport 1: frame size too big ``` * Some follow-up work: right now, read errors will bubble up to all pending RPCs. However, on the write side, only `start_send` bubbles up. `poll_ready`, `poll_flush`, and `poll_close` do not propagate back to pending RPCs. This is probably okay in most circumstances, because fatal write errors likely coincide with fatal read errors, which *do* propagate back to clients. But it might still be worth unifying this logic. --------- Co-authored-by: Tim Kuehn <tikue@google.com>
* Make client::InFlightRequests generic over result. Previously, InFlightRequests required the client response type to be a server response. However, this prevented injection of non-server responses: for example, if the client fails to send a request, it should complete the request with an IO error rather than a server error. * Gracefully handle client-side send errors. Previously, a client channel would immediately disconnect when encountering an error in Transport::try_send. One kind of error that can occur in try_send is message validation, e.g. validating a message is not larger than a configured frame size. The problem with shutting down the client immediately is that debuggability suffers: it can be hard to understand what caused the client to fail. Also, these errors are not always fatal, as with frame size limits, so complete shutdown was extreme. By bubbling up errors, it's now possible for the caller to programmatically handle them. For example, the error could be walked via anyhow::Error: ``` 2023-01-10T02:49:32.528939Z WARN client: the client failed to send the request Caused by: 0: could not write to the transport 1: frame size too big ``` * Some follow-up work: right now, read errors will bubble up to all pending RPCs. However, on the write side, only `start_send` bubbles up. `poll_ready`, `poll_flush`, and `poll_close` do not propagate back to pending RPCs. This is probably okay in most circumstances, because fatal write errors likely coincide with fatal read errors, which *do* propagate back to clients. But it might still be worth unifying this logic. --------- Co-authored-by: Tim Kuehn <tikue@google.com>
* Make client::InFlightRequests generic over result. Previously, InFlightRequests required the client response type to be a server response. However, this prevented injection of non-server responses: for example, if the client fails to send a request, it should complete the request with an IO error rather than a server error. * Gracefully handle client-side send errors. Previously, a client channel would immediately disconnect when encountering an error in Transport::try_send. One kind of error that can occur in try_send is message validation, e.g. validating a message is not larger than a configured frame size. The problem with shutting down the client immediately is that debuggability suffers: it can be hard to understand what caused the client to fail. Also, these errors are not always fatal, as with frame size limits, so complete shutdown was extreme. By bubbling up errors, it's now possible for the caller to programmatically handle them. For example, the error could be walked via anyhow::Error: ``` 2023-01-10T02:49:32.528939Z WARN client: the client failed to send the request Caused by: 0: could not write to the transport 1: frame size too big ``` * Some follow-up work: right now, read errors will bubble up to all pending RPCs. However, on the write side, only `start_send` bubbles up. `poll_ready`, `poll_flush`, and `poll_close` do not propagate back to pending RPCs. This is probably okay in most circumstances, because fatal write errors likely coincide with fatal read errors, which *do* propagate back to clients. But it might still be worth unifying this logic. --------- Co-authored-by: Tim Kuehn <tikue@google.com>
resolves #389
This is just cherry picking relevant commits from https://github.com/tikue/tarpc/tree/gats and adding a test with a mock transport that errors out on sending.