Automatically Retry Retriable RPC failures to CSI Plugins #6863
Comments
This has been done for most of the attachment flow in recent PRs using grpc-retry middleware on a per-request basis.
Just for tracking my work to audit the calls we're making, here's the list of interfaces, whether they're "ok" as is, and whether I've "done" them yet (checked off in the PR I'll open). For some of the calls where we're already making gRPC retries, the spec has language about which things we can retry and whether we need to modify the request first; I'm going to re-audit those as part of this, to make sure we're not assuming our early work was 100% correct without checking.
Edit: also added a field for #7278
Controller client RPCs
Node client RPCs
Identity client RPCs
The CSI Specification defines various gRPC Errors and how they may be retried. After auditing all our CSI RPC calls in #6863, this changeset:
* adds retries and backoffs to the RPCs where they were needed but not implemented
* annotates those CSI RPCs that do not need retries, so that we don't wonder whether they were left off accidentally
* adds a timeout and cancellation context to the `Probe` call, which didn't have one.
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
The CSI Specification defines various gRPC Errors and how they may be retried. As part of our work to implement support for CSI, we should audit the CSI Calls that we require and implement a reasonable backoff strategy and automatic retries where possible.