From 40ff9f00272a21af0b4f46766c68af7fb91e6554 Mon Sep 17 00:00:00 2001
From: Stepan Rakitin
Date: Fri, 8 Apr 2022 17:25:27 +0200
Subject: [PATCH 1/3] Retry RESOURCE_EXHAUSTED only if the server can recover

---
 specification/protocol/otlp.md | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/specification/protocol/otlp.md b/specification/protocol/otlp.md
index 4634937faa4..3b25ba6fdee 100644
--- a/specification/protocol/otlp.md
+++ b/specification/protocol/otlp.md
@@ -223,7 +223,7 @@ not-retryable according to the following table:
 |ALREADY_EXISTS|No|
 |PERMISSION_DENIED|No|
 |UNAUTHENTICATED|No|
-|RESOURCE_EXHAUSTED|Yes|
+|RESOURCE_EXHAUSTED|Only if the server can recover (see below)|
 |FAILED_PRECONDITION|No|
 |ABORTED|Yes|
 |OUT_OF_RANGE|Yes|
@@ -236,6 +236,8 @@ When retrying, the client SHOULD implement an exponential backoff strategy. An
 exception to this is the Throttling case explained below, which provides
 explicit instructions about retrying interval.
 
+The client SHOULD interpret `RESOURCE_EXHAUSTED` code as retryable only if the server signals backpressure to indicate a possible recovery.
+
 #### OTLP/gRPC Throttling
 
 OTLP allows backpressure signalling.

From e8f1fa4f37b015a0c7b607fdc8257d4b744452bb Mon Sep 17 00:00:00 2001
From: Stepan Rakitin
Date: Fri, 8 Apr 2022 18:16:33 +0200
Subject: [PATCH 2/3] specify implementation details

---
 CHANGELOG.md                   | 2 ++
 specification/protocol/otlp.md | 2 +-
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 5a9b68363b1..2eed259d177 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -23,6 +23,8 @@ release.
 
 ### OpenTelemetry Protocol
 
+- Specify that OTLP/gRPC clients should retry on `RESOURCE_EXHAUSTED` code only if the server signals backpressure to indicate a possible recovery. ([#2480](https://github.com/open-telemetry/opentelemetry-specification/pull/2480))
+
 ### SDK Configuration
 
 ### Telemetry Schemas
diff --git a/specification/protocol/otlp.md b/specification/protocol/otlp.md
index 3b25ba6fdee..e03bd21710e 100644
--- a/specification/protocol/otlp.md
+++ b/specification/protocol/otlp.md
@@ -236,7 +236,7 @@ When retrying, the client SHOULD implement an exponential backoff strategy. An
 exception to this is the Throttling case explained below, which provides
 explicit instructions about retrying interval.
 
-The client SHOULD interpret `RESOURCE_EXHAUSTED` code as retryable only if the server signals backpressure to indicate a possible recovery.
+The client SHOULD interpret `RESOURCE_EXHAUSTED` code as retryable only if the server supplies [RetryInfo](https://github.com/googleapis/googleapis/blob/6a8c7914d1b79bd832b5157a09a9332e8cbd16d4/google/rpc/error_details.proto#L40) via [status details](https://godoc.org/google.golang.org/grpc/status#Status.WithDetails) to signal backpressure and indicate a possible recovery. For details, see [OTLP/gRPC Throttling](#otlpgrpc-throttling).
 
 #### OTLP/gRPC Throttling
 

From f1bc9beda0d89029371963a1726d8e76106ae248 Mon Sep 17 00:00:00 2001
From: Stepan Rakitin
Date: Fri, 8 Apr 2022 18:24:41 +0200
Subject: [PATCH 3/3] Use wording from a PR comment

See: https://github.com/open-telemetry/opentelemetry-specification/pull/2480#discussion_r846287681
---
 specification/protocol/otlp.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/specification/protocol/otlp.md b/specification/protocol/otlp.md
index e03bd21710e..2868340aec4 100644
--- a/specification/protocol/otlp.md
+++ b/specification/protocol/otlp.md
@@ -236,7 +236,10 @@ When retrying, the client SHOULD implement an exponential backoff strategy. An
 exception to this is the Throttling case explained below, which provides
 explicit instructions about retrying interval.
 
-The client SHOULD interpret `RESOURCE_EXHAUSTED` code as retryable only if the server supplies [RetryInfo](https://github.com/googleapis/googleapis/blob/6a8c7914d1b79bd832b5157a09a9332e8cbd16d4/google/rpc/error_details.proto#L40) via [status details](https://godoc.org/google.golang.org/grpc/status#Status.WithDetails) to signal backpressure and indicate a possible recovery. For details, see [OTLP/gRPC Throttling](#otlpgrpc-throttling).
+The client SHOULD interpret `RESOURCE_EXHAUSTED` code as retryable only if the server signals that the recovery from resource exhaustion is possible. This is signalled by the server by returning [a status](https://godoc.org/google.golang.org/grpc/status#Status.WithDetails)
+containing
+[RetryInfo](https://github.com/googleapis/googleapis/blob/6a8c7914d1b79bd832b5157a09a9332e8cbd16d4/google/rpc/error_details.proto#L40). In this case the behavior of the server and the client is exactly as described in [OTLP/gRPC Throttling](#otlpgrpc-throttling) section.
+If no such status is returned then the `RESOURCE_EXHAUSTED` code SHOULD be treated as non-retryable.
 
 #### OTLP/gRPC Throttling
 
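
Reviewer note: the client-side rule that the final patch settles on could be sketched roughly as below. This is only an illustration, not code from the PR or from any OpenTelemetry SDK. `Status`, `RetryInfo`, `find_retry_info` and `should_retry` are hypothetical stand-ins (a real client would read `google.rpc.RetryInfo` out of the gRPC status details), and the lookup table covers only the rows visible in the hunk above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class RetryInfo:
    """Hypothetical stand-in for google.rpc.RetryInfo (which carries a retry delay)."""
    retry_delay_seconds: float


@dataclass(frozen=True)
class Status:
    """Hypothetical stand-in for a gRPC status: a code name plus its detail messages."""
    code: str
    details: Tuple[object, ...] = ()


# Only the table rows shown in the diff hunk; the rest of the table is omitted here.
RETRYABLE = {
    "ALREADY_EXISTS": False,
    "PERMISSION_DENIED": False,
    "UNAUTHENTICATED": False,
    "FAILED_PRECONDITION": False,
    "ABORTED": True,
    "OUT_OF_RANGE": True,
}


def find_retry_info(status: Status) -> Optional[RetryInfo]:
    """Return the first RetryInfo detail attached to the status, if any."""
    return next((d for d in status.details if isinstance(d, RetryInfo)), None)


def should_retry(status: Status) -> bool:
    """Decide retryability for the codes covered by this sketch."""
    if status.code == "RESOURCE_EXHAUSTED":
        # Retryable only when the server attached RetryInfo, which signals
        # that recovery from resource exhaustion is possible.
        return find_retry_info(status) is not None
    if status.code not in RETRYABLE:
        raise ValueError(f"code {status.code!r} is not covered by this sketch")
    return RETRYABLE[status.code]
```

For example, `should_retry(Status("RESOURCE_EXHAUSTED", (RetryInfo(5.0),)))` is true, while the same code with no details is treated as non-retryable, matching the last added line of the patch.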