Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Otlp] Add disk retry enablement #5527

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions src/OpenTelemetry.Exporter.OpenTelemetryProtocol/CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,6 +13,17 @@
`parent_is_remote` information.
([#5563](https://github.com/open-telemetry/opentelemetry-dotnet/pull/5563))

* Introduced experimental support for automatically retrying export to the otlp
endpoint by storing the telemetry offline during transient network errors.
Users can enable this feature by setting the
`OTEL_DOTNET_EXPERIMENTAL_OTLP_RETRY` environment variable to `disk`. The
default path where the telemetry is stored is obtained by calling
[Path.GetTempPath()](https://learn.microsoft.com/dotnet/api/system.io.path.gettemppath)
or can be customized by setting
`OTEL_DOTNET_EXPERIMENTAL_OTLP_DISK_RETRY_DIRECTORY_PATH` environment
variable.
([#5527](https://github.com/open-telemetry/opentelemetry-dotnet/pull/5527))

## 1.8.1

Released 2024-Apr-17
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,8 @@ internal sealed class ExperimentalOptions

public const string OtlpRetryEnvVar = "OTEL_DOTNET_EXPERIMENTAL_OTLP_RETRY";

public const string OtlpDiskRetryDirectoryPathEnvVar = "OTEL_DOTNET_EXPERIMENTAL_OTLP_DISK_RETRY_DIRECTORY_PATH";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If an app is designed to run in a platform-independent environment and wants to use a temporary location, the restriction to set the environment variable OTEL_DOTNET_EXPERIMENTAL_OTLP_DISK_RETRY_DIRECTORY_PATH can make things complex. Should we write to the temporary path if this environment variable is not set? We could rely on .NET API to get temp path Path.GetTempPath. Customers will use OTEL_DOTNET_EXPERIMENTAL_OTLP_RETRY to enable offline storage anyway. We could explain the behavior in the documentation along with this environment variable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me. Its more simple for users who want minimal configuration for trying out this feature.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated


public ExperimentalOptions()
: this(new ConfigurationBuilder().AddEnvironmentVariables().Build())
{
Expand All @@ -29,9 +31,29 @@ public ExperimentalOptions(IConfiguration configuration)
this.EmitLogEventAttributes = emitLogEventAttributes;
}

if (configuration.TryGetStringValue(OtlpRetryEnvVar, out var retryPolicy) && retryPolicy != null && retryPolicy.Equals("in_memory", StringComparison.OrdinalIgnoreCase))
if (configuration.TryGetStringValue(OtlpRetryEnvVar, out var retryPolicy) && retryPolicy != null)
{
this.EnableInMemoryRetry = true;
if (retryPolicy.Equals("in_memory", StringComparison.OrdinalIgnoreCase))
{
this.EnableInMemoryRetry = true;
}
else if (retryPolicy.Equals("disk", StringComparison.OrdinalIgnoreCase))
{
this.EnableDiskRetry = true;
if (configuration.TryGetStringValue(OtlpDiskRetryDirectoryPathEnvVar, out var path) && path != null)
{
this.DiskRetryDirectoryPath = path;
}
else
{
// Fallback to temp location.
this.DiskRetryDirectoryPath = Path.GetTempPath();
}
}
else
{
throw new NotSupportedException($"Retry Policy '{retryPolicy}' is not supported.");
}
}
}

Expand All @@ -48,4 +70,14 @@ public ExperimentalOptions(IConfiguration configuration)
/// href="https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/exporter.md#retry"/>.
/// </remarks>
public bool EnableInMemoryRetry { get; }

/// <summary>
/// Gets a value indicating whether or not retry via disk should be enabled for transient errors.
/// </summary>
public bool EnableDiskRetry { get; }

/// <summary>
/// Gets the path on disk where the telemetry will be stored for retries at a later point.
/// </summary>
public string? DiskRetryDirectoryPath { get; }
}
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,8 @@
#if NETSTANDARD2_1 || NET6_0_OR_GREATER
using Grpc.Net.Client;
#endif
using System.Diagnostics;
using Google.Protobuf;
using OpenTelemetry.Exporter.OpenTelemetryProtocol.Implementation.Transmission;
using LogOtlpCollector = OpenTelemetry.Proto.Collector.Logs.V1;
using MetricsOtlpCollector = OpenTelemetry.Proto.Collector.Metrics.V1;
Expand Down Expand Up @@ -100,9 +102,29 @@ public static THeaders GetHeaders<THeaders>(this OtlpExporterOptions options, Ac
? httpTraceExportClient.HttpClient.Timeout.TotalMilliseconds
: options.TimeoutMilliseconds;

return experimentalOptions.EnableInMemoryRetry
? new OtlpExporterRetryTransmissionHandler<TraceOtlpCollector.ExportTraceServiceRequest>(exportClient, timeoutMilliseconds)
: new OtlpExporterTransmissionHandler<TraceOtlpCollector.ExportTraceServiceRequest>(exportClient, timeoutMilliseconds);
if (experimentalOptions.EnableInMemoryRetry)
{
return new OtlpExporterRetryTransmissionHandler<TraceOtlpCollector.ExportTraceServiceRequest>(exportClient, timeoutMilliseconds);
}
else if (experimentalOptions.EnableDiskRetry)
{
vishweshbankwar marked this conversation as resolved.
Show resolved Hide resolved
Debug.Assert(!string.IsNullOrEmpty(experimentalOptions.DiskRetryDirectoryPath), $"{nameof(experimentalOptions.DiskRetryDirectoryPath)} is null or empty");

return new OtlpExporterPersistentStorageTransmissionHandler<TraceOtlpCollector.ExportTraceServiceRequest>(
exportClient,
timeoutMilliseconds,
(byte[] data) =>
{
var request = new TraceOtlpCollector.ExportTraceServiceRequest();
request.MergeFrom(data);
return request;
},
Path.Combine(experimentalOptions.DiskRetryDirectoryPath, "traces"));
}
else
{
return new OtlpExporterTransmissionHandler<TraceOtlpCollector.ExportTraceServiceRequest>(exportClient, timeoutMilliseconds);
}
}

public static OtlpExporterTransmissionHandler<MetricsOtlpCollector.ExportMetricsServiceRequest> GetMetricsExportTransmissionHandler(this OtlpExporterOptions options, ExperimentalOptions experimentalOptions)
Expand All @@ -116,9 +138,29 @@ public static THeaders GetHeaders<THeaders>(this OtlpExporterOptions options, Ac
? httpMetricsExportClient.HttpClient.Timeout.TotalMilliseconds
: options.TimeoutMilliseconds;

return experimentalOptions.EnableInMemoryRetry
? new OtlpExporterRetryTransmissionHandler<MetricsOtlpCollector.ExportMetricsServiceRequest>(exportClient, timeoutMilliseconds)
: new OtlpExporterTransmissionHandler<MetricsOtlpCollector.ExportMetricsServiceRequest>(exportClient, timeoutMilliseconds);
if (experimentalOptions.EnableInMemoryRetry)
{
return new OtlpExporterRetryTransmissionHandler<MetricsOtlpCollector.ExportMetricsServiceRequest>(exportClient, timeoutMilliseconds);
}
else if (experimentalOptions.EnableDiskRetry)
{
Debug.Assert(!string.IsNullOrEmpty(experimentalOptions.DiskRetryDirectoryPath), $"{nameof(experimentalOptions.DiskRetryDirectoryPath)} is null or empty");

return new OtlpExporterPersistentStorageTransmissionHandler<MetricsOtlpCollector.ExportMetricsServiceRequest>(
exportClient,
timeoutMilliseconds,
(byte[] data) =>
{
var request = new MetricsOtlpCollector.ExportMetricsServiceRequest();
request.MergeFrom(data);
return request;
},
Path.Combine(experimentalOptions.DiskRetryDirectoryPath, "metrics"));
}
else
{
return new OtlpExporterTransmissionHandler<MetricsOtlpCollector.ExportMetricsServiceRequest>(exportClient, timeoutMilliseconds);
}
}

public static OtlpExporterTransmissionHandler<LogOtlpCollector.ExportLogsServiceRequest> GetLogsExportTransmissionHandler(this OtlpExporterOptions options, ExperimentalOptions experimentalOptions)
Expand All @@ -128,9 +170,29 @@ public static THeaders GetHeaders<THeaders>(this OtlpExporterOptions options, Ac
? httpLogExportClient.HttpClient.Timeout.TotalMilliseconds
: options.TimeoutMilliseconds;

return experimentalOptions.EnableInMemoryRetry
? new OtlpExporterRetryTransmissionHandler<LogOtlpCollector.ExportLogsServiceRequest>(exportClient, timeoutMilliseconds)
: new OtlpExporterTransmissionHandler<LogOtlpCollector.ExportLogsServiceRequest>(exportClient, timeoutMilliseconds);
if (experimentalOptions.EnableInMemoryRetry)
{
return new OtlpExporterRetryTransmissionHandler<LogOtlpCollector.ExportLogsServiceRequest>(exportClient, timeoutMilliseconds);
}
else if (experimentalOptions.EnableDiskRetry)
{
Debug.Assert(!string.IsNullOrEmpty(experimentalOptions.DiskRetryDirectoryPath), $"{nameof(experimentalOptions.DiskRetryDirectoryPath)} is null or empty");

return new OtlpExporterPersistentStorageTransmissionHandler<LogOtlpCollector.ExportLogsServiceRequest>(
exportClient,
timeoutMilliseconds,
(byte[] data) =>
{
var request = new LogOtlpCollector.ExportLogsServiceRequest();
request.MergeFrom(data);
return request;
},
Path.Combine(experimentalOptions.DiskRetryDirectoryPath, "logs"));
}
else
{
return new OtlpExporterTransmissionHandler<LogOtlpCollector.ExportLogsServiceRequest>(exportClient, timeoutMilliseconds);
}
}

public static IExportClient<TraceOtlpCollector.ExportTraceServiceRequest> GetTraceExportClient(this OtlpExporterOptions options) =>
Expand Down
11 changes: 9 additions & 2 deletions src/OpenTelemetry.Exporter.OpenTelemetryProtocol/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -627,10 +627,17 @@ want to solicit feedback from the community.

* `OTEL_DOTNET_EXPERIMENTAL_OTLP_RETRY`

When set to `in_memory`, it enables in-memory retry for transient errors
* When set to `in_memory`, it enables in-memory retry for transient errors
encountered while sending telemetry.

Added in `1.8.0`.
Added in `1.8.0`.

* When set to `disk` along with setting
`OTEL_DOTNET_EXPERIMENTAL_OTLP_DISK_RETRY_DIRECTORY_PATH` to the path on
disk, it enables retries by storing telemetry on disk during transient
errors.

Added in **TBD** (Unreleased).

* Logs

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -4,6 +4,7 @@
#if NETFRAMEWORK
using System.Net.Http;
#endif
using Microsoft.Extensions.Configuration;
using OpenTelemetry.Exporter.OpenTelemetryProtocol.Implementation;
using OpenTelemetry.Exporter.OpenTelemetryProtocol.Implementation.ExportClient;
using OpenTelemetry.Exporter.OpenTelemetryProtocol.Implementation.Transmission;
Expand Down Expand Up @@ -130,16 +131,34 @@ public void AppendPathIfNotPresent_TracesPath_AppendsCorrectly(string inputUri,
}

[Theory]
[InlineData(OtlpExportProtocol.Grpc, typeof(OtlpGrpcTraceExportClient), false, 10000)]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpTraceExportClient), false, 10000)]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpTraceExportClient), true, 8000)]
[InlineData(OtlpExportProtocol.Grpc, typeof(OtlpGrpcMetricsExportClient), false, 10000)]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpMetricsExportClient), false, 10000)]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpMetricsExportClient), true, 8000)]
[InlineData(OtlpExportProtocol.Grpc, typeof(OtlpGrpcLogExportClient), false, 10000)]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpLogExportClient), false, 10000)]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpLogExportClient), true, 8000)]
public void GetTransmissionHandler_InitializesCorrectExportClientAndTimeoutValue(OtlpExportProtocol protocol, Type exportClientType, bool customHttpClient, int expectedTimeoutMilliseconds)
[InlineData(OtlpExportProtocol.Grpc, typeof(OtlpGrpcTraceExportClient), false, 10000, null)]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpTraceExportClient), false, 10000, null)]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpTraceExportClient), true, 8000, null)]
[InlineData(OtlpExportProtocol.Grpc, typeof(OtlpGrpcMetricsExportClient), false, 10000, null)]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpMetricsExportClient), false, 10000, null)]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpMetricsExportClient), true, 8000, null)]
[InlineData(OtlpExportProtocol.Grpc, typeof(OtlpGrpcLogExportClient), false, 10000, null)]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpLogExportClient), false, 10000, null)]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpLogExportClient), true, 8000, null)]
[InlineData(OtlpExportProtocol.Grpc, typeof(OtlpGrpcTraceExportClient), false, 10000, "in_memory")]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpTraceExportClient), false, 10000, "in_memory")]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpTraceExportClient), true, 8000, "in_memory")]
[InlineData(OtlpExportProtocol.Grpc, typeof(OtlpGrpcMetricsExportClient), false, 10000, "in_memory")]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpMetricsExportClient), false, 10000, "in_memory")]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpMetricsExportClient), true, 8000, "in_memory")]
[InlineData(OtlpExportProtocol.Grpc, typeof(OtlpGrpcLogExportClient), false, 10000, "in_memory")]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpLogExportClient), false, 10000, "in_memory")]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpLogExportClient), true, 8000, "in_memory")]
[InlineData(OtlpExportProtocol.Grpc, typeof(OtlpGrpcTraceExportClient), false, 10000, "disk")]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpTraceExportClient), false, 10000, "disk")]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpTraceExportClient), true, 8000, "disk")]
[InlineData(OtlpExportProtocol.Grpc, typeof(OtlpGrpcMetricsExportClient), false, 10000, "disk")]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpMetricsExportClient), false, 10000, "disk")]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpMetricsExportClient), true, 8000, "disk")]
[InlineData(OtlpExportProtocol.Grpc, typeof(OtlpGrpcLogExportClient), false, 10000, "disk")]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpLogExportClient), false, 10000, "disk")]
[InlineData(OtlpExportProtocol.HttpProtobuf, typeof(OtlpHttpLogExportClient), true, 8000, "disk")]
public void GetTransmissionHandler_InitializesCorrectHandlerExportClientAndTimeoutValue(OtlpExportProtocol protocol, Type exportClientType, bool customHttpClient, int expectedTimeoutMilliseconds, string retryStrategy)
{
var exporterOptions = new OtlpExporterOptions() { Protocol = protocol };
if (customHttpClient)
Expand All @@ -150,28 +169,45 @@ public void GetTransmissionHandler_InitializesCorrectExportClientAndTimeoutValue
};
}

var configuration = new ConfigurationBuilder()
.AddInMemoryCollection(new Dictionary<string, string> { [ExperimentalOptions.OtlpRetryEnvVar] = retryStrategy })
.Build();

if (exportClientType == typeof(OtlpGrpcTraceExportClient) || exportClientType == typeof(OtlpHttpTraceExportClient))
{
var transmissionHandler = exporterOptions.GetTraceExportTransmissionHandler(new ExperimentalOptions());
var transmissionHandler = exporterOptions.GetTraceExportTransmissionHandler(new ExperimentalOptions(configuration));

AssertTransmissionHandlerProperties(transmissionHandler, exportClientType, expectedTimeoutMilliseconds);
AssertTransmissionHandler(transmissionHandler, exportClientType, expectedTimeoutMilliseconds, retryStrategy);
}
else if (exportClientType == typeof(OtlpGrpcMetricsExportClient) || exportClientType == typeof(OtlpHttpMetricsExportClient))
{
var transmissionHandler = exporterOptions.GetMetricsExportTransmissionHandler(new ExperimentalOptions());
var transmissionHandler = exporterOptions.GetMetricsExportTransmissionHandler(new ExperimentalOptions(configuration));

AssertTransmissionHandlerProperties(transmissionHandler, exportClientType, expectedTimeoutMilliseconds);
AssertTransmissionHandler(transmissionHandler, exportClientType, expectedTimeoutMilliseconds, retryStrategy);
}
else
{
var transmissionHandler = exporterOptions.GetLogsExportTransmissionHandler(new ExperimentalOptions());
var transmissionHandler = exporterOptions.GetLogsExportTransmissionHandler(new ExperimentalOptions(configuration));

AssertTransmissionHandlerProperties(transmissionHandler, exportClientType, expectedTimeoutMilliseconds);
AssertTransmissionHandler(transmissionHandler, exportClientType, expectedTimeoutMilliseconds, retryStrategy);
}
}

private static void AssertTransmissionHandlerProperties<T>(OtlpExporterTransmissionHandler<T> transmissionHandler, Type exportClientType, int expectedTimeoutMilliseconds)
private static void AssertTransmissionHandler<T>(OtlpExporterTransmissionHandler<T> transmissionHandler, Type exportClientType, int expectedTimeoutMilliseconds, string retryStrategy)
{
if (retryStrategy == "in_memory")
{
Assert.True(transmissionHandler is OtlpExporterRetryTransmissionHandler<T>);
}
else if (retryStrategy == "disk")
{
Assert.True(transmissionHandler is OtlpExporterPersistentStorageTransmissionHandler<T>);
}
else
{
Assert.True(transmissionHandler is OtlpExporterTransmissionHandler<T>);
}

Assert.Equal(exportClientType, transmissionHandler.ExportClient.GetType());

Assert.Equal(expectedTimeoutMilliseconds, transmissionHandler.TimeoutMilliseconds);
Expand Down