Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Endless OperationCancelledException errors in GrpcCall after Cancellation in SocketConnectivitySubchannelTransport.TryConnectAsync #2420

Closed
kolonist opened this issue Apr 27, 2024 · 6 comments
Labels
bug Something isn't working
Milestone

Comments

@kolonist
Copy link

Hi, @JamesNK

We detected the following endless errors from several pods when they are trying to perform Grpc calls to several endpoints:
изображение

Exception itself is always the same:

Source context: Grpc.Net.Client.Internal.GrpcCall

Status message: Error connecting to subchannel.

Exception:
System.OperationCanceledException: The operation was canceled.
   at System.Threading.CancellationToken.ThrowOperationCanceledException()
   at System.Threading.CancellationToken.ThrowIfCancellationRequested()
   at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|285_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt)

Luckily we could capture start of this in another service in Debug:

изображение

After that all requests to URL related to Subchanel id 19-2 (look at the picture above) started to throw the following errors:

изображение

(don't be confused with log messages order because messages with the same timestamp can follow in mixed order)

We could reproduce this error with the following code:

Main routine:

// here is service initialization, client side balancing used, but there is only one available endpoint
// we use latest grpc-dotnet from main branch

for (var i = 0; i < 200; i++)
{
    try
    {
        logger.LogWarning("##### START LOOP");

        var client = host.Services.GetRequiredService<GrpcSomeServiceInterface.GrpcSomeServiceInterfaceClient>();
        var request = new GrpcSomeRequest
        {
            Content = "Random message"
        };

        logger.LogWarning("##### BEFORE Unary call");

        var response = await client.UnaryCallAsync(request);

        logger.LogWarning("##### Response: {Response}", response.Content);
    }
    catch (Exception ex)
    {
        logger.LogError(ex, "##### SOME ERROR");
    }
    finally
    {
        logger.LogWarning("##### BEFORE Delay");
        await Task.Delay(150 * 1000);
        logger.LogWarning("##### END LOOP");
    }
}

In SocketConnectivitySubchannelTransport.TryConnectAsync() we synthetically cancel context as it was cancelled from the outside:

    public async ValueTask<ConnectResult> TryConnectAsync(ConnectContext context, int attempt)
    {
        // ensure we do it only once
        if (!_wasCancellation)
        {
            _logger.LogWarning("##### context.CancelConnect();");

            context.CancelConnect();

            // ensure we do it only once
            _wasCancellation = true; 
        }

        _logger.LogWarning(
            "##### TryConnectAsync context.CancellationToken.IsCancellationRequested = {CancellationRequested}",
            context.CancellationToken.IsCancellationRequested);

        Debug.Assert(CurrentEndPoint == null);

        // continuation...

The output is the following:

[16:42:12 WRN ]  ##### START LOOP
[16:42:12 WRN ]  ##### BEFORE Unary call
[16:42:12 DBG ]  Starting gRPC call. Method type: 'Unary', URI: 'http://10.124.143.49:82/SomeName.SomeAnotherName.SomeService.V1.GrpcSomeServiceInterface/UnaryCall'.
[16:42:12 DBG ]  Subchannel id '2-1' created with addresses: 10.124.143.49:82
[16:42:12 WRN ]  ##### SocketConnectivitySubchannelTransport Created
[16:42:12 VRB ]  Subchannel id '2-1' state changed registration '2-1-1' created.
[16:42:12 DBG ]  Channel picker updated.
[16:42:12 VRB ]  Subchannel id '2-1' connection requested.
[16:42:12 DBG ]  Subchannel id '2-1' state changed to Connecting. Detail: 'Connection requested.'.
[16:42:12 VRB ]  Subchannel id '2-1' executing state changed registration '2-1-1'.
[16:42:12 VRB ]  Processing subchannel id '2-1' state changed to Connecting. Detail: 'Connection requested.'.
[16:42:12 DBG ]  Channel state updated to Connecting.
[16:42:12 VRB ]  Pick started.
[16:42:12 DBG ]  Picked queued.
[16:42:12 VRB ]  Waiting for a new picker.
[16:42:12 VRB ]  Subchannel id '2-1' connecting to transport.

[16:42:12 WRN ]  ##### context.CancelConnect();
[16:42:12 WRN ]  ##### TryConnectAsync context.CancellationToken.IsCancellationRequested = True

[16:42:12 VRB ]  Subchannel id '2-1' connecting socket to 10.124.143.49:82.
[16:42:12 DBG ]  Subchannel id '2-1' error connecting to socket 10.124.143.49:82.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 186
[16:42:12 DBG ]  Subchannel id '2-1' state changed to TransientFailure. Detail: 'Error connecting to subchannel.'.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 186
[16:42:12 VRB ]  Subchannel id '2-1' executing state changed registration '2-1-1'.
[16:42:12 VRB ]  Processing subchannel id '2-1' state changed to TransientFailure. Detail: 'Error connecting to subchannel.'.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 186
[16:42:12 DBG ]  Channel state updated to TransientFailure.
[16:42:12 DBG ]  Channel picker updated.
[16:42:12 VRB ]  Subchannel id '2-1' connect canceled.
[16:42:12 VRB ]  Pick started.
[16:42:12 DBG ]  Picked failure with status: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")
[16:42:12 DBG ]  Error starting gRPC call.
[16:42:12 INF ]  Call failed with gRPC error status. Status code: 'Unavailable', Message: 'Error connecting to subchannel.'.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 186
[16:42:12 DBG ]  Finished gRPC call.
[16:42:12 DBG ]  gRPC call canceled.

[16:42:12 ERR ]  ##### SOME ERROR
Grpc.Core.RpcException: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")
 ---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 186
   --- End of inner exception stack trace ---
   at Grpc.Net.Client.Balancer.Internal.ConnectionManager.PickAsync(PickContext context, Boolean waitForReady, CancellationToken cancellationToken) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/ConnectionManager.cs:line 341
   at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/BalancerHttpHandler.cs:line 117
   at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 502
   at Grpc.Net.Client.Internal.GrpcCall`2.GetResponseHeadersCoreAsync() in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 300
[16:42:12 WRN ]  ##### BEFORE Delay
[16:44:42 WRN ]  ##### END LOOP

[16:44:42 WRN ]  ##### START LOOP
[16:44:42 WRN ]  ##### BEFORE Unary call
[16:44:42 DBG ]  Starting gRPC call. Method type: 'Unary', URI: 'http://10.124.143.49:82/SomeName.SomeAnotherName.SomeService.V1.GrpcSomeServiceInterface/UnaryCall'.
[16:44:42 VRB ]  Pick started.

[16:44:42 DBG ]  Picked failure with status: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")

[16:44:42 DBG ]  Error starting gRPC call.
[16:44:42 INF ]  Call failed with gRPC error status. Status code: 'Unavailable', Message: 'Error connecting to subchannel.'.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 186
[16:44:42 DBG ]  Finished gRPC call.
[16:44:42 DBG ]  gRPC call canceled.

[16:44:42 ERR ]  ##### SOME ERROR
Grpc.Core.RpcException: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")
 ---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 186
   --- End of inner exception stack trace ---
   at Grpc.Net.Client.Balancer.Internal.ConnectionManager.PickAsync(PickContext context, Boolean waitForReady, CancellationToken cancellationToken) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/ConnectionManager.cs:line 341
   at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/BalancerHttpHandler.cs:line 117
   at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 502
   at SomeName.SomeAnotherName.SomeKindOfInterceptor.Handlers.SomeKindOfInterceptorGrpcInterceptor.HandleExceptionAsync[TResponse](Task`1 response) in /Users/kolonist/dev/SomeAnotherName/src/core/SomeName.SomeAnotherName/SomeKindOfInterceptor/Handlers/SomeKindOfInterceptorGrpcInterceptor.cs:line 88
   at Program.<Main>$(String[] args) in /Users/kolonist/dev/SomeAnotherName/sandbox/SomeName.SomeAnotherName.Sandbox.Grpc/Program.cs:line 48
[16:44:42 WRN ]  ##### BEFORE Delay
[16:47:12 WRN ]  ##### END LOOP

[16:47:12 WRN ]  ##### START LOOP
[16:47:12 WRN ]  ##### BEFORE Unary call
[16:47:12 DBG ]  Starting gRPC call. Method type: 'Unary', URI: 'http://10.124.143.49:82/SomeName.SomeAnotherName.SomeService.V1.GrpcSomeServiceInterface/UnaryCall'.
[16:47:12 VRB ]  Pick started.
[16:47:12 DBG ]  Picked failure with status: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")
[16:47:12 DBG ]  Error starting gRPC call.
[16:47:12 INF ]  Call failed with gRPC error status. Status code: 'Unavailable', Message: 'Error connecting to subchannel.'.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 186
[16:47:12 DBG ]  Finished gRPC call.
[16:47:12 DBG ]  gRPC call canceled.
[16:47:12 ERR ]  ##### SOME ERROR
Grpc.Core.RpcException: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")
 ---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 186
   --- End of inner exception stack trace ---
   at Grpc.Net.Client.Balancer.Internal.ConnectionManager.PickAsync(PickContext context, Boolean waitForReady, CancellationToken cancellationToken) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/ConnectionManager.cs:line 341
   at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/BalancerHttpHandler.cs:line 117
   at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 502
   at SomeName.SomeAnotherName.SomeKindOfInterceptor.Handlers.SomeKindOfInterceptorGrpcInterceptor.HandleExceptionAsync[TResponse](Task`1 response) in /Users/kolonist/dev/mega_app/src/core/SomeName.SomeAnotherName/SomeKindOfInterceptor/Handlers/SomeKindOfInterceptorGrpcInterceptor.cs:line 88
   at Program.<Main>$(String[] args) in /Users/kolonist/dev/mega_app/sandbox/SomeName.SomeAnotherName.Sandbox.Grpc/Program.cs:line 48
[16:47:12 WRN ]  ##### BEFORE Delay
[16:49:42 WRN ]  ##### END LOOP

[16:49:42 WRN ]  ##### START LOOP
[16:49:42 WRN ]  ##### BEFORE Unary call
[16:49:42 DBG ]  Starting gRPC call. Method type: 'Unary', URI: 'http://10.124.143.49:82/SomeName.SomeAnotherName.SomeService.V1.GrpcSomeServiceInterface/UnaryCall'.
[16:49:42 VRB ]  Pick started.
[16:49:42 DBG ]  Picked failure with status: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")
[16:49:42 DBG ]  Error starting gRPC call.
[16:49:42 INF ]  Call failed with gRPC error status. Status code: 'Unavailable', Message: 'Error connecting to subchannel.'.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 186
[16:49:42 DBG ]  Finished gRPC call.
[16:49:42 DBG ]  gRPC call canceled.
[16:49:42 ERR ]  ##### SOME ERROR
Grpc.Core.RpcException: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")
 ---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 186
   --- End of inner exception stack trace ---
   at Grpc.Net.Client.Balancer.Internal.ConnectionManager.PickAsync(PickContext context, Boolean waitForReady, CancellationToken cancellationToken) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/ConnectionManager.cs:line 341
   at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/BalancerHttpHandler.cs:line 117
   at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout) in /Users/kolonist/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 502
   at SomeName.SomeAnotherName.SomeKindOfInterceptor.Handlers.SomeKindOfInterceptorGrpcInterceptor.HandleExceptionAsync[TResponse](Task`1 response) in /Users/kolonist/dev/mega_app/src/core/SomeName.SomeAnotherName/SomeKindOfInterceptor/Handlers/SomeKindOfInterceptorGrpcInterceptor.cs:line 88
   at Program.<Main>$(String[] args) in /Users/kolonist/dev/mega_app/sandbox/SomeName.SomeAnotherName.Sandbox.Grpc/Program.cs:line 48
[16:49:42 WRN ]  ##### BEFORE Delay

... And so on...
... And so on...
... And so on...

As you can see SocketConnectivitySubchannelTransport.TryConnectAsync() never called again. No backoff or subchannels recreation ever happen.

If we have no exception in TryConnectAsync() then we expect to get the following output:

[16:57:27 WRN ]  ##### START LOOP
[16:57:27 WRN ]  ##### BEFORE Unary call
[16:57:27 DBG ]  Starting gRPC call. Method type: 'Unary', URI: 'http://10.124.143.49:82/SomeName.SomeAnotherName.SomeService.V1.GrpcSomeServiceInterface/UnaryCall'.
[16:57:27 DBG ]  Subchannel id '2-1' created with addresses: 10.124.143.49:82
[16:57:27 WRN ]  ##### SocketConnectivitySubchannelTransport Created
[16:57:27 VRB ]  Subchannel id '2-1' state changed registration '2-1-1' created.
[16:57:27 DBG ]  Channel picker updated.
[16:57:27 VRB ]  Subchannel id '2-1' connection requested.
[16:57:27 DBG ]  Subchannel id '2-1' state changed to Connecting. Detail: 'Connection requested.'.
[16:57:27 VRB ]  Subchannel id '2-1' executing state changed registration '2-1-1'.
[16:57:27 VRB ]  Processing subchannel id '2-1' state changed to Connecting. Detail: 'Connection requested.'.
[16:57:27 DBG ]  Channel state updated to Connecting.
[16:57:27 VRB ]  Subchannel id '2-1' connecting to transport.
[16:57:27 VRB ]  Pick started.
[16:57:27 WRN ]  ##### TryConnectAsync context.CancellationToken.IsCancellationRequested = False
[16:57:27 DBG ]  Picked queued.
[16:57:27 VRB ]  Waiting for a new picker.
[16:57:27 VRB ]  Subchannel id '2-1' connecting socket to 10.124.143.49:82.
[16:57:27 DBG ]  Subchannel id '2-1' connected to socket 10.124.143.49:82.
[16:57:27 DBG ]  Subchannel id '2-1' state changed to Ready. Detail: 'Successfully connected to socket.'.
[16:57:27 VRB ]  Subchannel id '2-1' executing state changed registration '2-1-1'.
[16:57:27 VRB ]  Processing subchannel id '2-1' state changed to Ready. Detail: 'Successfully connected to socket.'.
[16:57:27 DBG ]  Channel state updated to Ready.
[16:57:27 DBG ]  Channel picker updated.
[16:57:27 VRB ]  Pick started.
[16:57:27 DBG ]  Successfully picked subchannel id '2-1' with address 10.124.143.49:82. Transport status: InitialSocket
[16:57:27 VRB ]  Sending request http://10.124.143.49:82/SomeName.SomeAnotherName.SomeService.V1.GrpcSomeServiceInterface/UnaryCall.
[16:57:27 VRB ]  Starting connect callback for 10.124.143.49:82.
[16:57:27 VRB ]  Subchannel id '2-1' creating stream for 10.124.143.49:82.
[16:57:27 VRB ]  Subchannel id '2-1' checking socket 10.124.143.49:82.
[16:57:27 VRB ]  Subchannel id '2-1' created stream for 10.124.143.49:82 with 0 buffered bytes. Transport has 1 active streams.
[16:57:27 DBG ]  Sending message.
[16:57:27 VRB ]  Serialized 'SomeName.SomeAnotherName.SomeService.V1.GrpcSomeRequest' to 16 byte message.
[16:57:27 VRB ]  Message sent.
[16:57:27 VRB ]  Response headers received.
[16:57:27 DBG ]  Reading message.
[16:57:27 VRB ]  Deserializing 16 byte message to 'SomeName.SomeAnotherName.SomeService.V1.GrpcSomeResponse'.
[16:57:27 VRB ]  Received message.
[16:57:27 DBG ]  Finished gRPC call.
[16:57:27 WRN ]  ##### Response: Random message
[16:57:27 WRN ]  ##### BEFORE Delay
[16:58:42 VRB ]  Subchannel id '2-1' disposing stream for 10.124.143.49:82. Transport has 0 active streams.
[16:58:42 DBG ]  Subchannel id '2-1' state changed to Idle. Detail: 'Disconnected.'.
[16:58:42 VRB ]  Subchannel id '2-1' executing state changed registration '2-1-1'.
[16:58:42 VRB ]  Processing subchannel id '2-1' state changed to Idle. Detail: 'Disconnected.'.
[16:58:42 DBG ]  Channel state updated to Idle.
[16:58:42 DBG ]  Channel picker updated.
[16:59:57 WRN ]  ##### END LOOP

[16:59:57 WRN ]  ##### START LOOP
[16:59:57 WRN ]  ##### BEFORE Unary call
[16:59:57 DBG ]  Starting gRPC call. Method type: 'Unary', URI: 'http://10.124.143.49:82/SomeName.SomeAnotherName.SomeService.V1.GrpcSomeServiceInterface/UnaryCall'.
[16:59:57 VRB ]  Pick started.
[16:59:57 VRB ]  Subchannel id '2-1' connection requested.
[16:59:57 DBG ]  Subchannel id '2-1' state changed to Connecting. Detail: 'Connection requested.'.
[16:59:57 VRB ]  Subchannel id '2-1' executing state changed registration '2-1-1'.
[16:59:57 VRB ]  Processing subchannel id '2-1' state changed to Connecting. Detail: 'Connection requested.'.
[16:59:57 DBG ]  Channel state updated to Connecting.
[16:59:57 DBG ]  Channel picker updated.
[16:59:57 DBG ]  Picked subchannel id '2-1' doesn't have a current address.
[16:59:57 VRB ]  Pick started.
[16:59:57 DBG ]  Picked queued.
[16:59:57 VRB ]  Waiting for a new picker.
[16:59:57 VRB ]  Subchannel id '2-1' connecting to transport.
[16:59:57 WRN ]  ##### TryConnectAsync context.CancellationToken.IsCancellationRequested = False
[16:59:57 VRB ]  Subchannel id '2-1' connecting socket to 10.124.143.49:82.
[16:59:58 DBG ]  Subchannel id '2-1' connected to socket 10.124.143.49:82.
[16:59:58 DBG ]  Subchannel id '2-1' state changed to Ready. Detail: 'Successfully connected to socket.'.
[16:59:58 VRB ]  Subchannel id '2-1' executing state changed registration '2-1-1'.
[16:59:58 VRB ]  Processing subchannel id '2-1' state changed to Ready. Detail: 'Successfully connected to socket.'.
[16:59:58 DBG ]  Channel state updated to Ready.
[16:59:58 DBG ]  Channel picker updated.
[16:59:58 VRB ]  Pick started.
[16:59:58 DBG ]  Successfully picked subchannel id '2-1' with address 10.124.143.49:82. Transport status: InitialSocket
[16:59:58 VRB ]  Sending request http://10.124.143.49:82/SomeName.SomeAnotherName.SomeService.V1.GrpcSomeServiceInterface/UnaryCall.
[16:59:58 VRB ]  Starting connect callback for 10.124.143.49:82.
[16:59:58 VRB ]  Subchannel id '2-1' creating stream for 10.124.143.49:82.
[16:59:58 VRB ]  Subchannel id '2-1' checking socket 10.124.143.49:82.
[16:59:58 VRB ]  Subchannel id '2-1' created stream for 10.124.143.49:82 with 0 buffered bytes. Transport has 1 active streams.
[16:59:58 DBG ]  Sending message.
[16:59:58 VRB ]  Serialized 'SomeName.SomeAnotherName.SomeService.V1.GrpcSomeRequest' to 16 byte message.
[16:59:58 VRB ]  Message sent.
[16:59:58 VRB ]  Response headers received.
[16:59:58 DBG ]  Reading message.
[16:59:58 VRB ]  Deserializing 16 byte message to 'SomeName.SomeAnotherName.SomeService.V1.GrpcSomeResponse'.
[16:59:58 VRB ]  Received message.
[16:59:58 DBG ]  Finished gRPC call.
[16:59:58 WRN ]  ##### Response: Random message
[16:59:58 WRN ]  ##### BEFORE Delay
[17:01:13 VRB ]  Subchannel id '2-1' disposing stream for 10.124.143.49:82. Transport has 0 active streams.
[17:01:13 DBG ]  Subchannel id '2-1' state changed to Idle. Detail: 'Disconnected.'.
[17:01:13 VRB ]  Subchannel id '2-1' executing state changed registration '2-1-1'.
[17:01:13 VRB ]  Processing subchannel id '2-1' state changed to Idle. Detail: 'Disconnected.'.
[17:01:13 DBG ]  Channel state updated to Idle.
[17:01:13 DBG ]  Channel picker updated.
[17:02:28 WRN ]  ##### END LOOP

[17:02:28 WRN ]  ##### START LOOP
[17:02:28 WRN ]  ##### BEFORE Unary call
[17:02:28 DBG ]  Starting gRPC call. Method type: 'Unary', URI: 'http://10.124.143.49:82/SomeName.SomeAnotherName.SomeService.V1.GrpcSomeServiceInterface/UnaryCall'.
[17:02:28 VRB ]  Pick started.
[17:02:28 VRB ]  Subchannel id '2-1' connection requested.
[17:02:28 DBG ]  Subchannel id '2-1' state changed to Connecting. Detail: 'Connection requested.'.
[17:02:28 VRB ]  Subchannel id '2-1' executing state changed registration '2-1-1'.
[17:02:28 VRB ]  Processing subchannel id '2-1' state changed to Connecting. Detail: 'Connection requested.'.
[17:02:28 DBG ]  Channel state updated to Connecting.
[17:02:28 DBG ]  Channel picker updated.
[17:02:28 DBG ]  Picked subchannel id '2-1' doesn't have a current address.
[17:02:28 VRB ]  Pick started.
[17:02:28 DBG ]  Picked queued.
[17:02:28 VRB ]  Waiting for a new picker.
[17:02:28 VRB ]  Subchannel id '2-1' connecting to transport.
[17:02:28 WRN ]  ##### TryConnectAsync context.CancellationToken.IsCancellationRequested = False
[17:02:28 VRB ]  Subchannel id '2-1' connecting socket to 10.124.143.49:82.
[17:02:28 DBG ]  Subchannel id '2-1' connected to socket 10.124.143.49:82.
[17:02:28 DBG ]  Subchannel id '2-1' state changed to Ready. Detail: 'Successfully connected to socket.'.
[17:02:28 VRB ]  Subchannel id '2-1' executing state changed registration '2-1-1'.
[17:02:28 VRB ]  Processing subchannel id '2-1' state changed to Ready. Detail: 'Successfully connected to socket.'.
[17:02:28 DBG ]  Channel state updated to Ready.
[17:02:28 DBG ]  Channel picker updated.
[17:02:28 VRB ]  Pick started.
[17:02:28 DBG ]  Successfully picked subchannel id '2-1' with address 10.124.143.49:82. Transport status: InitialSocket
[17:02:28 VRB ]  Sending request http://10.124.143.49:82/SomeName.SomeAnotherName.SomeService.V1.GrpcSomeServiceInterface/UnaryCall.
[17:02:28 VRB ]  Starting connect callback for 10.124.143.49:82.
[17:02:28 VRB ]  Subchannel id '2-1' creating stream for 10.124.143.49:82.
[17:02:28 VRB ]  Subchannel id '2-1' checking socket 10.124.143.49:82.
[17:02:28 VRB ]  Subchannel id '2-1' created stream for 10.124.143.49:82 with 0 buffered bytes. Transport has 1 active streams.
[17:02:28 DBG ]  Sending message.
[17:02:28 VRB ]  Serialized 'SomeName.SomeAnotherName.SomeService.V1.GrpcSomeRequest' to 16 byte message.
[17:02:28 VRB ]  Message sent.
[17:02:28 VRB ]  Response headers received.
[17:02:28 DBG ]  Reading message.
[17:02:28 VRB ]  Deserializing 16 byte message to 'SomeName.SomeAnotherName.SomeService.V1.GrpcSomeResponse'.
[17:02:28 VRB ]  Received message.
[17:02:28 DBG ]  Finished gRPC call.
[17:02:28 WRN ]  ##### Response: Random message
[17:02:28 WRN ]  ##### BEFORE Delay

... And so on...
... And so on...
... And so on...

We found no workaround yet, so It would be great if you take a look at this issue.

Best regards, Alexander

@JamesNK
Copy link
Member

JamesNK commented Apr 29, 2024

Thanks for the detailed report. I can reproduce the error by adding the code you suggested. However, that doesn't answer the question of how the subchannel got in that state. I haven't reproduced it through normal usage.

I have a theory about what is causing this. PR: #2422

@kolonist
Copy link
Author

kolonist commented May 6, 2024

Hi, @JamesNK

Thank you for such quick solution. We've tested it in our environment and the problem is no longer reproduced.

Please could you tell us when are you gonna to build release version of grpc-dotnet with this fix?

@JamesNK
Copy link
Member

JamesNK commented May 8, 2024

https://www.nuget.org/packages/Grpc.Net.Client/2.63.0-pre1 is on NuGet with this change. A non-preview version will come in about 2 weeks.

@kolonist
Copy link
Author

kolonist commented Jun 3, 2024

Hi, @JamesNK

We have another case leading the problem on grpc-dotnet version is stable 2.63 (branch v2.63.x).

Reproduction:

We use GRPC to connect to third-party server written in Go programming language (if it is important) with TLS certificates to authenticate.

Code used to authenticate:

.ConfigureChannel((provider, options) =>
{
    // very simple factory with very simple resolver
    ResolverFactory factory = provider.GetRequiredService<MyResolverFactory>();

    var grpcServices = new ServiceCollection();
    grpcServices.AddSingleton(factory);

    options.ServiceProvider = grpcServices.BuildServiceProvider();
 
    options.Credentials = ChannelCredentials.SecureSsl;
 
    options.ServiceConfig = new ServiceConfig
    {
        LoadBalancingConfigs = { new RoundRobinConfig() },
        MethodConfigs = {
            new MethodConfig
            {
                RetryPolicy = new RetryPolicy
                {
                    MaxAttempts = ...,
                    RetryableStatusCodes =
                    {
                        StatusCode.Aborted,
                        StatusCode.DeadlineExceeded,
                        StatusCode.ResourceExhausted,
                        StatusCode.Unavailable
                    }
                }
            }
        }
    };
})
.ConfigurePrimaryHttpMessageHandler(serviceProvider =>
{
    var socketsHandler = new SocketsHttpHandler
    {
        ConnectTimeout = ...,
        PooledConnectionIdleTimeout = ...,
        KeepAlivePingDelay = ...,
        KeepAlivePingTimeout = ...,
        EnableMultipleHttp2Connections = true
    };

    var clientX509Certificate = X509Certificate2.CreateFromPem(ClientCert, ClientKey);

    var clientX509CertificatesCollection = new X509CertificateCollection(
        new X509Certificate[]
        {
            clientX509Certificate
        });

    socketsHandler.SslOptions.ClientCertificates = clientX509CertificatesCollection;

    var certificatesChain = ParseCertificates(certificates.CaCertChain);

    socketsHandler.SslOptions.RemoteCertificateValidationCallback = CreateCertificateValidator(certificatesChain);

    return socketsHandler;
});

...
private static RemoteCertificateValidationCallback CreateCertificateValidator(X509Certificate2[] ca)
{
    return (_, cert, chain, _) =>
    {
        if (chain is null || cert is null)
        {
            return false;
        }

        chain.ChainPolicy.TrustMode = X509ChainTrustMode.CustomRootTrust;
        chain.ChainPolicy.CustomTrustStore.AddRange(ca);
        return chain.Build(new X509Certificate2(cert));
    };
}

I use this simple code to reproduce this issue:

var client = host.Services.GetRequiredService<3rdPartyService.3rdPartyServiceClient>();

for (var i = 0; i < 200; i++)
{
    try
    {
        logger.LogWarning("##### START LOOP");

        var request = new 3rdPartyServiceRequest
        {
            SomeFieldName = ByteString.CopyFrom("lllllllll", Encoding.UTF8)
        };

        logger.LogWarning("##### BEFORE Unary call");

        var response = await client.3rdPartyServicePerformRequestAsync(request);

        logger.LogWarning("##### Response: {Response}", response.SomeResponceFieldName.ToString());
    }
    catch (Exception ex)
    {
        logger.LogError(ex, "##### SOME ERROR");
    }
    finally
    {
        logger.LogWarning("##### BEFORE Delay");
        // await Task.Delay(1 * 1000);
        logger.LogWarning("##### END LOOP");
    }
}

To reproduce the issue we need to specify well-formed TLS-certificate but without proper permissions (there were some cases when we used invalid certificates and got the same issue but we can't reproduce it now). For this example I took only one endpoint in balancing, but if I have more then I get the same result for every endpoint.

Log is as foollows:

...
# application logs, nothing useful
...

[13:41:56 DBG ]  Starting gRPC call. Method type: 'Unary', URI: 'https://loadbalancer.temporary.invalid/3rdpartynspace.3rdPartyService/3rdPartyServicePerformRequest'.
[13:41:56 VRB ]  Starting deadline timeout. Duration: 00:00:29.9928750.
[13:41:56 VRB ]  MyResolver refresh requested.
[13:41:56 WRN ]  ##### START LOOP
[13:41:56 WRN ]  ##### BEFORE Unary call
[13:41:56 DBG ]  Starting gRPC call. Method type: 'Unary', URI: 'https://loadbalancer.temporary.invalid/3rdpartynspace.3rdPartyService/3rdPartyServicePerformRequest'.
[13:41:56 VRB ]  MyResolver result with status code 'OK' and 1 addresses.
[13:41:56 DBG ]  Subchannel id '3-1' created with addresses: service-endpoint.zone:2379

# it's from `SocketConnectivitySubchannelTransport.cs`, constructor, after all initialization
[13:41:56 WRN ]  ##### SocketConnectivitySubchannelTransport Created

[13:41:56 VRB ]  Subchannel id '3-1' state changed registration '3-1-1' created.
[13:41:56 VRB ]  Subchannel id '3-1' connection requested.
[13:41:56 DBG ]  Subchannel id '3-1' state changed to Connecting. Detail: 'Connection requested.'.
[13:41:56 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:56 VRB ]  Processing subchannel id '3-1' state changed to Connecting. Detail: 'Connection requested.'.
[13:41:56 DBG ]  Channel state updated to Connecting.
[13:41:56 DBG ]  Channel picker updated.
[13:41:56 VRB ]  MyResolver resolve task completed.
[13:41:56 VRB ]  Pick started.
[13:41:56 DBG ]  Picked queued.
[13:41:56 VRB ]  Waiting for a new picker.
[13:41:56 VRB ]  Pick started.
[13:41:56 DBG ]  Picked queued.
[13:41:56 VRB ]  Waiting for a new picker.
[13:41:56 VRB ]  Subchannel id '3-1' connecting to transport.

# it's from `SocketConnectivitySubchannelTransport.cs`, `TryConnectAsync()`, first line of the method
[13:41:56 WRN ]  ##### TryConnectAsync context.CancellationToken.IsCancellationRequested = False

[13:41:56 VRB ]  Subchannel id '3-1' connecting socket to service-endpoint.zone:2379.
[13:41:56 DBG ]  Subchannel id '3-1' connected to socket service-endpoint.zone:2379.
[13:41:56 DBG ]  Subchannel id '3-1' state changed to Ready. Detail: 'Successfully connected to socket.'.
[13:41:56 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:56 VRB ]  Processing subchannel id '3-1' state changed to Ready. Detail: 'Successfully connected to socket.'.
[13:41:56 VRB ]  Creating ready picker with 1 subchannels: id '3-1' (service-endpoint.zone:2379)
[13:41:56 DBG ]  Channel state updated to Ready.
[13:41:56 DBG ]  Channel picker updated.
[13:41:56 VRB ]  Pick started.
[13:41:56 VRB ]  Pick started.
[13:41:56 DBG ]  Successfully picked subchannel id '3-1' with address service-endpoint.zone:2379. Transport status: InitialSocket
[13:41:56 VRB ]  Sending request https://service-endpoint.zone:2379/3rdpartynspace.3rdPartyService/3rdPartyServicePerformRequest.
[13:41:56 DBG ]  Successfully picked subchannel id '3-1' with address service-endpoint.zone:2379. Transport status: InitialSocket
[13:41:56 VRB ]  Sending request https://service-endpoint.zone:2379/3rdpartynspace.3rdPartyService/3rdPartyServicePerformRequest.
[13:41:56 VRB ]  Starting connect callback for service-endpoint.zone:2379.
[13:41:56 VRB ]  Subchannel id '3-1' creating stream for service-endpoint.zone:2379.
[13:41:56 VRB ]  Subchannel id '3-1' checking socket service-endpoint.zone:2379.
[13:41:56 VRB ]  Subchannel id '3-1' created stream for service-endpoint.zone:2379 with 0 buffered bytes. Transport has 1 active streams.
[13:41:57 VRB ]  Subchannel id '3-1' disposing stream for service-endpoint.zone:2379. Transport has 0 active streams.
[13:41:57 DBG ]  Subchannel id '3-1' state changed to Idle. Detail: 'Disconnected.'.
[13:41:57 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:57 VRB ]  Processing subchannel id '3-1' state changed to Idle. Detail: 'Disconnected.'.
[13:41:57 DBG ]  Channel state updated to Connecting.
[13:41:57 DBG ]  Channel picker updated.
[13:41:57 VRB ]  Refreshing resolver because subchannel id '3-1' is in state Idle.
[13:41:57 VRB ]  MyResolver refresh requested.
[13:41:57 VRB ]  Requesting connection for subchannel id '3-1' because it is in state Idle.
[13:41:57 VRB ]  Subchannel id '3-1' connection requested.
[13:41:57 DBG ]  Subchannel id '3-1' state changed to Connecting. Detail: 'Connection requested.'.
[13:41:57 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:57 VRB ]  Processing subchannel id '3-1' state changed to Connecting. Detail: 'Connection requested.'.
[13:41:57 VRB ]  Starting connect callback for service-endpoint.zone:2379.
[13:41:57 VRB ]  Subchannel id '3-1' creating stream for service-endpoint.zone:2379.
[13:41:57 VRB ]  Subchannel id '3-1' doesn't have a connected socket available. Connecting new stream socket for service-endpoint.zone:2379.
[13:41:57 VRB ]  MyResolver result with status code 'OK' and 1 addresses.
[13:41:57 VRB ]  Subchannel id '3-1' matches address 'service-endpoint.zone:2379' and is preserved.
[13:41:57 VRB ]  Connections unchanged.
[13:41:57 VRB ]  MyResolver result with status code 'OK' and 1 addresses.
[13:41:57 VRB ]  Subchannel id '3-1' updated with addresses: service-endpoint.zone:2379
[13:41:57 DBG ]  Subchannel id '3-1' is connecting when its addresses are updated. Restarting connect.
[13:41:57 DBG ]  Subchannel id '3-1' state changed to Idle. Detail: 'Disconnected.'.
[13:41:57 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:57 VRB ]  Processing subchannel id '3-1' state changed to Idle. Detail: 'Disconnected.'.
[13:41:57 VRB ]  Refreshing resolver because subchannel id '3-1' is in state Idle.
[13:41:57 VRB ]  MyResolver refresh requested.
[13:41:57 VRB ]  MyResolver refresh ignored because resolve is already in progress.
[13:41:57 VRB ]  Requesting connection for subchannel id '3-1' because it is in state Idle.
[13:41:57 VRB ]  Subchannel id '3-1' connection requested.
[13:41:57 DBG ]  Subchannel id '3-1' state changed to Connecting. Detail: 'Connection requested.'.
[13:41:57 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:57 VRB ]  Processing subchannel id '3-1' state changed to Connecting. Detail: 'Connection requested.'.
[13:41:57 DBG ]  Subchannel id '3-1' connection requested in non-idle state of Connecting.
[13:41:57 VRB ]  Subchannel id '3-1' matches address 'service-endpoint.zone:2379' and is preserved.
[13:41:57 VRB ]  Connections unchanged.
[13:41:57 VRB ]  MyResolver resolve task completed.
[13:41:57 VRB ]  Subchannel id '3-1' connecting to transport.
[13:41:57 WRN ]  ##### TryConnectAsync context.CancellationToken.IsCancellationRequested = False
[13:41:57 VRB ]  Subchannel id '3-1' connecting socket to service-endpoint.zone:2379.
[13:41:57 DBG ]  Subchannel id '3-1' canceling connect.
[13:41:57 VRB ]  Subchannel id '3-1' connecting to transport.
[13:41:57 WRN ]  ##### TryConnectAsync context.CancellationToken.IsCancellationRequested = False
[13:41:57 VRB ]  Subchannel id '3-1' connecting socket to service-endpoint.zone:2379.
[13:41:57 DBG ]  Subchannel id '3-1' error connecting to socket service-endpoint.zone:2379.
System.OperationCanceledException: The operation was canceled.
   at System.Threading.CancellationToken.ThrowOperationCanceledException()
   at System.Threading.CancellationToken.ThrowIfCancellationRequested()
   at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|285_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 184
[13:41:57 DBG ]  Subchannel id '3-1' state changed to TransientFailure. Detail: 'Error connecting to subchannel.'.
System.OperationCanceledException: The operation was canceled.
   at System.Threading.CancellationToken.ThrowOperationCanceledException()
   at System.Threading.CancellationToken.ThrowIfCancellationRequested()
   at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|285_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 184
[13:41:57 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:57 VRB ]  Processing subchannel id '3-1' state changed to TransientFailure. Detail: 'Error connecting to subchannel.'.
System.OperationCanceledException: The operation was canceled.
   at System.Threading.CancellationToken.ThrowOperationCanceledException()
   at System.Threading.CancellationToken.ThrowIfCancellationRequested()
   at System.Net.Sockets.Socket.<ConnectAsync>g__WaitForConnectWithCancellation|285_0(AwaitableSocketAsyncEventArgs saea, ValueTask connectTask, CancellationToken cancellationToken)
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 184
[13:41:57 DBG ]  Channel state updated to TransientFailure.
[13:41:57 DBG ]  Channel picker updated.
[13:41:57 VRB ]  Refreshing resolver because subchannel id '3-1' is in state TransientFailure.
[13:41:57 VRB ]  MyResolver refresh requested.
[13:41:57 VRB ]  Subchannel id '3-1' connect canceled.
[13:41:57 VRB ]  MyResolver result with status code 'OK' and 1 addresses.
[13:41:57 VRB ]  Subchannel id '3-1' matches address 'service-endpoint.zone:2379' and is preserved.
[13:41:57 VRB ]  Connections unchanged.
[13:41:57 VRB ]  MyResolver result with status code 'OK' and 1 addresses.
[13:41:57 VRB ]  Subchannel id '3-1' matches address 'service-endpoint.zone:2379' and is preserved.
[13:41:57 VRB ]  Connections unchanged.
[13:41:57 VRB ]  MyResolver resolve task completed.

# Exception we expect - The SSL connection could not be established, see inner exception.
# AuthenticationException: Authentication failed, see inner exception. SslException: unknown Cert Authority
# OK here

[13:41:57 DBG ]  Error starting gRPC call.

[13:41:57 INF ]  Call failed with gRPC error status. Status code: 'Internal', Message: 'Error starting gRPC call. HttpRequestException: The SSL connection could not be established, see inner exception. AuthenticationException: Authentication failed, see inner exception. SslException: unknown Cert Authority'.
System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
 ---> System.Security.Authentication.AuthenticationException: Authentication failed, see inner exception.
 ---> Interop+AppleCrypto+SslException: unknown Cert Authority
   --- End of inner exception stack trace ---
   at System.Net.Security.SslStream.ForceAuthenticationAsync[TIOAdapter](Boolean receiveFirst, Byte[] reAuthenticationData, CancellationToken cancellationToken)
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(SslClientAuthenticationOptions sslOptions, HttpRequestMessage request, Boolean async, Stream stream, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(SslClientAuthenticationOptions sslOptions, HttpRequestMessage request, Boolean async, Stream stream, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.AddHttp2ConnectionAsync(QueueItem queueItem)
   at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.DiagnosticsHandler.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at MyTestApp.Grpc.Client.ClientSideBalancing.Handlers.GrpcLoadStatisticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/username/dev/path/to/testapp/MyTestApp/Grpc/Client/ClientSideBalancing/Handlers/GrpcLoadStatisticsHandler.cs:line 27
   at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/BalancerHttpHandler.cs:line 150
   at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 502

[13:41:57 DBG ]  Finished gRPC call.
[13:41:57 VRB ]  Subchannel id '3-1' created stream for service-endpoint.zone:2379 with 0 buffered bytes. Transport has 1 active streams.
[13:41:57 DBG ]  Subchannel id '3-1' connected to socket service-endpoint.zone:2379.
[13:41:57 DBG ]  Subchannel id '3-1' state changed to Ready. Detail: 'Successfully connected to socket.'.
[13:41:57 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:57 VRB ]  Processing subchannel id '3-1' state changed to Ready. Detail: 'Successfully connected to socket.'.
[13:41:57 VRB ]  Creating ready picker with 1 subchannels: id '3-1' (service-endpoint.zone:2379)
[13:41:57 DBG ]  Channel state updated to Ready.
[13:41:57 DBG ]  Channel picker updated.
[13:41:57 DBG ]  gRPC call canceled.

# We captured exception above in `catch` block, it's OK

[13:41:57 ERR ]  ##### SOME ERROR
Grpc.Core.RpcException: Status(StatusCode="Internal", Detail="Error starting gRPC call. HttpRequestException: The SSL connection could not be established, see inner exception. AuthenticationException: Authentication failed, see inner exception. SslException: unknown Cert Authority", DebugException="System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.")
 ---> System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
 ---> System.Security.Authentication.AuthenticationException: Authentication failed, see inner exception.
 ---> Interop+AppleCrypto+SslException: unknown Cert Authority
   --- End of inner exception stack trace ---
   at System.Net.Security.SslStream.ForceAuthenticationAsync[TIOAdapter](Boolean receiveFirst, Byte[] reAuthenticationData, CancellationToken cancellationToken)
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(SslClientAuthenticationOptions sslOptions, HttpRequestMessage request, Boolean async, Stream stream, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(SslClientAuthenticationOptions sslOptions, HttpRequestMessage request, Boolean async, Stream stream, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.AddHttp2ConnectionAsync(QueueItem queueItem)
   at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.DiagnosticsHandler.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at MyTestApp.Grpc.Client.ClientSideBalancing.Handlers.GrpcLoadStatisticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/username/dev/path/to/testapp/MyTestApp/Grpc/Client/ClientSideBalancing/Handlers/GrpcLoadStatisticsHandler.cs:line 27
   at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/BalancerHttpHandler.cs:line 150
   at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 502
   --- End of inner exception stack trace ---
   at Grpc.Net.Client.Internal.GrpcCall`2.GetResponseHeadersCoreAsync() in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 300
   at MyTestApp.CircuitBreaker.Handlers.CircuitBreakerGrpcInterceptor.HandleExceptionAsync[TResponse](Task`1 response) in /Users/username/dev/path/to/testapp/MyTestApp/CircuitBreaker/Handlers/CircuitBreakerGrpcInterceptor.cs:line 88
   at Program.<Main>$(String[] args) in /Users/username/dev/testapp/path/MyTest.Grpc/Program.cs:line 60

[13:41:57 WRN ]  ##### BEFORE Delay

[13:41:57 VRB ]  Subchannel id '3-1' disposing stream for service-endpoint.zone:2379. Transport has 0 active streams.
[13:41:57 DBG ]  Subchannel id '3-1' state changed to Idle. Detail: 'Disconnected.'.
[13:41:57 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:57 VRB ]  Processing subchannel id '3-1' state changed to Idle. Detail: 'Disconnected.'.
[13:41:57 DBG ]  Channel state updated to Connecting.
[13:41:57 DBG ]  Channel picker updated.
[13:41:57 VRB ]  Refreshing resolver because subchannel id '3-1' is in state Idle.
[13:41:57 VRB ]  MyResolver refresh requested.
[13:41:57 VRB ]  Requesting connection for subchannel id '3-1' because it is in state Idle.
[13:41:57 VRB ]  Subchannel id '3-1' connection requested.
[13:41:57 DBG ]  Subchannel id '3-1' state changed to Connecting. Detail: 'Connection requested.'.
[13:41:57 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:57 VRB ]  Processing subchannel id '3-1' state changed to Connecting. Detail: 'Connection requested.'.

[13:41:57 DBG ]  Error starting gRPC call.

# The same exception again (But why? The same call again in another thread?)

[13:41:57 INF ]  Call failed with gRPC error status. Status code: 'Internal', Message: 'Error starting gRPC call. HttpRequestException: The SSL connection could not be established, see inner exception. AuthenticationException: Authentication failed, see inner exception. SslException: unknown Cert Authority'.
System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
 ---> System.Security.Authentication.AuthenticationException: Authentication failed, see inner exception.
 ---> Interop+AppleCrypto+SslException: unknown Cert Authority
   --- End of inner exception stack trace ---
   at System.Net.Security.SslStream.ForceAuthenticationAsync[TIOAdapter](Boolean receiveFirst, Byte[] reAuthenticationData, CancellationToken cancellationToken)
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(SslClientAuthenticationOptions sslOptions, HttpRequestMessage request, Boolean async, Stream stream, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(SslClientAuthenticationOptions sslOptions, HttpRequestMessage request, Boolean async, Stream stream, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.AddHttp2ConnectionAsync(QueueItem queueItem)
   at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.DiagnosticsHandler.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at MyTestApp.Grpc.Client.ClientSideBalancing.Handlers.GrpcLoadStatisticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/username/dev/path/to/testapp/MyTestApp/Grpc/Client/ClientSideBalancing/Handlers/GrpcLoadStatisticsHandler.cs:line 27
   at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/BalancerHttpHandler.cs:line 150
   at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 502

[13:41:57 DBG ]  Finished gRPC call.
[13:41:57 DBG ]  gRPC call canceled.
[13:41:57 VRB ]  Subchannel id '3-1' connecting to transport.
[13:41:57 WRN ]  ##### TryConnectAsync context.CancellationToken.IsCancellationRequested = False
[13:41:57 VRB ]  Subchannel id '3-1' connecting socket to service-endpoint.zone:2379.
[13:41:57 VRB ]  MyResolver result with status code 'OK' and 1 addresses.
[13:41:57 VRB ]  Subchannel id '3-1' matches address 'service-endpoint.zone:2379' and is preserved.
[13:41:57 VRB ]  Connections unchanged.
[13:41:57 VRB ]  MyResolver result with status code 'OK' and 1 addresses.
[13:41:57 VRB ]  Subchannel id '3-1' matches address 'service-endpoint.zone:2379' and is preserved.
[13:41:57 VRB ]  Connections unchanged.
[13:41:57 VRB ]  MyResolver resolve task completed.
[13:41:57 DBG ]  Subchannel id '3-1' connected to socket service-endpoint.zone:2379.
[13:41:57 DBG ]  Subchannel id '3-1' state changed to Ready. Detail: 'Successfully connected to socket.'.
[13:41:57 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:57 VRB ]  Processing subchannel id '3-1' state changed to Ready. Detail: 'Successfully connected to socket.'.
[13:41:57 VRB ]  Creating ready picker with 1 subchannels: id '3-1' (service-endpoint.zone:2379)
[13:41:57 DBG ]  Channel state updated to Ready.
[13:41:57 DBG ]  Channel picker updated.

[13:41:58 WRN ]  ##### END LOOP

[13:41:58 WRN ]  ##### START LOOP

# Second attempt

[13:41:58 WRN ]  ##### BEFORE Unary call
[13:41:58 DBG ]  Starting gRPC call. Method type: 'Unary', URI: 'https://loadbalancer.temporary.invalid/3rdpartynspace.3rdPartyService/3rdPartyServicePerformRequest'.
[13:41:58 VRB ]  Pick started.
[13:41:58 DBG ]  Successfully picked subchannel id '3-1' with address service-endpoint.zone:2379. Transport status: InitialSocket
[13:41:58 VRB ]  Sending request https://service-endpoint.zone:2379/3rdpartynspace.3rdPartyService/3rdPartyServicePerformRequest.
[13:41:58 VRB ]  Starting connect callback for service-endpoint.zone:2379.
[13:41:58 VRB ]  Subchannel id '3-1' creating stream for service-endpoint.zone:2379.
[13:41:58 VRB ]  Subchannel id '3-1' checking socket service-endpoint.zone:2379.
[13:41:58 VRB ]  Subchannel id '3-1' created stream for service-endpoint.zone:2379 with 0 buffered bytes. Transport has 1 active streams.
[13:41:58 VRB ]  Subchannel id '3-1' disposing stream for service-endpoint.zone:2379. Transport has 0 active streams.
[13:41:58 DBG ]  Subchannel id '3-1' state changed to Idle. Detail: 'Disconnected.'.
[13:41:58 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:58 VRB ]  Processing subchannel id '3-1' state changed to Idle. Detail: 'Disconnected.'.
[13:41:58 DBG ]  Channel state updated to Connecting.
[13:41:58 DBG ]  Channel picker updated.
[13:41:58 VRB ]  Refreshing resolver because subchannel id '3-1' is in state Idle.
[13:41:58 VRB ]  MyResolver refresh requested.
[13:41:58 VRB ]  Requesting connection for subchannel id '3-1' because it is in state Idle.
[13:41:58 VRB ]  Subchannel id '3-1' connection requested.
[13:41:58 DBG ]  Subchannel id '3-1' state changed to Connecting. Detail: 'Connection requested.'.
[13:41:58 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:58 VRB ]  Processing subchannel id '3-1' state changed to Connecting. Detail: 'Connection requested.'.
[13:41:58 VRB ]  MyResolver result with status code 'OK' and 1 addresses.
[13:41:58 VRB ]  Subchannel id '3-1' connecting to transport.
[13:41:58 WRN ]  ##### TryConnectAsync context.CancellationToken.IsCancellationRequested = False
[13:41:58 VRB ]  Subchannel id '3-1' updated with addresses: service-endpoint.zone:2379
[13:41:58 DBG ]  Subchannel id '3-1' is connecting when its addresses are updated. Restarting connect.
[13:41:58 DBG ]  Subchannel id '3-1' canceling connect.
[13:41:58 DBG ]  Subchannel id '3-1' state changed to Idle. Detail: 'Disconnected.'.
[13:41:58 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:58 VRB ]  Processing subchannel id '3-1' state changed to Idle. Detail: 'Disconnected.'.
[13:41:58 VRB ]  Refreshing resolver because subchannel id '3-1' is in state Idle.
[13:41:58 VRB ]  MyResolver refresh requested.
[13:41:58 VRB ]  MyResolver refresh ignored because resolve is already in progress.
[13:41:58 VRB ]  Requesting connection for subchannel id '3-1' because it is in state Idle.
[13:41:58 DBG ]  Subchannel id '3-1' connection requested in non-idle state of Connecting.
[13:41:58 DBG ]  Subchannel id '3-1' connection requested in non-idle state of Connecting.
[13:41:58 VRB ]  Subchannel id '3-1' matches address 'service-endpoint.zone:2379' and is preserved.
[13:41:58 VRB ]  Connections unchanged.
[13:41:58 VRB ]  MyResolver result with status code 'OK' and 1 addresses.
[13:41:58 DBG ]  Subchannel id '3-1' state changed to Connecting. Detail: 'Connecting to socket.'.
[13:41:58 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.
[13:41:58 VRB ]  Processing subchannel id '3-1' state changed to Connecting. Detail: 'Connecting to socket.'.
[13:41:58 VRB ]  Subchannel id '3-1' connecting socket to service-endpoint.zone:2379.
[13:41:58 VRB ]  Subchannel id '3-1' matches address 'service-endpoint.zone:2379' and is preserved.
[13:41:58 VRB ]  Connections unchanged.
[13:41:58 VRB ]  MyResolver resolve task completed.
[13:41:58 DBG ]  Error starting gRPC call.
[13:41:58 DBG ]  Subchannel id '3-1' error connecting to socket service-endpoint.zone:2379.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 184
[13:41:58 DBG ]  Subchannel id '3-1' state changed to TransientFailure. Detail: 'Error connecting to subchannel.'.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 184
[13:41:58 VRB ]  Subchannel id '3-1' executing state changed registration '3-1-1'.

# still expected exception, it's OK

[13:41:58 INF ]  Call failed with gRPC error status. Status code: 'Internal', Message: 'Error starting gRPC call. HttpRequestException: The SSL connection could not be established, see inner exception. AuthenticationException: Authentication failed, see inner exception. SslException: unknown Cert Authority'.
System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
 ---> System.Security.Authentication.AuthenticationException: Authentication failed, see inner exception.
 ---> Interop+AppleCrypto+SslException: unknown Cert Authority
   --- End of inner exception stack trace ---
   at System.Net.Security.SslStream.ForceAuthenticationAsync[TIOAdapter](Boolean receiveFirst, Byte[] reAuthenticationData, CancellationToken cancellationToken)
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(SslClientAuthenticationOptions sslOptions, HttpRequestMessage request, Boolean async, Stream stream, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(SslClientAuthenticationOptions sslOptions, HttpRequestMessage request, Boolean async, Stream stream, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.AddHttp2ConnectionAsync(QueueItem queueItem)
   at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.DiagnosticsHandler.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at MyTestApp.Grpc.Client.ClientSideBalancing.Handlers.GrpcLoadStatisticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/username/dev/path/to/testapp/MyTestApp/Grpc/Client/ClientSideBalancing/Handlers/GrpcLoadStatisticsHandler.cs:line 27
   at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/BalancerHttpHandler.cs:line 150
   at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 502
[13:41:58 DBG ]  Finished gRPC call.
[13:41:58 DBG ]  gRPC call canceled.
[13:41:58 VRB ]  Processing subchannel id '3-1' state changed to TransientFailure. Detail: 'Error connecting to subchannel.'.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 184
[13:41:58 DBG ]  Channel state updated to TransientFailure.
[13:41:58 DBG ]  Channel picker updated.
[13:41:58 VRB ]  Refreshing resolver because subchannel id '3-1' is in state TransientFailure.
[13:41:58 VRB ]  MyResolver refresh requested.

# catched exception

[13:41:58 ERR ]  ##### SOME ERROR
Grpc.Core.RpcException: Status(StatusCode="Internal", Detail="Error starting gRPC call. HttpRequestException: The SSL connection could not be established, see inner exception. AuthenticationException: Authentication failed, see inner exception. SslException: unknown Cert Authority", DebugException="System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.")
 ---> System.Net.Http.HttpRequestException: The SSL connection could not be established, see inner exception.
 ---> System.Security.Authentication.AuthenticationException: Authentication failed, see inner exception.
 ---> Interop+AppleCrypto+SslException: unknown Cert Authority
   --- End of inner exception stack trace ---
   at System.Net.Security.SslStream.ForceAuthenticationAsync[TIOAdapter](Boolean receiveFirst, Byte[] reAuthenticationData, CancellationToken cancellationToken)
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(SslClientAuthenticationOptions sslOptions, HttpRequestMessage request, Boolean async, Stream stream, CancellationToken cancellationToken)
   --- End of inner exception stack trace ---
   at System.Net.Http.ConnectHelper.EstablishSslConnectionAsync(SslClientAuthenticationOptions sslOptions, HttpRequestMessage request, Boolean async, Stream stream, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.ConnectAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.AddHttp2ConnectionAsync(QueueItem queueItem)
   at System.Threading.Tasks.TaskCompletionSourceWithCancellation`1.WaitWithCancellationAsync(CancellationToken cancellationToken)
   at System.Net.Http.HttpConnectionPool.SendWithVersionDetectionAndRetryAsync(HttpRequestMessage request, Boolean async, Boolean doRequestAuth, CancellationToken cancellationToken)
   at System.Net.Http.DiagnosticsHandler.SendAsyncCore(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at System.Net.Http.RedirectHandler.SendAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)
   at Microsoft.Extensions.Http.Logging.LoggingHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at MyTestApp.Grpc.Client.ClientSideBalancing.Handlers.GrpcLoadStatisticsHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/username/dev/path/to/testapp/MyTestApp/Grpc/Client/ClientSideBalancing/Handlers/GrpcLoadStatisticsHandler.cs:line 27
   at Microsoft.Extensions.Http.Logging.LoggingScopeHttpMessageHandler.<SendCoreAsync>g__Core|5_0(HttpRequestMessage request, Boolean useAsync, CancellationToken cancellationToken)
   at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/BalancerHttpHandler.cs:line 150
   at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 502
   --- End of inner exception stack trace ---
   at Grpc.Net.Client.Internal.GrpcCall`2.GetResponseHeadersCoreAsync() in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 300

# No second background request this time

[13:41:58 WRN ]  ##### BEFORE Delay
[13:41:58 VRB ]  MyResolver result with status code 'OK' and 1 addresses.
[13:41:58 VRB ]  Subchannel id '3-1' matches address 'service-endpoint.zone:2379' and is preserved.
[13:41:58 VRB ]  Connections unchanged.
[13:41:58 VRB ]  MyResolver result with status code 'OK' and 1 addresses.
[13:41:58 VRB ]  Subchannel id '3-1' matches address 'service-endpoint.zone:2379' and is preserved.
[13:41:58 VRB ]  Connections unchanged.
[13:41:58 VRB ]  MyResolver resolve task completed.
[13:41:58 VRB ]  Subchannel id '3-1' connect canceled.
[13:41:59 WRN ]  ##### END LOOP

# 3rd attempt

[13:41:59 WRN ]  ##### START LOOP
[13:41:59 WRN ]  ##### BEFORE Unary call
[13:41:59 DBG ]  Starting gRPC call. Method type: 'Unary', URI: 'https://loadbalancer.temporary.invalid/3rdpartynspace.3rdPartyService/3rdPartyServicePerformRequest'.
[13:41:59 VRB ]  Pick started.


# !!!!!!!!!                                                                       !!!!!!!!!
# !!!!!!!!! Got unexpected exception this time. No SSL, but TaskCanceledException !!!!!!!!!
# !!!!!!!!!                                                                       !!!!!!!!!

[13:41:59 DBG ]  Picked failure with status: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")
[13:41:59 DBG ]  Error starting gRPC call.
[13:41:59 INF ]  Call failed with gRPC error status. Status code: 'Unavailable', Message: 'Error connecting to subchannel.'.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 184

# `SocketConnectivitySubchannelTransport.cs:line 184` is 
# `await _socketConnect(socket, currentEndPoint, context.CancellationToken).ConfigureAwait(false);`
# (differs from original bacause of my logs)

[13:41:59 DBG ]  Finished gRPC call.
[13:41:59 DBG ]  gRPC call canceled.

[13:41:59 ERR ]  ##### SOME ERROR
Grpc.Core.RpcException: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")
 ---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 184
   --- End of inner exception stack trace ---
   at Grpc.Net.Client.Balancer.Internal.ConnectionManager.PickAsync(PickContext context, Boolean waitForReady, CancellationToken cancellationToken) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/ConnectionManager.cs:line 341
   at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/BalancerHttpHandler.cs:line 117
   at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 502
   at MyTestApp.CircuitBreaker.Handlers.CircuitBreakerGrpcInterceptor.HandleExceptionAsync[TResponse](Task`1 response) in /Users/username/dev/path/to/testapp/MyTestApp/CircuitBreaker/Handlers/CircuitBreakerGrpcInterceptor.cs:line 88
   at Program.<Main>$(String[] args) in /Users/username/dev/testapp/path/MyTest.Grpc/Program.cs:line 60
[13:41:59 WRN ]  ##### BEFORE Delay
[13:42:00 WRN ]  ##### END LOOP
[13:42:00 WRN ]  ##### START LOOP
[13:42:00 WRN ]  ##### BEFORE Unary call
[13:42:00 DBG ]  Starting gRPC call. Method type: 'Unary', URI: 'https://loadbalancer.temporary.invalid/3rdpartynspace.3rdPartyService/3rdPartyServicePerformRequest'.
[13:42:00 VRB ]  Pick started.
[13:42:00 DBG ]  Picked failure with status: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")
[13:42:00 DBG ]  Error starting gRPC call.
[13:42:00 INF ]  Call failed with gRPC error status. Status code: 'Unavailable', Message: 'Error connecting to subchannel.'.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 184
[13:42:00 DBG ]  Finished gRPC call.
[13:42:00 DBG ]  gRPC call canceled.
[13:42:00 ERR ]  ##### SOME ERROR
Grpc.Core.RpcException: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")
 ---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 184
   --- End of inner exception stack trace ---
   at Grpc.Net.Client.Balancer.Internal.ConnectionManager.PickAsync(PickContext context, Boolean waitForReady, CancellationToken cancellationToken) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/ConnectionManager.cs:line 341
   at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/BalancerHttpHandler.cs:line 117
   at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 502
   at MyTestApp.CircuitBreaker.Handlers.CircuitBreakerGrpcInterceptor.HandleExceptionAsync[TResponse](Task`1 response) in /Users/username/dev/path/to/testapp/MyTestApp/CircuitBreaker/Handlers/CircuitBreakerGrpcInterceptor.cs:line 88
   at Program.<Main>$(String[] args) in /Users/username/dev/testapp/path/MyTest.Grpc/Program.cs:line 60

[13:42:00 WRN ]  ##### BEFORE Delay
[13:42:01 WRN ]  ##### END LOOP

[13:42:01 WRN ]  ##### START LOOP
[13:42:01 WRN ]  ##### BEFORE Unary call
[13:42:01 DBG ]  Starting gRPC call. Method type: 'Unary', URI: 'https://loadbalancer.temporary.invalid/3rdpartynspace.3rdPartyService/3rdPartyServicePerformRequest'.
[13:42:01 VRB ]  Pick started.
[13:42:01 DBG ]  Picked failure with status: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")

# !!!!!!!!!                                                                   !!!!!!!!!
# !!!!!!!!! Got unexpected exception again. No SSL, but TaskCanceledException !!!!!!!!!
# !!!!!!!!!                                                                   !!!!!!!!!

[13:42:01 DBG ]  Error starting gRPC call.
[13:42:01 INF ]  Call failed with gRPC error status. Status code: 'Unavailable', Message: 'Error connecting to subchannel.'.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 184
[13:42:01 DBG ]  Finished gRPC call.
[13:42:01 DBG ]  gRPC call canceled.
[13:42:01 ERR ]  ##### SOME ERROR
Grpc.Core.RpcException: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")
 ---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 184
   --- End of inner exception stack trace ---
   at Grpc.Net.Client.Balancer.Internal.ConnectionManager.PickAsync(PickContext context, Boolean waitForReady, CancellationToken cancellationToken) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/ConnectionManager.cs:line 341
   at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/BalancerHttpHandler.cs:line 117
   at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 502
   at MyTestApp.CircuitBreaker.Handlers.CircuitBreakerGrpcInterceptor.HandleExceptionAsync[TResponse](Task`1 response) in /Users/username/dev/path/to/testapp/MyTestApp/CircuitBreaker/Handlers/CircuitBreakerGrpcInterceptor.cs:line 88
   at Program.<Main>$(String[] args) in /Users/username/dev/testapp/path/MyTest.Grpc/Program.cs:line 60
[13:42:01 WRN ]  ##### BEFORE Delay
[13:42:02 WRN ]  ##### END LOOP

# !!!!!!!!!                                                                            !!!!!!!!!
# !!!!!!!!! Got unexpected exception again. We will get this exception again and again !!!!!!!!!
# !!!!!!!!!                                                                            !!!!!!!!!

[13:42:02 WRN ]  ##### START LOOP
[13:42:02 WRN ]  ##### BEFORE Unary call
[13:42:02 DBG ]  Starting gRPC call. Method type: 'Unary', URI: 'https://loadbalancer.temporary.invalid/3rdpartynspace.3rdPartyService/3rdPartyServicePerformRequest'.
[13:42:02 VRB ]  Pick started.
[13:42:02 DBG ]  Picked failure with status: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")
[13:42:02 DBG ]  Error starting gRPC call.
[13:42:02 INF ]  Call failed with gRPC error status. Status code: 'Unavailable', Message: 'Error connecting to subchannel.'.
System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 184
[13:42:02 DBG ]  Finished gRPC call.
[13:42:02 DBG ]  gRPC call canceled.
[13:42:02 ERR ]  ##### SOME ERROR
Grpc.Core.RpcException: Status(StatusCode="Unavailable", Detail="Error connecting to subchannel.", DebugException="System.Threading.Tasks.TaskCanceledException: A task was canceled.")
 ---> System.Threading.Tasks.TaskCanceledException: A task was canceled.
   at Grpc.Net.Client.Balancer.Internal.SocketConnectivitySubchannelTransport.TryConnectAsync(ConnectContext context, Int32 attempt) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/SocketConnectivitySubchannelTransport.cs:line 184
   --- End of inner exception stack trace ---
   at Grpc.Net.Client.Balancer.Internal.ConnectionManager.PickAsync(PickContext context, Boolean waitForReady, CancellationToken cancellationToken) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/ConnectionManager.cs:line 341
   at Grpc.Net.Client.Balancer.Internal.BalancerHttpHandler.SendAsync(HttpRequestMessage request, CancellationToken cancellationToken) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Balancer/Internal/BalancerHttpHandler.cs:line 117
   at Grpc.Net.Client.Internal.GrpcCall`2.RunCall(HttpRequestMessage request, Nullable`1 timeout) in /Users/username/dev/grpc-dotnet-forked/src/Grpc.Net.Client/Internal/GrpcCall.cs:line 502
   at MyTestApp.CircuitBreaker.Handlers.CircuitBreakerGrpcInterceptor.HandleExceptionAsync[TResponse](Task`1 response) in /Users/username/dev/path/to/testapp/MyTestApp/CircuitBreaker/Handlers/CircuitBreakerGrpcInterceptor.cs:line 88
   at Program.<Main>$(String[] args) in /Users/username/dev/testapp/path/MyTest.Grpc/Program.cs:line 60
[13:42:02 WRN ]  ##### BEFORE Delay
[13:42:03 WRN ]  ##### END LOOP
...

I waited for backoff to handle this but it has no effect here.

Please could you look at this issue too. The problem could be somewhere near the one you fixed.

@JamesNK
Copy link
Member

JamesNK commented Jun 5, 2024

So to be clear, the problem is you're sometimes getting the wrong exception? You expect to always get "The SSL connection could not be established" but sometimes you get "A task was canceled.".

@JamesNK JamesNK added this to the 9.0 milestone Jun 5, 2024
@JamesNK
Copy link
Member

JamesNK commented Jun 5, 2024

Btw, this is a different bug and should be in a different issue. Please create a new one.

Original issue should be fixed in 2.63.0.

@JamesNK JamesNK closed this as completed Jun 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants