Execution Timeout Expired Error (258, ReadSniSyncOverAsync) #647

frankyuan · 2020-07-12T14:52:30Z

Describe the bug

When executing SQL such as SELECT FieldA, FieldB FROM A INNER JOIN C ON A.FieldId = C.FieldId UNION SELECT FieldA, FieldD FROM A INNER JOIN D ON A.FieldId = D.FieldId, throw the error like below, not every time, just a little part of queries have this issue.

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.\n
---> System.ComponentModel.Win32Exception (258): Unknown error 258\n
at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction)\n 
at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)\n 
at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)\n 
at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()\n 
at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()\n 
at Microsoft.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()\n 
at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadByteArray(Span1 buff, Int32 len, Int32& totalRead)\n
at Microsoft.Data.SqlClient.TdsParser.TrySkipValue(SqlMetaDataPriv md, Int32 columnOrdinal, TdsParserStateObject stateObj)\n
at Microsoft.Data.SqlClient.TdsParser.TrySkipRow(SqlMetaDataSet columns, Int32 startCol, TdsParserStateObject stateObj)\n
at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)\n
at Microsoft.Data.SqlClient.SqlDataReader.TryReadInternal(Boolean setTimeout, Boolean& more)\n
at Microsoft.Data.SqlClient.SqlDataReader.ReadAsync(CancellationToken cancellationToken)\n
--- End of stack trace from previous location where exception was thrown ---\n
at Dapper.SqlMapper.QueryAsync[T](IDbConnection cnn, Type effectiveType, CommandDefinition command) in //Dapper/SqlMapper.Async.cs:line 437\n

To reproduce

Sorry, currently can't reproduce in the local environment, so can't provide more detail to reproduce.

Expected behavior

SQL should execute successfully every time.

Further technical details

Microsoft.Data.SqlClient version: 1.1.3
.NET target: Core 3.1
Operating system: Docker container

What I found/tried

https://stackoverflow.com/questions/57270245/sql-server-dbcommand-timeout-with-net-core-container-under-load

DapperLib/Dapper#1435

The text was updated successfully, but these errors were encountered:

ErikEJ · 2020-07-12T15:02:16Z

Have you tried increasing the command timeout?

frankyuan · 2020-07-12T16:12:30Z

Thanks, @ericstj, No, I didn't try increasing command timeout. Currently, the command timeout is 3s. If a query executes timeout, I got stack like below, seems it is different than the above stack, do you mean the above stack is also a SQL execution timeout? By the way, from New relic, the select query's duration is 83.4ms, I'm not sure if it is the execution time.

Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding
Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.\n
---> System.ComponentModel.Win32Exception (258): Unknown error 258\n
at Microsoft.Data.SqlClient.SqlCommand.<>c.<ExecuteDbDataReaderAsync>b__164_0(Task`1 result)\n
at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()\n
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)\n
--- End of stack trace from previous location where exception was thrown ---\n
at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)\n
--- End of stack trace from previous location where exception was thrown ---\n
at Dapper.SqlMapper.QueryAsync[T](IDbConnection cnn, Type effectiveType, CommandDefinition command) in /_/Dapper/SqlMapper.Async.cs:line 419\n

JRahnama · 2020-07-13T17:22:59Z

@frankyuan, thanks for bringing this up to our attention, can you please let me know if:

Docker container is using Linux?
Are you using Azure Sql?
If I have noticed correctly you could not see the error on your local machine. If I am right, is your local machine using
Windows OS?
Is multi threading, such as PLINQ or Parallel queries are used inside the code?

Thank you

frankyuan · 2020-07-15T08:10:09Z

Sorry for reply delay, here is more detail information(by the way, not every cluster have above error, just in one cluster):

Docker container is using Linux?
Yes
Are you using Azure SQL?
No
If I have noticed correctly you could not see the error on your local machine. If I am right, is your local machine using
Windows OS?
Yes, my local machine is windows 10.
Is multi-threading, such as PLINQ or Parallel queries are used inside the code?
No, we didn't use multi-threading, just using Dapper.SqlMapper's async method (QueryAsync). And I have asked in Dapper. Unknown error 258 : The timeout period elapsed prior to completion DapperLib/Dapper#1435

JRahnama · 2020-07-15T17:18:25Z

@frankyuan thanks for the response. Can you provide sample repro code? If sample repro is not possible (without sample app it is really difficult to understand what is happening), can you share the connection string properties please?

frankyuan · 2020-07-16T02:53:36Z

SqlConnectionString like this: Data Source=xxx;Initial Catalog=xx;User ID=xx;Password=xx;Min Pool Size=5;Max Pool Size=100;Connect Timeout=8. CommandTimeout=3. As I can't reproduce in my local environment, I'm afraid can't provide sample repro for now. But I'm trying to reproduce, if find a method, will update here.

frankyuan · 2020-07-20T03:25:55Z

Seems only the SQL mentioned in description will throw the exception. And sometimes there is another similar exception like below when executing the same SQL.

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.\n
 ---> System.ComponentModel.Win32Exception (258): Unknown error 258\n
 at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)\n
 at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)\n
 at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)\n
 at Microsoft.Data.SqlClient.SqlDataReader.TryReadInternal(Boolean setTimeout, Boolean& more)\n
 at Microsoft.Data.SqlClient.SqlDataReader.ReadAsync(CancellationToken cancellationToken)\n
 --- End of stack trace from previous location where exception was thrown ---\n
 at Dapper.SqlMapper.QueryAsync[T](IDbConnection cnn, Type effectiveType, CommandDefinition command) in /_/Dapper/SqlMapper.Async.cs:line 437\n

JRahnama · 2020-07-23T23:52:50Z

@frankyuan any update on the issue on your side? Were you able to repro the issue? If not, can you kindly share your code or a similar code with me? I need the query you run and setting you have on your machine please. On the other note, can you test it with increasing Connection and command time out as well?

Svisstack · 2020-08-27T09:11:56Z

I'm experiencing a similar issue

Microsoft.EntityFrameworkCore.DbUpdateException: An error occurred while updating the entries. See the inner exception for details. ---> Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding. ---> System.ComponentModel.Win32Exception (258): No error information at Microsoft.Data.SqlClient.SqlCommand.<>c.<ExecuteDbDataReaderAsync>b__164_0(Task1 result)
at System.Threading.Tasks.ContinuationResultTaskFromResultTask2.InnerInvoke() at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state) --- End of stack trace from previous location where exception was thrown --- at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread) --- End of stack trace from previous location where exception was thrown --- at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken) at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken) at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken) at Microsoft.EntityFrameworkCore.Update.ReaderModificationCommandBatch.ExecuteAsync(IRelationalConnection connection, CancellationToken cancellationToken) ClientConnectionId:559501c9-13be-4150-a8bc-ef047cfe491a Error Number:-2,State:0,Class:11 --- End of inner exception stack trace --- at Microsoft.EntityFrameworkCore.Update.ReaderModificationCommandBatch.ExecuteAsync(IRelationalConnection connection, CancellationToken cancellationToken) at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken) at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(IList1 entriesToSave, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(DbContext _, Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func4 operation, Func4 verifySucceeded, CancellationToken cancellationToken)
at Microsoft.EntityFrameworkCore.DbContext.SaveChangesAsync(Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
`

.NET target: Core 3.1
Operating system: Docker container (alpine)

lukas-navratil · 2020-09-18T12:07:10Z

We are also having similar issue like @Svisstack .

Microsoft.Data.SqlClient 2.0.1, .NET Core 3.1, running in Windows container and connecting to SQL Azure Hyperscale DB.

We have a multi-threaded service, lets say about 30 parallel threads that are writing data to the same database. There is a transaction per thread. The service is running fine for most of the time, but from time to time, some threads start to get the exceptions below and they are stuck on it. The command that is failing on this timeout is the same for all threads, but in each occurrence it's usually a different command. Quite often it's a command that is sending some table valued parameters, but we saw it to fail on a very basic SELECT with a single parameter as well. There is no activity on the DB side, connections are in awaiting command state.

From our observations it seems that when this exception is thrown, all threads are blocked on it. They never recovered from it.

 ---> System.ComponentModel.Win32Exception (258): The wait operation timed out.
   at Microsoft.Data.SqlClient.SqlCommand.<>c.<ExecuteDbDataReaderAsync>b__169_0(Task`1 result)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location where exception was thrown ---
   at Dapper.SqlMapper.QueryRowAsync[T](IDbConnection cnn, Row row, Type effectiveType, CommandDefinition command) in /_/Dapper/SqlMapper.Async.cs:line 483

When we noticed this exception, it seems that only some threads were affected and after approximately 30 minutes the issue disappeared.

 ---> System.ComponentModel.Win32Exception (121): The semaphore timeout period has expired.
   at Microsoft.Data.SqlClient.SqlCommand.<>c.<ExecuteDbDataReaderAsync>b__169_0(Task`1 result)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location where exception was thrown ---
   at Dapper.SqlMapper.QueryAsync[T](IDbConnection cnn, Type effectiveType, CommandDefinition command) in /_/Dapper/SqlMapper.Async.cs:line 419

I would like to provide a repro, but it's quite hard to reproduce that nondeterministic behavior.

Interesting is that we have 6 instances of this service, where each instance is connecting to different DB and has different number of threads, but we noticed the issue only on 2 instances. One of them is the above mentioned one with 30-35 threads, another one is way smaller, only 3-4 threads.

frankyuan · 2020-09-19T13:36:11Z

@frankyuan any update on the issue on your side? Were you able to repro the issue? If not, can you kindly share your code or a similar code with me? I need the query you run and setting you have on your machine please. On the other note, can you test it with increasing Connection and command time out as well?

Sorry for getting back to you late, I couldn't reproduce this issue currently, and the only clue is that there are more than 3000 records in the result, after I changed the logic to get fewer records, the exception is gone.

JRahnama · 2020-09-25T17:26:49Z

@frankyuan thanks for the update.

@Svisstack, @lukas-navratil can any of you guys provide a sample repro application please?

Svisstack · 2020-09-25T23:11:40Z

I dont have this possibility unfortunately.

…

Sent from my iPhone

On 25 Sep 2020, at 19:27, Javad ***@***.***> wrote: @frankyuan thanks for the update. @Svisstack, @lukas-navratil can any of you guys provide a sample repro application please? — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

mischaherbrand · 2020-09-28T13:50:49Z

We are having the same issue, it happens randomly on our Azure SQL Server instance. Usually the query is very fast and sometimes it raises the error:

twurm · 2020-09-28T20:29:10Z

@JRahnama we are seeing the same issues, we will keep an eye on it an see if we can replicate it at will. At the momement is very random:

Docker container is using Linux?
- Yes
Are you using Azure SQL?
- Yes - Standard Tier 10DTU (this is just a dev site)
If I have noticed correctly you could not see the error on your local machine. If I am right, is your local machine using
Windows OS?
- We are using windows and SQL 2019 locally. This issue only occurs with Azure SQL Database at the moment
Is multi-threading, such as PLINQ or Parallel queries are used inside the code?
- No

JRahnama · 2020-09-28T21:02:53Z

@twurm can you give us a repo application?

twurm · 2020-09-28T21:14:29Z

@JRahnama we are working on a repo app. The event doesn't happen all the time and happens in different areas of the application so we are trying to figure that out.

Svisstack · 2020-09-29T12:34:43Z

same experiences

…

On Mon, Sep 28, 2020 at 11:14 PM Tom Wurm ***@***.***> wrote: @JRahnama <https://github.com/JRahnama> we are working on a repo app. The event doesn't happen all the time and happens in different areas of the application so we are trying to figure that out.

twurm · 2020-09-29T13:17:15Z

@JRahnama we also stumbled across #422 which does have a sample program to recreate the issue. While not the exact issue it is similar for us as we do use MARS.

JRahnama · 2020-09-30T19:36:45Z

@twurm, regarding the mentioned issue #422, if we reduce the number of parallel runs on that issue to anything less than or equal to 23 everything works fine. As a work around, if you have similar setup, you can try that or set MARS to false ,if possible, for now.

JRahnama · 2020-10-16T21:55:54Z

@frankyuan, @twurm, @mischaherbrand, @Svisstack, @lukas-navratil while I was trying to make a scenario to reproduce the issue
, I came up with a sample code the each first try to read from the pool was throwing the very same error message while the SqlCommand was not wrapped in a using block, but the rest of the recycled connection worked fine until the pool was renewed by reaching its max poolsize or a fatal error happened and pool got cleared and then the first connection out of the pool had the very same issue again. As a workaround can you make sure all your SqlConnection, SqlCommands and SqlDataReaders are wrapped in a using block properly and see if the issue comes up? That will also help me to understand if I am on the right track with the repro.

Thanks everybody

Svisstack · 2020-10-16T22:54:20Z

Cant confirm that currently, too much code on my side, but will have this in mind

…

Sent from my iPhone

On 16 Oct 2020, at 23:56, Javad ***@***.***> wrote: @frankyuan, @twurm, @mischaherbrand, @Svisstack, @lukas-navratil while I was trying to make a scenario to reproduce the issue , I came up with a sample code the each first try to read from the pool was throwing the very same error message while the SqlCommand was not wrapped in a using block, but the rest of the recycled connection worked fine until the pool was renewed by reaching its max poolsize or a fatal error happened and pool got cleared and then the first connection out of the pool had the very same issue again. As a workaround can you make sure all your SqlConnection, SqlCommands and SqlDataReaders are wrapped in a using block properly and see if the issue comes up? That will also help me to understand if I am on the right track with the repro. Thanks everybody — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or unsubscribe.

frankyuan · 2020-10-19T13:28:49Z

@JRahnama , If I didn't miss something, all SqlCommands in my code are wrapped in using block.

lukas-navratil · 2020-10-21T11:45:05Z

@JRahnama , I found one command that wasn't wrapped in using block. I will deploy the fix and try to monitor if it helped or not.

lukas-navratil · 2020-10-30T11:12:22Z

@JRahnama , so it seems that since we deployed 8 days ago the version with added using around SqlCommand, the Win32Exception 258 hasn't been thrown.

However exception 121 was thrown several times. I can't tell whether it would recover from that as we implemented automatic restart of service when it's stuck on the same SqlException for too long. I'm not sure either that it's related to this issue, but it's just a heads up that this error still happens in our environment as I reported it above.

Microsoft.Data.SqlClient.SqlException (0x80131904): A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 0 - The semaphore timeout period has expired.)
 ---> System.ComponentModel.Win32Exception (121): The semaphore timeout period has expired.
   at Microsoft.Data.SqlClient.SqlCommand.<>c.<ExecuteDbDataReaderAsync>b__169_0(Task`1 result)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location where exception was thrown ---
   at Dapper.SqlMapper.QueryAsync[T](IDbConnection cnn, Type effectiveType, CommandDefinition command) in /_/Dapper/SqlMapper.Async.cs:line 419

lukas-navratil · 2020-11-10T10:49:08Z

@JRahnama, unfortunately, our app got stuck today on this issue again:

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
 ---> System.ComponentModel.Win32Exception (258): The wait operation timed out.
   at Microsoft.Data.SqlClient.SqlCommand.<>c.<ExecuteDbDataReaderAsync>b__169_0(Task`1 result)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location where exception was thrown ---
   at Dapper.SqlMapper.QueryRowAsync[T](IDbConnection cnn, Row row, Type effectiveType, CommandDefinition command) in /_/Dapper/SqlMapper.Async.cs:line 483```

mrmartan · 2020-12-01T14:06:51Z

I am getting this as well. In my case its from within EFC DbContext. ASP.NET Core 3.1 (and its bundled EFC and SqlClient).
Running in a container (base image mcr.microsoft.com/dotnet/core/aspnet:3.1) and the SQL Server instance runs on Windows.

I also use the EFC's DB Context Pool: services.AddDbContextPool<SqlStore>(options => options.UseSqlServer(this.configuration.GetSqlConnectionString(nameof(SqlStore))));

One of my colleagues insisted this is comming from SQL login timeouts (you can use that as a hint but I would not rely on that)

My connection string: Data source=***;Initial catalog=***;User ID=***;Connection Timeout=120;TrustServerCertificate=True;Application Name=***

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
 ---> System.ComponentModel.Win32Exception (258): Unknown error 258
   at Microsoft.Data.SqlClient.SqlCommand.<>c.<ExecuteDbDataReaderAsync>b__164_0(Task`1 result)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location where exception was thrown ---
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location where exception was thrown ---
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryingEnumerable`1.AsyncEnumerator.InitializeReaderAsync(DbContext _, Boolean result, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func`4 operation, Func`4 verifySucceeded, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.Internal.QueryingEnumerable`1.AsyncEnumerator.MoveNextAsync()
ClientConnectionId:d2ee303b-d340-4514-aeff-5a0f5b1fdf04
Error Number:-2,State:0,Class:11

EvgenyGrishnov · 2023-05-29T10:37:42Z

Could you clarify what do you mean here: "implementing 'asyn'c and 'await' with 'Task'" ?
Was your API synchronized when that error appeared?

sajidur · 2023-05-29T13:37:26Z

Could you clarify what do you mean here: "implementing 'asyn'c and 'await' with 'Task'" ? Was your API synchronized when that error appeared?

just i make my api asyncronized. Yes it was synchronized.

tobyreid · 2023-06-22T16:33:04Z

I had a very similar problem recently:

In my case I have a dockerized webapi where I start a new TransactionScope whenever a PUT/POST/PATCH/DELETE call is received.
Depending on the operation, one or more connections are open to different Azure SQL databases via EF Core (and ultimately the SqlClient) - queries to the first connection will work, but any queries to other connections won't as I'm not using the Azure DTC preview.

Not creating a TransactionScope solved my immediate issue in these instances..

The exception was the same @frankyuan's error and appeared intermittently in logs, due to application usage and behaviour:

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.\n
---> System.ComponentModel.Win32Exception (258): Unknown error 258\n

....which was not a super helpful error message - while trying to figure this out 👎

svap-roshan · 2023-07-04T10:34:03Z

Describe the bug

When executing SQL such as SELECT FieldA, FieldB FROM A INNER JOIN C ON A.FieldId = C.FieldId UNION SELECT FieldA, FieldD FROM A INNER JOIN D ON A.FieldId = D.FieldId, throw the error like below, not every time, just a little part of queries have this issue.

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed prior to completion of the operation or the server is not responding.\n
---> System.ComponentModel.Win32Exception (258): Unknown error 258\n
at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action1 wrapCloseInAction)\n 
at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)\n 
at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)\n 
at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()\n 
at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()\n 
at Microsoft.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()\n 
at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadByteArray(Span1 buff, Int32 len, Int32& totalRead)\n
at Microsoft.Data.SqlClient.TdsParser.TrySkipValue(SqlMetaDataPriv md, Int32 columnOrdinal, TdsParserStateObject stateObj)\n
at Microsoft.Data.SqlClient.TdsParser.TrySkipRow(SqlMetaDataSet columns, Int32 startCol, TdsParserStateObject stateObj)\n
at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)\n
at Microsoft.Data.SqlClient.SqlDataReader.TryReadInternal(Boolean setTimeout, Boolean& more)\n
at Microsoft.Data.SqlClient.SqlDataReader.ReadAsync(CancellationToken cancellationToken)\n
--- End of stack trace from previous location where exception was thrown ---\n
at Dapper.SqlMapper.QueryAsync[T](IDbConnection cnn, Type effectiveType, CommandDefinition command) in //Dapper/SqlMapper.Async.cs:line 437\n

To reproduce

Sorry, currently can't reproduce in the local environment, so can't provide more detail to reproduce.

Expected behavior

SQL should execute successfully every time.

Further technical details

Microsoft.Data.SqlClient version: 1.1.3 .NET target: Core 3.1 Operating system: Docker container

What I found/tried

https://stackoverflow.com/questions/57270245/sql-server-dbcommand-timeout-with-net-core-container-under-load

DapperLib/Dapper#1435

Were you able to resolve this?

bennil · 2023-07-05T07:03:40Z

Minimizing Task.Run(...) calls did not solve the problem in my case. Looks like this is a regular threading bug.

Read the other day of a SNI threading bugfix in a ef core preview version (can't find it right now). Hope this will solve this leaking problem soon.

Malgefor · 2023-08-10T09:59:04Z

Still receiving this error quite often on .NET 7 running in a Linux (Debian) container. Any update on what might be the cause?

svap-roshan · 2023-08-10T10:44:19Z

@Malgefor Convert all asynchronous SQL calls into synchronous ones, and everything will be sorted :)

jbogard · 2023-08-16T20:02:53Z

We are also experiencing this issue with Linux App Services against Azure SQL. We only began seeing this after migrating our App Services from Windows to Linux. Also on 5.1.1 of SqlClient.

sliekens · 2023-09-21T23:14:46Z

I spent a few hours fighting this error. In my case, it was really just a slow transaction that timed out (I use the default CommandTimeout).

My app runs in a dotnet 6.0-bullseye-slim Docker container. I use NServiceBus to handle incoming messages, up to 25 concurrently. When a message arrives, I use EF Core 7.0 to fetch a record from the database using SingleOrDefaultAsync and then save edits to the record using SaveChangesAsync. Sometimes this takes longer than the command timeout. It's a bit unfortunate that timeouts result in an Unknown Error, which I guess is a Linux-only problem. (Why?)

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
 ---> System.ComponentModel.Win32Exception (258): Unknown error 258
   at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniError(TdsParserStateObject stateObj, UInt32 error)
   at Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync()
   at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadNetworkPacket()
   at Microsoft.Data.SqlClient.TdsParserStateObject.TryPrepareBuffer()
   at Microsoft.Data.SqlClient.TdsParserStateObject.TryReadByte(Byte& value)
   at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at Microsoft.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
   at Microsoft.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest)
   at Microsoft.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransaction2005(TransactionRequest transactionRequest, String transactionName, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest)
   at Microsoft.Data.SqlClient.SqlInternalTransaction.Commit()
   at Microsoft.Data.SqlClient.SqlTransaction.Commit()
   at System.Data.Common.DbTransaction.CommitAsync(CancellationToken cancellationToken)
--- End of stack trace from previous location ---
   at Microsoft.EntityFrameworkCore.Storage.RelationalTransaction.CommitAsync(CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalTransaction.CommitAsync(CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(IList`1 entriesToSave, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(StateManager stateManager, Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func`4 operation, Func`4 verifySucceeded, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.DbContext.SaveChangesAsync(Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
ClientConnectionId:2762e296-cb69-492f-811e-7f35977086b1
Error Number:-2,State:0,Class:11

The stacktrace doesn't always look the same. Here is another example without ReadSniSyncOverAsync:

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
Operation cancelled by user.
 ---> System.ComponentModel.Win32Exception (258): Unknown error 258
   at Microsoft.Data.SqlClient.SqlConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at Microsoft.Data.SqlClient.SqlInternalConnection.OnError(SqlException exception, Boolean breakConnection, Action`1 wrapCloseInAction)
   at Microsoft.Data.SqlClient.TdsParser.ThrowExceptionAndWarning(TdsParserStateObject stateObj, Boolean callerHasConnectionLock, Boolean asyncClose)
   at Microsoft.Data.SqlClient.TdsParser.TryRun(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj, Boolean& dataReady)
   at Microsoft.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
   at Microsoft.Data.SqlClient.TdsParser.TdsExecuteTransactionManagerRequest(Byte[] buffer, TransactionManagerRequestType request, String transactionName, TransactionManagerIsolationLevel isoLevel, Int32 timeout, SqlInternalTransaction transaction, TdsParserStateObject stateObj, Boolean isDelegateControlRequest)
   at Microsoft.Data.SqlClient.SqlInternalConnectionTds.ExecuteTransaction2005(TransactionRequest transactionRequest, String transactionName, IsolationLevel iso, SqlInternalTransaction internalTransaction, Boolean isDelegateControlRequest)
   at Microsoft.Data.SqlClient.SqlInternalTransaction.Commit()
   at Microsoft.Data.SqlClient.SqlTransaction.Commit()
   at System.Data.Common.DbTransaction.CommitAsync(CancellationToken cancellationToken)
--- End of stack trace from previous location ---
   at Microsoft.EntityFrameworkCore.Storage.RelationalTransaction.CommitAsync(CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalTransaction.CommitAsync(CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Update.Internal.BatchExecutor.ExecuteAsync(IEnumerable`1 commandBatches, IRelationalConnection connection, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(IList`1 entriesToSave, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.ChangeTracking.Internal.StateManager.SaveChangesAsync(StateManager stateManager, Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func`4 operation, Func`4 verifySucceeded, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.DbContext.SaveChangesAsync(Boolean acceptAllChangesOnSuccess, CancellationToken cancellationToken)
ClientConnectionId:711c262b-080f-49bb-aa05-12539b982735
Error Number:-2,State:0,Class:11

A side-effect of this Unknown Error seems to be that the connection is NOT (always?) returned to the connection pool. I think this has also been discussed previously as a side-effect of using sync-over-async, but I don't understand the machinery that leads to the connection remaining open but unavailable for reuse.

Connections not being returned to the pool causes my application to come to almost a complete stop, with 100 open connections left unused. I have confirmed on SQL Server that my application has 100 active sessions, yet my message handlers are all crashing with a timeout due to connection pool starvation.

System.InvalidOperationException: Timeout expired.  The timeout period elapsed prior to obtaining a connection from the pool.  This may have occurred because all pooled connections were in use and max pool size was reached.
   at Microsoft.Data.Common.ADP.ExceptionWithStackTrace(Exception e)
--- End of stack trace from previous location ---
   at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenInternalAsync(Boolean errorsExpected, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenInternalAsync(Boolean errorsExpected, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalConnection.OpenAsync(CancellationToken cancellationToken, Boolean errorsExpected)
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.InitializeReaderAsync(AsyncEnumerator enumerator, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.SqlServer.Storage.Internal.SqlServerExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func`4 operation, Func`4 verifySucceeded, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.Internal.SingleQueryingEnumerable`1.AsyncEnumerator.MoveNextAsync

In the end, I just ended up rewriting my queries and adding indexes, so the timeouts disappeared. But it still seems to me that there is something not really working right in the SqlClient on Linux.

co7e · 2023-11-17T11:25:31Z

This issue has been open for over 3 years and I think that there are probably a lot of people who would be really appreciative of an update on this, even if it is just to say this isn't getting fixed any time soon.

We experienced this issue about a year ago. We ended up shifting our hosting to Windows, at additional cost, to be sure that we wouldn't encounter it again. I came back here a year later expecting (well probably more like hoping) that it would have been resolved, or at least for some sort of update.

IMO, this is a massive issue for a huge number of teams wanting to host on Linux. I'd probably go as far as saying that you shouldn't really be hosting on Linux in a production environment if you are using SqlClient. That's a lot of people and a really big deal. As far as I can tell, any resolutions anyone has above are just workarounds. They are not really fixing the issue. If I'm wrong, please let me know.

Of the above workarounds, is there one that will totally prevent the issue from occurring, please?

Thanks in advance for any feedback.

jperlope · 2023-11-17T11:40:42Z

I would like to share my experience on this. I used to get exactly the same exception/stack trace when running a query. It was odd because running the execution plan it always showed super optimized, it seemed not a performance problem, but often the exception occurred. We had no idea what to do. But since a time ago, we are rebuilding indexes every week, and exception was not raised anymore. My conclusion is, sometimes SQL server was doing something, or some overload in the SQL server but for any reason refreshing indexes reduces these overloads.

MichelZ · 2023-11-17T11:59:06Z

For us, increasing the threadpool threads eliminated this issue. We're still on linux and are not seeing this (anymore)

bailsman · 2023-11-17T13:31:39Z

We also started seeing unknown error 258 after migrating from Windows to Linux App Services, but strangely only on one of our environments.

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
 ---> System.ComponentModel.Win32Exception (258): Unknown error 258
   at Microsoft.Data.SqlClient.SqlCommand.<>c.<ExecuteDbDataReaderAsync>b__209_0(Task`1 result)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location ---
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.<PopulateSplitIncludeCollectionAsync>g__InitializeReaderAsync|27_0[TIncludingEntity,TIncludedEntity](RelationalQueryContext queryContext, RelationalCommandCache relationalCommandCache, IReadOnlyList`1 readerColumns, Boolean detailedErrorsEnabled, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.<>c__DisplayClass30_0`2.<<ExecuteAsync>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementationAsync[TState,TResult](Func`4 operation, Func`4 verifySucceeded, TState state, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementationAsync[TState,TResult](Func`4 operation, Func`4 verifySucceeded, TState state, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func`4 operation, Func`4 verifySucceeded, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.PopulateSplitIncludeCollectionAsync[TIncludingEntity,TIncludedEntity](Int32 collectionId, RelationalQueryContext queryContext, IExecutionStrategy executionStrategy, RelationalCommandCache relationalCommandCache, IReadOnlyList`1 readerColumns, Boolean detailedErrorsEnabled, SplitQueryResultCoordinator resultCoordinator, Func`3 childIdentifier, IReadOnlyList`1 identifierValueComparers, Func`5 innerShaper, Func`4 relatedDataLoaders, INavigationBase inverseNavigation, Action`2 fixup, Boolean trackingQuery)
   at Microsoft.EntityFrameworkCore.Query.RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.TaskAwaiter(Func`1[] taskFactories)
   at Microsoft.EntityFrameworkCore.Query.Internal.SplitQueryingEnumerable`1.AsyncEnumerator.MoveNextAsync()

@MichelZ Out of curiosity, what did you increase the minimum threads to?

MichelZ · 2023-11-17T13:44:49Z

We also started seeing unknown error 258 after migrating from Windows to Linux App Services, but strangely only on one of our environments.

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
 ---> System.ComponentModel.Win32Exception (258): Unknown error 258
   at Microsoft.Data.SqlClient.SqlCommand.<>c.<ExecuteDbDataReaderAsync>b__209_0(Task`1 result)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location ---
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.<PopulateSplitIncludeCollectionAsync>g__InitializeReaderAsync|27_0[TIncludingEntity,TIncludedEntity](RelationalQueryContext queryContext, RelationalCommandCache relationalCommandCache, IReadOnlyList`1 readerColumns, Boolean detailedErrorsEnabled, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.<>c__DisplayClass30_0`2.<<ExecuteAsync>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementationAsync[TState,TResult](Func`4 operation, Func`4 verifySucceeded, TState state, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementationAsync[TState,TResult](Func`4 operation, Func`4 verifySucceeded, TState state, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func`4 operation, Func`4 verifySucceeded, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.PopulateSplitIncludeCollectionAsync[TIncludingEntity,TIncludedEntity](Int32 collectionId, RelationalQueryContext queryContext, IExecutionStrategy executionStrategy, RelationalCommandCache relationalCommandCache, IReadOnlyList`1 readerColumns, Boolean detailedErrorsEnabled, SplitQueryResultCoordinator resultCoordinator, Func`3 childIdentifier, IReadOnlyList`1 identifierValueComparers, Func`5 innerShaper, Func`4 relatedDataLoaders, INavigationBase inverseNavigation, Action`2 fixup, Boolean trackingQuery)
   at Microsoft.EntityFrameworkCore.Query.RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.TaskAwaiter(Func`1[] taskFactories)
   at Microsoft.EntityFrameworkCore.Query.Internal.SplitQueryingEnumerable`1.AsyncEnumerator.MoveNextAsync()

@MichelZ Out of curiosity, what did you increase the minimum threads to?

We went aggressive/overkill, but we haven't seen negative side effects currently.

  ThreadPool.GetMaxThreads(out int workerThreads, out int completionPortThreads);
  ThreadPool.SetMaxThreads(Math.Max(workerThreads, 2048), Math.Max(completionPortThreads, 2048));
  ThreadPool.SetMinThreads(1024, 512);

co7e · 2023-11-17T15:42:12Z

In my last post I guessed the 258 error occurs because of full tcp buffers and wrong tcp socket handling on linux (see my previous post above). Further investigation made me more confident that this is the case.

The additional queue I added (last post) helped a bit, but after adding more features (service implementations) to our host process the error came back quite frequently.

Because most 258 errors occured in DbContext.SaveChanges() I synchronized calls in my DbContext base implementation like this:
public class MyBaseDbContext : DbContext
{
    private static readonly object syncObj = new object();

    public override int SaveChanges()
    {
        lock (syncObj)
        {
            return base.SaveChanges();
        }
    }
}
This dirty code slowed down my execution flow and helped reducing the 258 errors. But like others reported here about disabling MARS it did not solve the problem completly.

I examined that the problem occurs often during extensive external web API rest calls. So I synchronized them as well:
static SemaphoreSlim semaphoreSlimGeocodeLock = new SemaphoreSlim(1, 1);

public void MyRestCallMethod() 
{
    try
    {
        await semaphoreSlimGeocodeLock.WaitAsync();

        HttpClient httpClient = new HttpClient();
        string result = await httpClient.GetStringAsync(url);
        ...

    }
    finally
    {
        httpClient.Dispose();
        semaphoreSlimGeocodeLock.Release();
    }
}
No parallel web calls helped a lot and made me sure the error is related to the current tcp buffer usage within the application.

I guess the error originates in some way like this:

SqlClient reads/writes to the sql server tcp socket connection.

Because of the buffer handling bug the read/write loop returns indicating the remotehost has terminated the connection.

SqlClient internal reconnect feature reconnects immediately to the SQL Server.

On the next tcp read/write the same error occures because the application tcp buffer is still full.

This is repeated many times and there might be a threading issue playing a role as well.

The application is freezed and timeout "Unknwon error 258" occurs

Slowing down the execution flow and hoping not reaching tcp buffer limits is not a production ready workaround :) I wonder if this is a common problem for all high tcp i/o apps running on Linux and connecting to Sql Server?

This error occured some time after upgrading ef core (not sure which .net core/ef core version). My application runs currently on:

Bare metal server: AMD Ryzen 7 2700X

Ubuntu 18.04.2 LTS

.NET 5.0

Microsoft.EntityFrameworkCore.SqlServer 5.0.1

Used SQL Server: Latest SQL Server 2019 Docker container

Side notes:

Upgrading from a two years old SQL Server 2017 Container to the latest 2019 version did not change anything.

I did try collecting traces with dotnet-trace collect --process-id 27455 --profile database
but the results did not include the expected events.

Thought I'd add that when we investigated this a year ago, we felt that analysis from @bennil, above, married up with what we were seeing.

Not sure I'm comfortable with your solution @MichelZ even though you haven't suffered any negative consequences. Great it is working for you though.

bennil · 2023-11-18T09:19:17Z

Thanks @co7e for bringing this matter up again.

My insight after watching/analyzing this for years are the following:

SqlClient is not thread safe on Linux (saw it on Intel and AMD). Reducing parallel work or adjust thread execution might help but not solve the problem. Sooner or later the exception will hit you.

Surprisingly Microsoft doesn't care for years even though a huge number of enterprise software is running on this db access lottery.

The bug is hard to reproduce as all threading issues are. I have some ideas how to construct testcode for further investigations, but this is not a charity project...

dazinator · 2023-11-19T05:44:36Z

We also started seeing unknown error 258 after migrating from Windows to Linux App Services, but strangely only on one of our environments.

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired.  The timeout period elapsed prior to completion of the operation or the server is not responding.
 ---> System.ComponentModel.Win32Exception (258): Unknown error 258
   at Microsoft.Data.SqlClient.SqlCommand.<>c.<ExecuteDbDataReaderAsync>b__209_0(Task`1 result)
   at System.Threading.Tasks.ContinuationResultTaskFromResultTask`2.InnerInvoke()
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
--- End of stack trace from previous location ---
   at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state)
   at System.Threading.Tasks.Task.ExecuteWithThreadLocal(Task& currentTaskSlot, Thread threadPoolThread)
--- End of stack trace from previous location ---
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.RelationalCommand.ExecuteReaderAsync(RelationalCommandParameterObject parameterObject, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.<PopulateSplitIncludeCollectionAsync>g__InitializeReaderAsync|27_0[TIncludingEntity,TIncludedEntity](RelationalQueryContext queryContext, RelationalCommandCache relationalCommandCache, IReadOnlyList`1 readerColumns, Boolean detailedErrorsEnabled, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.<>c__DisplayClass30_0`2.<<ExecuteAsync>b__0>d.MoveNext()
--- End of stack trace from previous location ---
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementationAsync[TState,TResult](Func`4 operation, Func`4 verifySucceeded, TState state, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteImplementationAsync[TState,TResult](Func`4 operation, Func`4 verifySucceeded, TState state, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Storage.ExecutionStrategy.ExecuteAsync[TState,TResult](TState state, Func`4 operation, Func`4 verifySucceeded, CancellationToken cancellationToken)
   at Microsoft.EntityFrameworkCore.Query.RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.PopulateSplitIncludeCollectionAsync[TIncludingEntity,TIncludedEntity](Int32 collectionId, RelationalQueryContext queryContext, IExecutionStrategy executionStrategy, RelationalCommandCache relationalCommandCache, IReadOnlyList`1 readerColumns, Boolean detailedErrorsEnabled, SplitQueryResultCoordinator resultCoordinator, Func`3 childIdentifier, IReadOnlyList`1 identifierValueComparers, Func`5 innerShaper, Func`4 relatedDataLoaders, INavigationBase inverseNavigation, Action`2 fixup, Boolean trackingQuery)
   at Microsoft.EntityFrameworkCore.Query.RelationalShapedQueryCompilingExpressionVisitor.ShaperProcessingExpressionVisitor.TaskAwaiter(Func`1[] taskFactories)
   at Microsoft.EntityFrameworkCore.Query.Internal.SplitQueryingEnumerable`1.AsyncEnumerator.MoveNextAsync()

@MichelZ Out of curiosity, what did you increase the minimum threads to?

Your stack trace doesn't contain Microsoft.Data.SqlClient.TdsParserStateObject.ReadSniSyncOverAsync() - be aware that error 258 is a fairly broad error, see my comment #647 (comment)

bailsman · 2023-12-11T17:21:25Z

In our case the problem has now as mysteriously disappeared as it has mysteriously appeared. The only change we made was migrate to a different virtual machine than the previous SQL Server was running on.

By taking some packet captures by logging in with ssh on the Linux Web App, and typing ip a to get the interface name and then tcpdump -i vnet0g6qkt5hl -w ~/network_traces/0001.pcap we noticed something strange while the problem was still occurring.

A packet containing a SQL query that in the tcpdump was registered with size 2130 bytes and data length 2064 was being "partially acked". After sending the large packet out, the ACK came back only for the first 1398 bytes, causing a retransmission of the last 666 bytes. However, in response to this retransmission only came another ACK for only the first 1398 bytes, causing another retransmission of the last 666 bytes, and so on and so forth, until eventually the connection timed out. Rather than a threading issue this potentially seems like some kind of network layer problem. However, we can no longer reproduce it, so it's difficult to investigate more deeply.

As @dazinator already suggested our particular problem was probably something entirely different than what this ticket is about. The 258 timeout error is fairly general - it just means that some kind of wait operation timed out and it can have multiple different underlying causes. Sorry for the noise.

jaq316 · 2024-02-12T13:19:56Z

In our case the problem has now as mysteriously disappeared as it has mysteriously appeared. The only change we made was migrate to a different virtual machine than the previous SQL Server was running on.

By taking some packet captures by logging in with ssh on the Linux Web App, and typing ip a to get the interface name and then tcpdump -i vnet0g6qkt5hl -w ~/network_traces/0001.pcap we noticed something strange while the problem was still occurring.

A packet containing a SQL query that in the tcpdump was registered with size 2130 bytes and data length 2064 was being "partially acked". After sending the large packet out, the ACK came back only for the first 1398 bytes, causing a retransmission of the last 666 bytes. However, in response to this retransmission only came another ACK for only the first 1398 bytes, causing another retransmission of the last 666 bytes, and so on and so forth, until eventually the connection timed out. Rather than a threading issue this potentially seems like some kind of network layer problem. However, we can no longer reproduce it, so it's difficult to investigate more deeply.

As @dazinator already suggested our particular problem was probably something entirely different than what this ticket is about. The 258 timeout error is fairly general - it just means that some kind of wait operation timed out and it can have multiple different underlying causes. Sorry for the noise.

We have also experienced the same issue.

The solution, for us, was to limit the packet size in the Connection string by appending ;Packet Size=512. We chose 512 as it is the minimum size.

See https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlconnection.packetsize?view=dotnet-plat-ext-8.0 for more info on SqlConnection.PacketSize property.

Vandersteen · 2024-02-12T16:17:37Z

Could this be related ?
https://medium.com/swlh/fix-a-random-network-connection-reset-issue-in-docker-kubernetes-5c57a11de170

bennil · 2024-02-26T08:44:24Z

https://www.codeproject.com/Tips/5378079/Challenges-in-Migrating-ASP-NET-Apps-to-Containers

Maybe after all it relates to MARS? (DB connection string: MultipleActiveResultSets=True)

dazinator · 2024-02-26T23:34:49Z

https://www.codeproject.com/Tips/5378079/Challenges-in-Migrating-ASP-NET-Apps-to-Containers

Maybe after all it relates to MARS? (DB connection string: MultipleActiveResultSets=True)

Not in mine or many other cases. The MARS issue was seperate as far as I know. I use MultipleActiveResultSets=False and still saw this problem.

Note: After moving the work to a seperate dedicated process (and therefore removing contention of the application thread pool in the process) I no longer see this issue. The new process starts - on the same machine, runs the same queries, and terminates as expected. Also, not only are we running this logic in a dedicated process, we have 10 jobs to run, rather than them all being run in application at same time as before, they are now 10 seperate dedicated processes that start then terminate. So this has resulted in 11 seperate processes, each with own thread pool and using own connection.

So it seems to be to do with:-

A ramp up in load - placed on either a connection or the thread pool

dazinator · 2024-02-26T23:45:36Z

I also wondered if Azure SQL has some sort of connection "rejection" mechanism, such that when it scales up or CPU usage is high, it can reject queries with a 258 "immediately" and that perhaps when a connection is in this state, it doesn't clear. A bit like a rate limiting feature. However this is pure speculation, I don't have the networking skills to trace what packets is being sent when this occurs, as our apps run in docker, on linux azure VM's and things get pretty complicated pretty fast.

David-Engel · 2024-05-28T23:27:35Z

We are closing this issue as the error is very generic and applies to several valid failure scenarios that might have different underlying causes including network issues and thread starvation. There is no real fix for this.

cheenamalhotra added this to Needs triage in SqlClient Triage Board via automation Jul 15, 2020

cheenamalhotra moved this from Needs triage to Waiting for customer in SqlClient Triage Board Jul 15, 2020

NickCraver mentioned this issue Aug 19, 2020

Unknown error 258 : The timeout period elapsed prior to completion DapperLib/Dapper#1435

Closed

David-Engel mentioned this issue Jul 5, 2023

An unidentifiable error with code 258 has occurred on an ECS Linux container #2081

Closed

canerdikkollu mentioned this issue Sep 18, 2023

[BUG] Azure Kubernetes .NET Core 5.0 Unknown Error 258 Azure/AKS#3910

Open

dbeavon mentioned this issue Jan 23, 2024

A transport-level error has occurred when receiving results from the server. (provider: TCP Provider, error: 35 - An internal exception was caught) #2103

Closed

ebeeton mentioned this issue Apr 16, 2024

SQL times out when getting image while plot is in progress ebeeton/buddhabrot#5

Open

David-Engel closed this as not planned Won't fix, can't repro, duplicate, stale May 28, 2024

SqlClient Triage Board automation moved this from Under Investigation to Closed May 28, 2024

Execution Timeout Expired Error (258, ReadSniSyncOverAsync) #647

Execution Timeout Expired Error (258, ReadSniSyncOverAsync) #647

Comments

frankyuan commented Jul 12, 2020 • edited Loading

Describe the bug

To reproduce

Expected behavior

Further technical details

What I found/tried

ErikEJ commented Jul 12, 2020

frankyuan commented Jul 12, 2020 • edited Loading

JRahnama commented Jul 13, 2020

frankyuan commented Jul 15, 2020 • edited Loading

JRahnama commented Jul 15, 2020

frankyuan commented Jul 16, 2020 • edited Loading

frankyuan commented Jul 20, 2020 • edited Loading

JRahnama commented Jul 23, 2020

Svisstack commented Aug 27, 2020 • edited Loading

lukas-navratil commented Sep 18, 2020 • edited Loading

frankyuan commented Sep 19, 2020

JRahnama commented Sep 25, 2020

Svisstack commented Sep 25, 2020 via email

mischaherbrand commented Sep 28, 2020

twurm commented Sep 28, 2020

JRahnama commented Sep 28, 2020

twurm commented Sep 28, 2020

Svisstack commented Sep 29, 2020 via email • edited Loading

twurm commented Sep 29, 2020

JRahnama commented Sep 30, 2020

JRahnama commented Oct 16, 2020

Svisstack commented Oct 16, 2020 via email

frankyuan commented Oct 19, 2020 • edited Loading

lukas-navratil commented Oct 21, 2020

lukas-navratil commented Oct 30, 2020

lukas-navratil commented Nov 10, 2020

mrmartan commented Dec 1, 2020 • edited Loading

EvgenyGrishnov commented May 29, 2023

sajidur commented May 29, 2023

tobyreid commented Jun 22, 2023 • edited Loading

svap-roshan commented Jul 4, 2023

Describe the bug

To reproduce

Expected behavior

Further technical details

What I found/tried

bennil commented Jul 5, 2023

Malgefor commented Aug 10, 2023

svap-roshan commented Aug 10, 2023

jbogard commented Aug 16, 2023

sliekens commented Sep 21, 2023

co7e commented Nov 17, 2023 • edited Loading

jperlope commented Nov 17, 2023

MichelZ commented Nov 17, 2023

bailsman commented Nov 17, 2023

MichelZ commented Nov 17, 2023

co7e commented Nov 17, 2023

bennil commented Nov 18, 2023

dazinator commented Nov 19, 2023

bailsman commented Dec 11, 2023

jaq316 commented Feb 12, 2024 • edited Loading

Vandersteen commented Feb 12, 2024

bennil commented Feb 26, 2024

dazinator commented Feb 26, 2024 • edited Loading

dazinator commented Feb 26, 2024

David-Engel commented May 28, 2024

frankyuan commented Jul 12, 2020 •

edited

Loading

frankyuan commented Jul 12, 2020 •

edited

Loading

frankyuan commented Jul 15, 2020 •

edited

Loading

frankyuan commented Jul 16, 2020 •

edited

Loading

frankyuan commented Jul 20, 2020 •

edited

Loading

Svisstack commented Aug 27, 2020 •

edited

Loading

lukas-navratil commented Sep 18, 2020 •

edited

Loading

Svisstack commented Sep 29, 2020 via email •

edited

Loading

frankyuan commented Oct 19, 2020 •

edited

Loading

mrmartan commented Dec 1, 2020 •

edited

Loading

tobyreid commented Jun 22, 2023 •

edited

Loading

co7e commented Nov 17, 2023 •

edited

Loading

jaq316 commented Feb 12, 2024 •

edited

Loading

dazinator commented Feb 26, 2024 •

edited

Loading