Error codes for timeout different per operating system #1902

Closed
joshbartley opened this issue Jan 25, 2023 · 3 comments
Labels
🎨 By Design Issues due to driver feature design and will not be fixed.

Comments

@joshbartley

Describe the bug

I'm setting up Polly with SqlClient and using error codes to decide whether a request should be retried. I'm testing resiliency against a network failure caused by a BGP event that takes 2-3 minutes to resolve. I'm using the error codes because they are numbers rather than strings, but they appear to differ per operating system. My only guess is that once a connection is in the pool, connecting through a pooled connection that hits a TCP timeout produces an error code of 0. Even then, it should return a meaningful number instead of 0.

Windows 10
If I point to a server that is not accessible, I get
"A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible."
Error Code: 53

Ubuntu 22.04
If I point to a server that is not accessible, I get
"A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible."
Error Code: 0

Exception message:
A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: TCP Provider, error: 35 - An internal exception was caught)

Stack trace:
   at Microsoft.Data.ProviderBase.DbConnectionPool.TryGetConnection(DbConnection owningObject, UInt32 waitForMultipleObjectsTimeout, Boolean allowCreate, Boolean onlyOneCheckConnection, DbConnectionOptions userOptions, DbConnectionInternal& connection)

To reproduce

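using (var sqlConn = new SqlConnection(connectionString))
{
    // Open() is where the exception is thrown when the server is unreachable.
    sqlConn.Open();
    using (var cmd = new SqlCommand("SELECT 1;", sqlConn))
    {
        cmd.CommandType = System.Data.CommandType.Text;
        // Dispose the reader as well as the command and connection.
        using (var rdr = cmd.ExecuteReader()) { }
    }
}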

using (var sqlConn = new SqlConnection(connectionString))
{
    await sqlConn.ExecuteAsyncWithRetry("SELECT 1");
}

The Dapper/Polly code is from https://gist.github.com/hyrmn/ce124e9b1f50dbf9d241390ebc8f6df3, with error code 53 added to the case statement.
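For readers without the gist at hand, here is a minimal sketch of what a number-based retry policy can look like. It is not the gist's code: the class name `SqlRetryPolicies`, the method `CreateConnectRetryPolicy`, the retry count, and the backoff delays are all hypothetical, and it assumes Polly v7 and Microsoft.Data.SqlClient are referenced. It only illustrates why the policy depends on `SqlException.Number` being the same on every platform.

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Data.SqlClient;
using Polly;
using Polly.Retry;

public static class SqlRetryPolicies
{
    // Hypothetical allow-list: 53 is the number reported on Windows in this issue.
    private static readonly HashSet<int> RetryableNumbers = new HashSet<int> { 53 };

    // Builds an async policy that retries only when SqlException.Number is in the allow-list.
    public static AsyncRetryPolicy CreateConnectRetryPolicy(int maxAttempts = 3)
    {
        return Policy
            .Handle<SqlException>(ex => RetryableNumbers.Contains(ex.Number))
            .WaitAndRetryAsync(
                maxAttempts,
                // Illustrative exponential backoff: 2s, 4s, 8s, ...
                attempt => TimeSpan.FromSeconds(Math.Pow(2, attempt)));
    }
}
```

A caller would wrap the connection attempt in `policy.ExecuteAsync(...)`. With this predicate, an exception whose `Number` is 0 is not retried, which matches the behavior difference described for Ubuntu.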

Expected behavior

I would expect SqlException.Errors to contain an error code of 53, not 0 as it does on a Linux machine. SqlException.Number should be 53 as well, though I may not know all the details of how that is set.

Further technical details

Microsoft.Data.SqlClient version: 5.0.1
.NET target: 6.0
SQL Server version: SQL Server 2019
Operating system: Windows 10, Ubuntu 22.04

@lcheunglci lcheunglci added the 🆕 Triage Needed For new issues, not triaged yet. label Jan 25, 2023
@lcheunglci
Contributor

lcheunglci commented Jan 25, 2023

Thanks @joshbartley for bringing this to our attention and providing a repro sample. I feel that it's similar to issue #1773, except that in your case the exception message is still the same but the error code differs. We'll take a look soon.

@lcheunglci lcheunglci added 🎨 By Design Issues due to driver feature design and will not be fixed. and removed 🆕 Triage Needed For new issues, not triaged yet. labels Jan 31, 2023
@lcheunglci
Contributor

Error 53 comes from the operating system on Windows. On Linux, an error code of 0 means that we were not able to resolve the error; the value depends on the .NET API that we use, so it is by design that we cannot capture the same error code across platforms.
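Given that explanation, one possible workaround is to classify on the numbers when they are available and to treat 0 as "unresolved" only in a context where a connection failure is the likely cause. The sketch below is an assumption-laden illustration, not driver guidance: `ConnectionErrorClassifier`, `IsConnectFailure`, and the `duringOpen` flag are hypothetical names, and treating 0 as a connection failure is a judgment call that may misclassify other errors. It works on plain integers so it runs without a database; with a real exception the numbers would come from `SqlException.Errors`.

```csharp
using System;
using System.Collections.Generic;

public static class ConnectionErrorClassifier
{
    // Hypothetical set of numbers meaning "could not reach the server".
    // 53 is the value seen on Windows in this issue.
    private static readonly HashSet<int> KnownConnectNumbers = new HashSet<int> { 53 };

    // Assumption: a number of 0 counts as a connection failure only when the
    // caller knows the exception was raised while opening the connection.
    public static bool IsConnectFailure(IEnumerable<int> errorNumbers, bool duringOpen)
    {
        foreach (int number in errorNumbers)
        {
            if (KnownConnectNumbers.Contains(number)) return true;
            if (number == 0 && duringOpen) return true;
        }
        return false;
    }
}

public static class Program
{
    public static void Main()
    {
        // With a real SqlException ex, the numbers could be collected with:
        //   ex.Errors.Cast<SqlError>().Select(e => e.Number)
        Console.WriteLine(ConnectionErrorClassifier.IsConnectFailure(new[] { 53 }, duringOpen: true));  // True
        Console.WriteLine(ConnectionErrorClassifier.IsConnectFailure(new[] { 0 }, duringOpen: true));   // True
        Console.WriteLine(ConnectionErrorClassifier.IsConnectFailure(new[] { 0 }, duringOpen: false));  // False
    }
}
```

Passing the call context explicitly keeps the ambiguous 0 from being treated as retryable everywhere, at the cost of the caller having to know where the exception was thrown.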

@joshbartley
Author

joshbartley commented Jan 31, 2023

This is not great and is a sign of a leaky abstraction. When writing a cross-platform app, I have to check error strings instead of an error code for a can't-connect message? How is exposing an error code of 0 helpful at all? This would be like an HTTP client returning status code 404 on Windows but 0 on Linux for not found. This has downstream implications for Entity Framework's own retry logic, which relies on error codes as well.

2 participants