An unidentifiable error with code 258 has occurred on an ECS Linux container #2081

Closed
svap-roshan opened this issue Jul 5, 2023 · 4 comments
Labels
Duplicate This issue or pull request already exists

Comments

@svap-roshan

Describe the bug

My application ran properly on an IIS server. After I hosted it in an ECS Fargate Linux container and ran a load test, the ECS task crashed with the following error:

Microsoft.Data.SqlClient.SqlException (0x80131904): Execution Timeout Expired. The timeout period elapsed before the operation could be completed, or the server is not responding.

Also, I occasionally encounter the following error message:

System.InvalidOperationException: Timeout expired. The timeout period elapsed before obtaining a connection from the pool. This could be due to all pooled connections being in use and the maximum pool size being reached.

My query uses a proper index and executes in about 20 ms, well within its 5-second timeout, so the query itself is not the cause of the issue.

Here is my connection string:

Data Source=x.x.x.x;Initial Catalog=DbName;User ID=Username;Password=mypassword;TrustServerCertificate=True

Previously, I had "MultipleActiveResultSets=True" in my connection string, but even after removing it, I still encounter the same error.

Here is my Dockerfile:

FROM mcr.microsoft.com/dotnet/aspnet:7.0 AS base
WORKDIR /app
EXPOSE 80
# copy csproj and restore as distinct layers
FROM mcr.microsoft.com/dotnet/sdk:7.0 AS build
WORKDIR /src
COPY *.sln .
COPY ["PdfApp/PdfApp.csproj", "PdfApp/"]
RUN dotnet restore "PdfApp/PdfApp.csproj"
# copy and publish app and libraries
COPY . .
WORKDIR "/src/PdfApp"
RUN dotnet build "PdfApp.csproj" -c Release -o /app/build
FROM build AS publish
RUN dotnet publish "PdfApp.csproj" -c Release -o /app/publish /p:UseAppHost=false
FROM base AS final
WORKDIR /app
COPY --from=publish /app/publish .
ENTRYPOINT ["dotnet", "PdfApp.dll"]

Further technical details

I am using .NET 7.0 with Microsoft.Data.SqlClient 5.1.1, and the database server is Microsoft SQL Server 2019.

Microsoft.Data.SqlClient version: 5.1.1
.NET target: .NET 7.0
SQL Server version: SQL Server 2019
Operating system: Linux Docker container

@JRahnama JRahnama added this to Needs triage in SqlClient Triage Board via automation Jul 5, 2023
@David-Engel
Contributor

This looks like a duplicate of #647. The underlying issue is that a couple of the async code paths in MDS (Microsoft.Data.SqlClient) can block threads under high load, so the system thread pool runs out of threads to serve other async commands in a timely manner. That results in those timeouts. Some workarounds are mentioned in #647, including increasing the minimum thread count in the system thread pool.

The reason this doesn't happen on Windows (or is much more rare, at least) is because Windows utilizes a native SNI implementation where the socket reads/writes happen. The native library uses IOCP for IO completions, which doesn't use the .NET system thread pool. Linux/macOS use a pure managed SNI implementation that does use the .NET system thread pool for completion requests.

@svap-roshan
Author

@David-Engel I apologize for not understanding clearly. Could you please provide more specific details on how to increase the minimum thread count? Are you referring to making changes in the application code? Also, could you guide me on how to check the default value for the minimum thread count? Alternatively, would it be possible to solve this issue by increasing the CPU core allocation for the task?

@David-Engel
Contributor

Use the .NET ThreadPool APIs: https://learn.microsoft.com/en-us/dotnet/api/system.threading.threadpool.setminthreads?view=net-7.0

Yes, this would need to be done in application code. The default values are described in the above article. No, changing the CPU core allocation wouldn't make a significant difference in this scenario. By "high-load", I'm specifically referring to a high rate of parallel calls into MDS, like connection open. It's not necessarily CPU bound. The system thread pool can be slow to add new threads (seconds for each new thread) while connection open requests are blocking each other and approach their timeout waiting on the connection pool to hand them a connection.
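For illustration, a minimal sketch of how the ThreadPool APIs mentioned above might be used at application startup. The class name `ThreadPoolSetup`, the method `RaiseMinThreads`, and the value 100 in the usage note are hypothetical placeholders, not part of Microsoft.Data.SqlClient or a recommended setting; a suitable minimum depends on the workload and should be found by load testing.

```csharp
using System;
using System.Threading;

// Hypothetical helper (not part of any library): inspects the current
// thread pool minimums and raises them if the requested values are higher.
public static class ThreadPoolSetup
{
    public static bool RaiseMinThreads(int minWorker, int minIo)
    {
        // Read the current minimums; this is also how to check the defaults.
        ThreadPool.GetMinThreads(out int curWorker, out int curIo);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIo);
        Console.WriteLine($"Current min threads: worker={curWorker}, io={curIo}");

        // Never lower an existing minimum, and never exceed the maximum,
        // because SetMinThreads rejects values above the configured maximum.
        int worker = Math.Min(Math.Max(minWorker, curWorker), maxWorker);
        int io = Math.Min(Math.Max(minIo, curIo), maxIo);

        // Returns false if the runtime rejected the change.
        return ThreadPool.SetMinThreads(worker, io);
    }
}
```

This would typically be called once, early in `Program.cs` before the web host starts handling requests, for example `ThreadPoolSetup.RaiseMinThreads(100, 100);`. Raising the minimum lets the pool create threads up to that count on demand without the usual throttled ramp-up, at the cost of potentially more threads than cores under bursty load.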

@Kaur-Parminder Kaur-Parminder added Duplicate This issue or pull request already exists and removed untriaged labels Jul 6, 2023
@Kaur-Parminder
Contributor

Closing as a duplicate of #647.

SqlClient Triage Board automation moved this from Needs triage to Closed Jul 6, 2023