NRE in Kestrel Telemetry EventSources #44817

Open
Tratcher opened this issue Nov 1, 2022 · 3 comments
Labels
area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions feature-kestrel

Comments

@Tratcher
Member

Tratcher commented Nov 1, 2022

Mirror of dotnet/runtime#77434

EventSource.IsEnabled() may start returning true before OnEventCommand completes.
We initialize shared EventCounter instances in OnEventCommand and then use them from instrumented code paths.

If the EventSource is initialized before being enabled, and the instance is accessed from multiple threads, a thread could see IsEnabled() == true and then read a null EventCounter instance.
This race condition only appears once per process.

I believe this is the cause of an exception YARP hit in CI:

System.NullReferenceException : Object reference not set to an instance of an object.
   at System.Net.Http.HttpTelemetry.Http11RequestLeftQueue(Double timeOnQueueMilliseconds)
   at System.Net.Http.HttpConnectionPool.GetHttp11ConnectionAsync(HttpRequestMessage request, Boolean async, CancellationToken cancellationToken)

We should fix these cases.
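
To make the race concrete, here is a minimal sketch of the pattern described above together with one defensive way to read the counter. `SampleTelemetry`, `RequestLeftQueue`, and the counter name are hypothetical and exist only for illustration; this is not the code in System.Net.Http or Kestrel, and not necessarily the fix that was adopted.

```csharp
#nullable enable
using System.Diagnostics.Tracing;
using System.Threading;

// Usage: with no listener attached the counter is still null and the call is a no-op.
SampleTelemetry.Log.RequestLeftQueue(12.5);

// Hypothetical EventSource used only to illustrate the race.
[EventSource(Name = "Sample-Telemetry")]
public sealed class SampleTelemetry : EventSource
{
    public static readonly SampleTelemetry Log = new SampleTelemetry();

    // Created lazily when the source is first enabled.
    private EventCounter? _queueDuration;

    protected override void OnEventCommand(EventCommandEventArgs command)
    {
        if (command.Command == EventCommand.Enable)
        {
            // Per the issue, IsEnabled() may already return true on other
            // threads before this assignment has happened.
            _queueDuration ??= new EventCounter("queue-duration", this);
        }
    }

    [NonEvent]
    public void RequestLeftQueue(double timeOnQueueMilliseconds)
    {
        // Racy version (can throw NullReferenceException):
        //   if (IsEnabled()) _queueDuration!.WriteMetric(timeOnQueueMilliseconds);
        //
        // Defensive version: read the field once and skip the write if the
        // counter has not been created yet.
        EventCounter? counter = Volatile.Read(ref _queueDuration);
        counter?.WriteMetric(timeOnQueueMilliseconds);
    }
}
```

The trade-off in this sketch is that a measurement arriving during the enable window is dropped rather than causing an exception.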

cc: @Tratcher aspnetcore has 2 such cases as well:


An internal customer hit something similar:
[Screenshots: MicrosoftTeams-image (1), MicrosoftTeams-image (2)]

@adityamandaleeka
Member

cc: @davmason

@adityamandaleeka adityamandaleeka added this to the .NET 8 Planning milestone Nov 2, 2022
@ghost

ghost commented Nov 2, 2022

Thanks for contacting us.

We're moving this issue to the .NET 8 Planning milestone for future evaluation / consideration. We would like to keep this around to collect more feedback, which can help us with prioritizing this work. We will re-evaluate this issue during our next planning meeting(s).
If we later determine that the issue has no community involvement, or that it is a very rare and low-impact issue, we will close it so that the team can focus on more important and high-impact issues.
To learn more about what to expect next and how this issue will be handled you can read more about our triage process here.

@davmason
Member

davmason commented Nov 2, 2022

This looks like the same underlying cause as the issue fixed in dotnet/runtime#76965.

EventSource has always called DoCommand in the base EventSource constructor (before any derived class constructors are run); we've just happened not to run into it so far because most of our internal EventSources have a static Log field and were being touched before they were enabled.

If you enable an EventSource via the pause on startup feature, then it is enabled before any managed code runs and will always hit this issue. We solved it in MetricsEventSource by moving initialization outside of the constructor.
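
As a rough illustration of why constructor-body initialization is fragile here: C# field initializers run before the base class constructor, while the derived constructor body runs after it. The sketch below relies on that ordering. `SampleCommandSource` is a hypothetical type, and this is not a description of what MetricsEventSource actually does.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Diagnostics.Tracing;

// Usage: touching Log constructs the source.
Console.WriteLine(SampleCommandSource.Log.Name);

// Hypothetical EventSource used only to illustrate initialization order.
[EventSource(Name = "Sample-Commands")]
public sealed class SampleCommandSource : EventSource
{
    public static readonly SampleCommandSource Log = new SampleCommandSource();

    // Field initializers run before the base EventSource constructor, so this
    // list already exists if OnEventCommand fires during base construction.
    private readonly List<EventCommand> _commands = new List<EventCommand>();

    private SampleCommandSource()
    {
        // Anything assigned here would still be null or default if
        // OnEventCommand ran from inside the base constructor.
    }

    protected override void OnEventCommand(EventCommandEventArgs command)
    {
        // Safe to use _commands here regardless of when the command arrives.
        lock (_commands)
        {
            _commands.Add(command.Command);
        }
    }
}
```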

@amcasey amcasey added area-networking Includes servers, yarp, json patch, bedrock, websockets, http client factory, and http abstractions and removed area-runtime labels Jun 2, 2023