Performance Improvements - Logging #2946

Merged · 19 commits · Apr 4, 2023
Conversation

@RohitRanjanMS RohitRanjanMS commented Jan 9, 2023

This PR addresses issue #2842.
Enabling Application Insights has significant performance and throughput overhead. Functions supports adding context to log/metric telemetry using BeginScope(). Each call to BeginScope() creates another level of nesting, so that logs/metrics produced within the scope can be tagged with information about where they came from. While this is done extensively in the host, customers can create scopes as well (.NET in-proc). Nick did an investigation to highlight the performance overhead of using scopes. The overhead is primarily due to the multiple dictionaries allocated to support scopes: one when the scope attributes are passed, another when those attributes are received and wrapped in a ReadOnlyDictionary, and a further set of allocations when the current scope is built by recursively iterating through the nested scopes. A minimal sketch of the pattern that drives these allocations is shown below.
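For illustration, a minimal sketch (function and property names are illustrative, not taken from the host) of the nested-scope pattern whose allocations this PR targets:

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Logging;

class ScopeCostDemo
{
    // Each BeginScope call allocates a state dictionary, and every Log call made
    // inside the scopes walks the nested chain and flattens it into yet another
    // Dictionary<string, object> (wrapped in a ReadOnlyDictionary) before the
    // values reach the telemetry.
    static void Run(ILogger logger, Guid invocationId)
    {
        using (logger.BeginScope(new Dictionary<string, object> { ["FunctionName"] = "MyFunction" }))
        using (logger.BeginScope(new Dictionary<string, object> { ["InvocationId"] = invocationId }))
        {
            logger.LogInformation("Processing invocation {InvocationId}", invocationId);
        }
    }
}
```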

Improvements in this PR

  1. ApplicationInsightsLogger.cs
    a. Removed BeginScope from the Log() method. All the known properties are passed as parameters and added to the telemetry properties. This reduces the level of nesting and also reduces the cost of building scopeInfo.
    b. Reduced the number of times GetMergedStateDictionaryOrNull is called.
    c. Removed LINQ from ApplyScopeProperties().
    d. Avoided some boxing/unboxing and made type casting more efficient.

  2. LogLevelEnumHelper.cs
    A helper that converts the LogLevel enum into a string using a cached string array: no allocation and far cheaper than ToString() (see the first sketch after this list).

  3. WebJobsTelemetryInitializer.cs
    a. Restricted the call to GetMergedStateDictionaryOrNull() to DependencyTelemetry only.
    b. Category, LogLevel, EventId and EventName are set in the ApplicationInsightsLogger.Log() method.

  4. DictionaryLoggerScope.cs
    a. Push() – if the state is already a dictionary, use it directly; otherwise build a new dictionary (see the second sketch after this list).
    b. GetMergedStateDictionaryOrNull() – return null if Current is null.
    c. Cached the scopeInfo and returned it from the cache if it exists. The cache is tied to the current scope.
    d. _itemCount – to avoid dictionary resizing.
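For item 2, a rough sketch of the cached-string-array idea; the actual helper in this PR may differ in naming and shape:

```csharp
using Microsoft.Extensions.Logging;

internal static class LogLevelEnumHelperSketch
{
    // LogLevel is a contiguous enum (Trace = 0 .. None = 6), so its numeric value
    // can index a string array that is allocated exactly once.
    private static readonly string[] Names =
    {
        nameof(LogLevel.Trace),
        nameof(LogLevel.Debug),
        nameof(LogLevel.Information),
        nameof(LogLevel.Warning),
        nameof(LogLevel.Error),
        nameof(LogLevel.Critical),
        nameof(LogLevel.None)
    };

    public static string ToStringOptimized(this LogLevel level)
    {
        int index = (int)level;
        // Common path allocates nothing; fall back to ToString() for unexpected values.
        return (uint)index < (uint)Names.Length ? Names[index] : level.ToString();
    }
}
```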

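And for item 4a, a simplified sketch of reusing the caller's dictionary in Push(); the real DictionaryLoggerScope has more members than shown here:

```csharp
using System.Collections.Generic;
using System.Threading;

internal sealed class DictionaryLoggerScopeSketch
{
    private static readonly AsyncLocal<DictionaryLoggerScopeSketch> _value = new AsyncLocal<DictionaryLoggerScopeSketch>();

    private DictionaryLoggerScopeSketch(IReadOnlyDictionary<string, object> state, DictionaryLoggerScopeSketch parent)
    {
        State = state;
        Parent = parent;
    }

    internal IReadOnlyDictionary<string, object> State { get; }

    internal DictionaryLoggerScopeSketch Parent { get; }

    public static void Push<TState>(TState state)
    {
        // If the caller already provided a dictionary, use it as-is; only fall back
        // to copying key/value pairs into a new dictionary when it did not.
        if (state is IReadOnlyDictionary<string, object> stateDictionary)
        {
            _value.Value = new DictionaryLoggerScopeSketch(stateDictionary, _value.Value);
            return;
        }

        var copy = new Dictionary<string, object>();
        if (state is IEnumerable<KeyValuePair<string, object>> pairs)
        {
            foreach (var pair in pairs)
            {
                copy[pair.Key] = pair.Value;
            }
        }

        _value.Value = new DictionaryLoggerScopeSketch(copy, _value.Value);
    }
}
```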
Benchmarking:
Memory Allocation:

  • Dictionary<string, object> allocation is down by ~99.75%.
  • Resize is almost negligible.

[screenshot: memory-allocation benchmark comparison]

  • Requests/sec and average response time improved by ~8%.

[screenshots: load test results]

Execution time:
[screenshot: execution-time benchmark]

@RohitRanjanMS RohitRanjanMS marked this pull request as draft January 9, 2023 11:40
@RohitRanjanMS RohitRanjanMS marked this pull request as ready for review January 24, 2023 20:36
@RohitRanjanMS RohitRanjanMS self-assigned this Jan 24, 2023
@RohitRanjanMS RohitRanjanMS requested a review from jviau January 24, 2023 21:10
Commit: "…ationInsightsLogger.cs" (Co-authored-by: Jacob Viau <javia@microsoft.com>)
@NickCraver (Member) left a comment:

Changes are looking good - My main concern is around the GetMergedStateDictionaryOrNull shifts - have we done memory profiling on that tree in particular? (I see the total profiling - nice! - I'm only asking about the remainder) I worry with the new approach we may be adding dictionary allocations overall. I think we can scope this in some places but that's messy too - IMO we need another way to go about doing this all together, but I'm not sure where else it's still used looking only at the diff scope of this PR.

Overall: I think there's probably a round of changes here I'm not seeing in GitHub's current version given resolved comments, happy to take a peek at latest if helpful - nice work!

@@ -12,16 +12,16 @@ internal class DictionaryLoggerScope
{
private static AsyncLocal<DictionaryLoggerScope> _value = new AsyncLocal<DictionaryLoggerScope>();

private DictionaryLoggerScope(IReadOnlyDictionary<string, object> state, DictionaryLoggerScope parent)
Member:

Have you looked into getting rid of this type entirely? I wrote this years ago before ExternalScopes were supported and have changed every other instance of logging to use that rather than this. The only holdout is how we access the scope in the TelemetryInitializer, but I never did enough research to figure out how to get around that.

I'd be curious how much improvement we get by using the built-in ScopeProvider, which I assume is optimized for performance...

I'd be interested to see performance reports if this was used (even if it is fairly hacked together for now).

Member Author:

I did try IExternalScopeProvider but it didn't work.

  1. As you mentioned, we need the scope in the TelemetryInitializer for Dependencies. All telemetry items flow through the ApplicationInsightsLogger except dependencies. TelemetryInitializer is the only way I can think of to enrich dependencies with additional information.
  2. IExternalScopeProvider calls an internal BeginScope instead of the one provided by us. We need this for StartTelemetryIfFunctionInvocation. ApplicationInsightsLogger is the only logger that uses BeginScope.

I did some benchmarking against IExternalScopeProvider and it's very efficient and works like magic. Unfortunately, I couldn't use it here.
The proposed changes are efficient as well; we are saving almost 99.74% of allocations even with all the complexities.

@jviau (Contributor) commented Mar 13, 2023:

@RohitRanjanMS can you share your code with the IExternalScopeProvider implementation?

Member Author:

Sure, I don't have a commit, but I can make a quick change and push it to a branch.

Member Author:

I created a branch that uses IExternalScopeProvider. I don't see any issue with BeginScope() now and the only thing that's stopping us from using this is WebJobsTelemetryInitializer.

Member Author:

I did some benchmarking using LoggerFactoryScopeProvider. While the performance is almost the same, memory allocation is significantly lower.

[screenshot: LoggerFactoryScopeProvider benchmark]

Contributor:

Thank you for the branch. I understand your issue now of needing access to the log scope - or rather to a telemetry property bag for multiple forms of telemetry (traces and logs here). I believe this is a concept solved by OTel Baggage, but we don't have access to it here. I don't really like the idea of rolling our own scope and having this discrepancy in how this one ILogger works - but addressing this is not that straightforward.

As for the allocation difference you see - yes, your sample branch is allocating an extra dictionary. You can remove that perf hit by avoiding the dictionary allocation and attaching scope data directly to where it needs to go within the ForEachScope delegate. Additionally, you want to use the state parameter to pass in any variables you access within the ForEachScope delegate to avoid function closures. With those 2 changes you should see significantly fewer allocations.
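A rough, hypothetical sketch of that suggestion (not code from the branch): the telemetry item is passed as the ForEachScope state argument so the delegate captures nothing, and scope values are written straight into the telemetry's property bag:

```csharp
using System.Collections.Generic;
using Microsoft.ApplicationInsights.DataContracts;
using Microsoft.Extensions.Logging;

internal static class ScopeEnrichmentSketch
{
    public static void Apply(IExternalScopeProvider scopeProvider, ISupportProperties telemetry)
    {
        // The telemetry itself is the TState argument, so the lambda can be static
        // (no closure allocation) and no intermediate merged dictionary is built.
        scopeProvider.ForEachScope(static (scope, state) =>
        {
            if (scope is IEnumerable<KeyValuePair<string, object>> pairs)
            {
                foreach (var pair in pairs)
                {
                    state.Properties[pair.Key] = pair.Value?.ToString();
                }
            }
        }, telemetry);
    }
}
```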

@NickCraver (Member) left a comment:

Current code looks good from an efficiency standpoint - I'm only reviewing in text so limited here and profiles are very much appreciated!

My only suggestion here would be: do we want to beef up the testing slightly specifically on scopes/expected values? There's enough of a change here I'm not sure existing tests cover all the cases that can happen with overrides, what if the value set in an override is null, etc - just from a 1:1 behavior with the old for nested properties.

@RohitRanjanMS (Member Author):
I will have a look at test cases and add more if we don't have enough coverage.

@jviau (Contributor) left a comment:

Small nits and some perf suggestions.

My primary concern is the way this makes our one ApplicationInsightsLogger unique. I worry that we have two ways log scopes are done, and it is not clear if you need to use ILogger.BeginScope or this other DictionaryLoggerScope. I get that we are trying to bring log scope properties over to distributed tracing (AI dependencies here). Did we explore if Activity.Tags would work here? Or is there some other prescribed pattern for having a general "telemetry scope" instead of a log scope?

I am approving this as the PR goal is performance and the DictionaryLoggerScope deviation (which is my primary concern) predates this PR.
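For reference, a minimal, hypothetical sketch of the Activity.Tags alternative mentioned above (not what this PR implements); tag names are illustrative:

```csharp
using System.Diagnostics;

internal static class ActivityTagsSketch
{
    // Instead of a custom log scope, properties could be attached to the ambient
    // Activity. Whether these tags end up on Application Insights dependency
    // telemetry depends on how the SDK's collectors map Activity data, so this is
    // only an illustration of the pattern, not a drop-in replacement.
    public static void Annotate(string functionName, string invocationId)
    {
        Activity.Current?.AddTag("FunctionName", functionName);
        Activity.Current?.AddTag("InvocationId", invocationId);
    }
}
```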

