-
Notifications
You must be signed in to change notification settings - Fork 4.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
System.Diagnostics.Activity Perf Improvement Part 2 #40544
Conversation
Tagging subscribers to this area: @tommcdon |
@@ -255,9 +265,6 @@ public string DisplayName | |||
/// </summary> | |||
public IEnumerable<KeyValuePair<string, object?>> TagObjects | |||
{ | |||
#if ALLOW_PARTIALLY_TRUSTED_CALLERS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tarekgh I removed the SecuritySafeCritical
. Assuming this is no longer needed because we removed the Unsafe.As
on the last PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make sense. If you run the tests locally should catch any issue with the security attributes when running against NetFX.
@@ -76,10 +76,20 @@ public partial class Activity : IDisposable | |||
|
|||
private byte _w3CIdFlags; | |||
|
|||
/* | |||
* Note: | |||
* DynamicDependency is used here to prevent the linker from removing the struct GetEnumerator on the internal types. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking more at this code, you could also change the Enumerator implementations like the following:
public Enumerator<T> GetEnumerator() => new Enumerator<T>(_first);
IEnumerator<T> IEnumerable<T>.GetEnumerator() => GetEnumerator();
IEnumerator IEnumerable.GetEnumerator() => GetEnumerator();
And the public Enumerator<T> GetEnumerator()
method would be preserved, since now it is being called. This would be more preferrable, since if we were linking an application, preserving ALL public methods on these types is more than necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good call. I updated the code. Should have just done that initially, would have saved myself a lot of time 🤣 Learned a bit about the linker though so not totally wasted 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may make sense to add a test to ensure the public Enumerator<T> GetEnumerator()
method is preserved, so OpenTelemetry can invoke it through reflection.
@@ -1273,8 +1270,8 @@ public void Add(T value) | |||
} | |||
|
|||
public Enumerator<T> GetEnumerator() => new Enumerator<T>(_first); | |||
IEnumerator<T> IEnumerable<T>.GetEnumerator() => new Enumerator<T>(_first); | |||
IEnumerator IEnumerable.GetEnumerator() => new Enumerator<T>(_first); | |||
IEnumerator<T> IEnumerable<T>.GetEnumerator() => GetEnumerator(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor nit - you may want to add a comment here and below describing the situation here - "in order to preserve the non-allocating GetEnumerator() so OpenTelemetry can call it..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comments & tests.
@CodeBlanch could you please re-run the perf tests again with the change? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me. Thanks for following up on this.
@tarekgh Here's the perf on the latest changes. I added a benchmark for ActivityLinkTags which is testing the
|
|
||
bool foundGetEnumerator = false; | ||
foreach (MethodInfo method in enumerable.GetType().GetMethods(BindingFlags.Public | BindingFlags.Instance | BindingFlags.NonPublic)) | ||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need foreach here. yu can request the method name directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
@tarekgh I just removed the link. The comment seemed to have enough info already? |
@CodeBlanch - thanks for doing this! Can you also add some baseline numbers for comparison (or point out which numbers above were baseline if I missed it?). My goal is to demonstrate that the code change improved performance because new numbers are better than baseline numbers. [EDIT]: nevermind, I see the baseline in the other issue |
Here is the numbers when running same exact tests on my machine: Baseline
Updated
Gain & Losspositive is gain and negative is loss in the perf.
Looks we are gaining in general in the memory allocation but seeing some regression in the speed. mainly the tags scenario:
It is a good memory allocation gain 14% but looks we are losing almost 10% of the speed. we need to weigh here the cost benefit of this change to decide if we can scrify this 10% speed for the 14% memory? or we may look to see if we can enhance more the enumeration speed somehow? |
I was looking at the decompiled code before & after. Before we didn't support Reset of the enumerator, so I dropped that. Micro-micro optimization there, but saves on a field and a set. I ran perf for that, but I think it's just getting lost in the noise at that point. In @tarekgh's numbers above, check out these: Baseline:
Updated
What's interesting about those numbers is, there wasn't a change in those code paths! EnumerateActivityTags before & after both call I'm down to continuing to work on this, but it feels like we're running into noise we can't avoid? @tarekgh @noahfalk Let me know what you want me to do! Some extra thoughts...
|
@CodeBlanch I'll try to re-run it again with your last change. I doubt there is a noises here but this will show up with my next run. Could be the difference between my results and yours is the baseline version. I am using |
I played more with measuring, and I am seeing the noises that @CodeBlanch talked about. So, I have written a test case for EnumerateActivityTagObjects which uses static created Activity object with tags. running my modified test, I am seeing the memory allocations gain is I have a couple of suggestion if we can apply:
|
Thanks for the help on this @tarekgh!
This is what's weird with the benchmarks. I didn't change anything for the public IEnumerable<KeyValuePair<string, string?>> Tags
{
get => _tags?.EnumerateStringValues() ?? s_emptyBaggageTags;
}
public IEnumerable<KeyValuePair<string, string?>> EnumerateStringValues()
{
LinkedListNode<KeyValuePair<string, object?>>? current = _first;
while (current != null)
{
if (current.Value.Value is string || current.Value.Value == null)
{
yield return new KeyValuePair<string, string?>(current.Value.Key, (string?)current.Value.Value);
}
current = current.Next;
};
} Doesn't use the new |
@CodeBlanch you are right, I have looked at the implementation of Activity.Tags in the baseline and your changes and they identical. I even compared the IL code. maybe this could be a JIT change somehow? we may compare without your changes at all and look at the numbers. If you don't have time to try that, I can try it tomorrow. just build a private without your changes and run the perf test on it. |
I did more experiment but building the runtime without your change (as the baseline) and then built it with your change and now the numbers make more sense: Baseline
your change
I think there is no real CPU perf regressions in @noahfalk will be around tomorrow if he has any comment before we merge this |
@tarekgh Awesome! Thanks for the assist. I'll open a PR for the perf tests a bit later. Did you make any changes to the actual benchmarks I should include on that? |
I just added [Benchmark]
public void EnumTagObjectsModified()
{
foreach (var t in s_activity.TagObjects)
{
bool b = t.Value.GetType() == typeof(string); // I know this is not the best :-(
}
}
private static Activity s_activity = InitActivity();
private static Activity InitActivity()
{
ActivitySource aSource = new ActivitySource("TestActivitySource-mine");
ActivityListener al= new ActivityListener
{
ShouldListenTo = s => s.Name == "TestActivitySource-mine",
GetRequestedDataUsingContext = (ref ActivityCreationOptions<ActivityContext> o) => ActivityDataRequest.AllDataAndRecorded
};
ActivitySource.AddActivityListener(al);
Activity a = aSource.StartActivity(
"TestActivity",
ActivityKind.Internal,
parentContext: default,
tags: new Dictionary<string, object>
{
["tag1"] = "string1",
["tag2"] = 1,
["tag3"] = "string2",
["tag4"] = false,
["tag5"] = 2.4,
["tag6"] = new Object(),
});
return a;
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@noahfalk will be around tomorrow if he has any comment before we merge this
Thanks for the diligence, I'm happy if you guys are. If we are getting requests to improve perf we can continue tweaking in 6.0. Fwiw when benchmarking at this level controlling for noise is hard, for example code alignment
Updated the public code to use the internal
struct GetEnumerator
method to prevent the linker from removing it.See #40362
/cc @tarekgh @noahfalk @cijothomas @eerhardt