Metric AggregatorStore optimization for sorting Tag keys #2777

Contributor

@utpilla utpilla commented Jan 11, 2022

Fixes item 1 of #2374

Changes

  • Add another ConcurrentDictionary<string[], string[]> named tagKeyCombinations that maps each unsorted tag-key combination to its sorted combination
  • First look up this new dictionary to see whether we already have the sorted combination. If it is already present, we can avoid sorting the tag keys. If not, we sort the tag keys and add the result to this new dictionary
  • Use the sorted tag-key combination to find the distinct tag-value combinations and the MetricPoint[] index
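The lookup described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `TagKeyCache`, `GetSortedKeys`, `SortCount`, and `SequenceComparer` are hypothetical names introduced here, and the comparer is assumed to treat two key arrays as equal only when they hold the same strings in the same order.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

// Hypothetical comparer: arrays are equal when they contain the same strings in the same order.
public sealed class SequenceComparer : IEqualityComparer<string[]>
{
    public bool Equals(string[] x, string[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null || x.Length != y.Length) return false;
        for (int i = 0; i < x.Length; i++)
        {
            if (!string.Equals(x[i], y[i], StringComparison.Ordinal)) return false;
        }
        return true;
    }

    public int GetHashCode(string[] keys)
    {
        int hash = 17;
        foreach (string key in keys)
        {
            hash = unchecked((hash * 31) + (key == null ? 0 : key.GetHashCode()));
        }
        return hash;
    }
}

// Hypothetical cache: maps a tag-key combination as given to its sorted form.
public sealed class TagKeyCache
{
    private readonly ConcurrentDictionary<string[], string[]> tagKeyCombinations =
        new ConcurrentDictionary<string[], string[]>(new SequenceComparer());

    private int sortCount;

    // Number of times a sort was actually performed (for illustration only).
    public int SortCount => this.sortCount;

    public string[] GetSortedKeys(string[] tagKeys)
    {
        // Hot path: the combination was seen before, so no sorting is needed.
        if (this.tagKeyCombinations.TryGetValue(tagKeys, out string[] sorted))
        {
            return sorted;
        }

        // Slow path: copy the keys (the caller may reuse its array), sort the copy, and cache it.
        var givenOrder = (string[])tagKeys.Clone();
        sorted = (string[])tagKeys.Clone();
        Array.Sort(sorted, StringComparer.Ordinal);
        Interlocked.Increment(ref this.sortCount);
        this.tagKeyCombinations.TryAdd(givenOrder, sorted);
        return sorted;
    }
}
```

In this sketch, every distinct ordering of the same keys gets its own entry, which is the growth concern raised in the review comments below.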

Benchmark Results

There is almost a 50% perf improvement once there is more than one tag key (roughly 42–51% in the tables below). The single-label case gets about 21% slower.

// * Summary *

BenchmarkDotNet=v0.13.1, OS=Windows 10.0.22000
Intel Core i7-9700 CPU 3.00GHz, 1 CPU, 8 logical and 8 physical cores
.NET SDK=6.0.101
[Host] : .NET Core 3.1.22 (CoreCLR 4.700.21.56803, CoreFX 4.700.21.57101), X64 RyuJIT
DefaultJob : .NET Core 3.1.22 (CoreCLR 4.700.21.56803, CoreFX 4.700.21.57101), X64 RyuJIT

Without the change:

| Method | AggregationTemporality | Mean | Error | StdDev | Allocated |
|---|---|---:|---:|---:|---:|
| CounterHotPath | Cumulative | 37.78 ns | 0.428 ns | 0.401 ns | - |
| CounterWith1LabelsHotPath | Cumulative | 97.82 ns | 0.291 ns | 0.258 ns | - |
| CounterWith3LabelsHotPath | Cumulative | 435.29 ns | 3.766 ns | 3.523 ns | - |
| CounterWith5LabelsHotPath | Cumulative | 611.81 ns | 2.549 ns | 2.259 ns | - |
| CounterWith6LabelsHotPath | Cumulative | 720.06 ns | 7.473 ns | 6.625 ns | - |
| CounterWith7LabelsHotPath | Cumulative | 811.55 ns | 4.374 ns | 3.877 ns | - |
| CounterHotPath | Delta | 37.24 ns | 0.169 ns | 0.142 ns | - |
| CounterWith1LabelsHotPath | Delta | 97.78 ns | 0.472 ns | 0.394 ns | - |
| CounterWith3LabelsHotPath | Delta | 446.67 ns | 4.068 ns | 3.805 ns | - |
| CounterWith5LabelsHotPath | Delta | 620.58 ns | 5.938 ns | 5.554 ns | - |
| CounterWith6LabelsHotPath | Delta | 726.85 ns | 11.819 ns | 14.514 ns | - |
| CounterWith7LabelsHotPath | Delta | 825.72 ns | 3.472 ns | 3.078 ns | - |

With the change:

| Method | AggregationTemporality | Mean | Error | StdDev | Allocated |
|---|---|---:|---:|---:|---:|
| CounterHotPath | Cumulative | 38.32 ns | 0.425 ns | 0.397 ns | - |
| CounterWith1LabelsHotPath | Cumulative | 118.52 ns | 0.655 ns | 0.581 ns | - |
| CounterWith3LabelsHotPath | Cumulative | 223.34 ns | 1.932 ns | 1.807 ns | - |
| CounterWith5LabelsHotPath | Cumulative | 354.63 ns | 2.737 ns | 2.561 ns | - |
| CounterWith6LabelsHotPath | Cumulative | 402.70 ns | 3.946 ns | 3.692 ns | - |
| CounterWith7LabelsHotPath | Cumulative | 452.08 ns | 3.191 ns | 2.985 ns | - |
| CounterHotPath | Delta | 37.78 ns | 0.251 ns | 0.235 ns | - |
| CounterWith1LabelsHotPath | Delta | 118.62 ns | 0.752 ns | 0.704 ns | - |
| CounterWith3LabelsHotPath | Delta | 219.84 ns | 3.252 ns | 3.042 ns | - |
| CounterWith5LabelsHotPath | Delta | 355.83 ns | 4.844 ns | 4.531 ns | - |
| CounterWith6LabelsHotPath | Delta | 405.35 ns | 2.911 ns | 2.723 ns | - |
| CounterWith7LabelsHotPath | Delta | 450.32 ns | 2.457 ns | 2.298 ns | - |

@utpilla utpilla requested a review from a team January 11, 2022 05:42
codecov bot commented Jan 11, 2022

Codecov Report

Merging #2777 (b62c0f9) into main (aad309b) will increase coverage by 0.18%.
The diff coverage is 82.75%.

@@            Coverage Diff             @@
##             main    #2777      +/-   ##
==========================================
+ Coverage   83.75%   83.93%   +0.18%     
==========================================
  Files         251      252       +1     
  Lines        8864     8902      +38     
==========================================
+ Hits         7424     7472      +48     
+ Misses       1440     1430      -10     
| Impacted Files | Coverage Δ |
|---|---|
| src/OpenTelemetry/Metrics/Tags.cs | 66.66% <66.66%> (ø) |
| src/OpenTelemetry/Metrics/AggregatorStore.cs | 83.23% <100.00%> (+1.47%) ⬆️ |
| ...nTelemetry/Internal/OpenTelemetrySdkEventSource.cs | 73.58% <0.00%> (+1.88%) ⬆️ |
| ...lemetry/Internal/SelfDiagnosticsConfigRefresher.cs | 92.30% <0.00%> (+5.76%) ⬆️ |
| ...enTelemetry/Metrics/ObjectArrayEqualityComparer.cs | 80.00% <0.00%> (+6.66%) ⬆️ |
| ...enTelemetry/Metrics/StringArrayEqualityComparer.cs | 80.00% <0.00%> (+6.66%) ⬆️ |
| ...mentation/ExportClient/BaseOtlpGrpcExportClient.cs | 62.50% <0.00%> (+12.50%) ⬆️ |
| ...entation/ExportClient/OtlpGrpcTraceExportClient.cs | 50.00% <0.00%> (+14.28%) ⬆️ |
| ...xporter.OpenTelemetryProtocol/OtlpTraceExporter.cs | 59.09% <0.00%> (+22.72%) ⬆️ |

  // Two-Level lookup. TagKeys x [ TagValues x Metrics ]
  private readonly ConcurrentDictionary<string[], ConcurrentDictionary<object[], int>> keyValue2MetricAggs =
-     new ConcurrentDictionary<string[], ConcurrentDictionary<object[], int>>(new StringArrayEqualityComparer());
+     new ConcurrentDictionary<string[], ConcurrentDictionary<object[], int>>(StringArrayComparer);
Member

Another option, which avoids the risk of too many entries (when the user keeps providing keys in a different order):

1. Keep the dictionary as before.
2. If the tagKeys lookup fails, sort and look up again.
3. If that also fails, insert both the original tagKeys and the sorted ones into the dictionary.

That way we store at most 2 entries per key set, and we do only a single lookup in the hot path, as opposed to 2 lookups.

Contributor Author

That's a good suggestion. The only issue with this would be if the user provides the sorted combination as the very first combination and uses some random combination later on. In this case, we would always be sorting the keys.

Member

Yes, if the same order is reused, you get max performance; otherwise, lower perf.

Member

I agree this is a good optimization to try out. Reusing the same order is probably the most common scenario. I suppose it's possible for a library to use a single instrument in multiple code paths that add dimensions in a different order, but that is probably an edge case.

A different order may be likely in the event I have two libraries emitting the same metric name, but since they're different libraries they'd be separate Metric instances, right?

Member

Based on offline sync, this may not be feasible.
We can come back to this and keep optimizing. For now, this PR avoids sorting in hot path, and makes a very significant perf boost.

Member

> A different order may be likely in the event I have two libraries emitting the same metric name, but since they're different libraries they'd be separate Metric instances, right?

They cannot emit with the same metric name (unless they use a different Meter). So it'll be different instances, yes.

Member

@alanwest alanwest Jan 12, 2022

> Based on offline sync, this may not be feasible.

I was wondering about this myself. Was the issue that synchronizing the two inserts (sorted and original order) would be tough?

Contributor Author

@utpilla utpilla Jan 12, 2022

The problem here is in keeping the two entries for a given combination in sync.

For example:

counter.Add(10, new("A",1), new("C",3), new("B",2));

Here, since ("A","C","B") is not present in the dictionary, we add these two entries:

("A","C","B") -> (1,3,2) -> MetricPointIndex1

("A","B","C") -> (1,2,3) -> MetricPointIndex1

Now, if we encounter a call like:

counter.Add(10, new("A",10), new("B",20), new("C",30));

we will add another entry to the inner dictionary:

("A","B","C") -> (1,2,3) -> MetricPointIndex1
              -> (10,20,30)-> MetricPointIndex2

This is fine but we also need to add this entry to ("A","C","B"):

("A","C","B") -> (1,3,2) -> MetricPointIndex1
              -> (10,30,20)-> MetricPointIndex2 (This is the difficult part)

When we get a new set of tag values, we have to check whether another ordering of the same tag keys is already present in the dictionary. If it is, we have to add the same MetricPointIndex for the tag values, reordered to match that ordering.
In this case, we get the new tag values (10,20,30), which are ordered by ("A","B","C"). Now we have to find out whether some other ordering of ("A","B","C") is present in the dictionary. If one is, in this case ("A","C","B"), we have to reorder the tag values to match it: (10,30,20). We then have to ensure that we assign it the same MetricPoint index.
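The reordering step described above can be sketched as follows. This is only an illustration under stated assumptions: `TagReorder` and `ReorderValues` are hypothetical names, and both key arrays are assumed to contain the same distinct keys. It does not address the harder part, which is discovering the other orderings and keeping their entries in sync.

```csharp
using System;

public static class TagReorder
{
    // Returns a copy of 'values' rearranged so that it lines up with 'toKeys'.
    // 'values[i]' is assumed to belong to 'fromKeys[i]'.
    public static object[] ReorderValues(string[] fromKeys, object[] values, string[] toKeys)
    {
        if (fromKeys.Length != values.Length || fromKeys.Length != toKeys.Length)
        {
            throw new ArgumentException("Keys and values must have the same length.");
        }

        var result = new object[toKeys.Length];
        for (int i = 0; i < toKeys.Length; i++)
        {
            // Linear search is acceptable for the handful of tags used in this example.
            int source = Array.IndexOf(fromKeys, toKeys[i]);
            if (source < 0)
            {
                throw new ArgumentException("Key '" + toKeys[i] + "' is missing from the source ordering.");
            }

            result[i] = values[source];
        }

        return result;
    }
}
```

With the example from this comment, reordering the values (10,20,30) from ("A","B","C") to ("A","C","B") gives (10,30,20).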

Member

Ugh... right gotta deal with the values as well.

counterLong.Add(10, new("Key2", "Value2"), new("Key1", "Value1"), new("Key3", "Value3"));
counterLong.Add(10, new("Key2", "Value2"), new("Key3", "Value3"), new("Key1", "Value1"));
meterProvider.ForceFlush(MaxTimeToAllowForFlush);
sumReceived = GetLongSum(exportedItems);
Member

Is that going to really tell if we exported one MetricPoint or more than one? IIRC, this method simply sums up all metric points, so won't really validate what you are after..

Member

I think we should validate that only one Metric is received, and that the metric has a single MetricPoint with the tags (key1,key2,key3).

Contributor Author

@utpilla utpilla Jan 12, 2022

Ahh sorry my bad. I thought this GetLongSum would only get the first MetricPoint's sum for some reason and if the sum value matches the expected value, it would mean that all of Counter.Add statements contributed to only one MetricPoint.

I'll update this.

Contributor Author

utpilla commented Jan 12, 2022

We would also need to have a mapping for sorted tag values. This adds another dictionary lookup. Here are the benchmark results with these changes:

| Method | AggregationTemporality | Mean | Error | StdDev | Allocated |
|---|---|---:|---:|---:|---:|
| CounterHotPath | Cumulative | 38.34 ns | 0.554 ns | 0.463 ns | - |
| CounterWith1LabelsHotPath | Cumulative | 140.57 ns | 0.571 ns | 0.506 ns | - |
| CounterWith3LabelsHotPath | Cumulative | 261.46 ns | 1.030 ns | 0.913 ns | - |
| CounterWith5LabelsHotPath | Cumulative | 411.40 ns | 1.445 ns | 1.281 ns | - |
| CounterWith6LabelsHotPath | Cumulative | 466.85 ns | 1.490 ns | 1.321 ns | - |
| CounterWith7LabelsHotPath | Cumulative | 526.19 ns | 1.694 ns | 1.584 ns | - |
| CounterHotPath | Delta | 37.66 ns | 0.127 ns | 0.119 ns | - |
| CounterWith1LabelsHotPath | Delta | 139.48 ns | 1.127 ns | 1.054 ns | - |
| CounterWith3LabelsHotPath | Delta | 261.91 ns | 1.425 ns | 1.264 ns | - |
| CounterWith5LabelsHotPath | Delta | 410.88 ns | 2.049 ns | 1.816 ns | - |
| CounterWith6LabelsHotPath | Delta | 471.72 ns | 1.583 ns | 1.403 ns | - |
| CounterWith7LabelsHotPath | Delta | 531.99 ns | 1.902 ns | 1.686 ns | - |

}
}

if (!this.tagValueCombinations.TryGetValue(tagValues, out sortedTagValues))
Member

These new dictionaries may grow quite large. Could we do our check of maxMetricPoints earlier to avoid allowing the dictionaries to grow unbounded?

Contributor Author

That check won't help as we are storing different possible combinations of the tag keys. So if we have maxMetricPoints as 1 we should still allow for multiple entries as we have to account for different combinations. There is no way to know what the maximum possible combinations would be as it depends on the number of keys provided.

Member

Ah right I wasn't thinking about this correctly, so then maybe do the sort as you're doing, but defer this.tagValueCombinations.TryAdd(seqValue, sortedTagValues); until you know you've gotten a metric point.

private readonly ConcurrentDictionary<string[], string[]> tagKeyCombinations =
new ConcurrentDictionary<string[], string[]>(StringArrayComparer);

private readonly ConcurrentDictionary<object[], object[]> tagValueCombinations =
Member

I think this should be maintained per tagKeyCombination.

Member

@alanwest alanwest Jan 13, 2022

This crossed my mind too, but I considered something like

counter.Add(10, new("A",1), new("C",3), new("B",2));
counter.Add(10, new("D",1), new("E",3), new("F",2));

I think this would actually work fine as is - resulting in [1,3,2] inserted once, but it does seem a little goofy. So, maybe makes sense to scope it to tagKeyCombination for clarity.

Contributor Author

utpilla commented Jan 22, 2022

The TagKeys and TagValues have to be sorted together, and the dictionary that holds the mapping needs both the keys and the values as a composite key. I have created a new struct called Tags, which contains string[] Keys and object[] Values. This struct is used as both the key type and the value type of the dictionary, because the mapping is from the given tags combination to the respective sorted combination.
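A composite key of this kind could look roughly like the sketch below. It is not the struct from the PR: `TagsSketch`, `Sorted`, `TagsCache`, and `GetSorted` are hypothetical names, and the equality and hashing rules shown are assumptions chosen for illustration.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical composite key: tag keys and tag values compared element by element.
public readonly struct TagsSketch : IEquatable<TagsSketch>
{
    public TagsSketch(string[] keys, object[] values)
    {
        this.Keys = keys;
        this.Values = values;
    }

    public string[] Keys { get; }

    public object[] Values { get; }

    public bool Equals(TagsSketch other)
    {
        if (this.Keys.Length != other.Keys.Length || this.Values.Length != other.Values.Length)
        {
            return false;
        }

        for (int i = 0; i < this.Keys.Length; i++)
        {
            if (!string.Equals(this.Keys[i], other.Keys[i], StringComparison.Ordinal)) return false;
        }

        for (int i = 0; i < this.Values.Length; i++)
        {
            if (!object.Equals(this.Values[i], other.Values[i])) return false;
        }

        return true;
    }

    public override bool Equals(object obj) => obj is TagsSketch other && this.Equals(other);

    public override int GetHashCode()
    {
        int hash = 17;
        unchecked
        {
            foreach (string key in this.Keys) hash = (hash * 31) + (key == null ? 0 : key.GetHashCode());
            foreach (object value in this.Values) hash = (hash * 31) + (value == null ? 0 : value.GetHashCode());
        }

        return hash;
    }

    // Sorts copies of the keys and values together, so each value stays with its key.
    public TagsSketch Sorted()
    {
        var keys = (string[])this.Keys.Clone();
        var values = (object[])this.Values.Clone();
        Array.Sort<string, object>(keys, values, StringComparer.Ordinal);
        return new TagsSketch(keys, values);
    }
}

// Hypothetical cache mapping the tags as given to their sorted combination.
public sealed class TagsCache
{
    private readonly ConcurrentDictionary<TagsSketch, TagsSketch> sortedByGiven =
        new ConcurrentDictionary<TagsSketch, TagsSketch>();

    public TagsSketch GetSorted(TagsSketch given)
    {
        if (this.sortedByGiven.TryGetValue(given, out TagsSketch sorted)) return sorted;

        // Store copies, in case the caller reuses its arrays.
        var stored = new TagsSketch((string[])given.Keys.Clone(), (object[])given.Values.Clone());
        sorted = stored.Sorted();
        this.sortedByGiven.TryAdd(stored, sorted);
        return sorted;
    }
}
```

Because the values are part of the key in this sketch, the cache gets one entry per distinct combination of key order and values, so it can grow with the number of distinct value combinations seen.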

Here are the benchmark results with this change:

| Method | AggregationTemporality | Mean | Error | StdDev | Allocated |
|---|---|---:|---:|---:|---:|
| CounterHotPath | Cumulative | 38.18 ns | 0.702 ns | 0.912 ns | - |
| CounterWith1LabelsHotPath | Cumulative | 146.75 ns | 1.381 ns | 1.224 ns | - |
| CounterWith3LabelsHotPath | Cumulative | 281.70 ns | 1.353 ns | 1.266 ns | - |
| CounterWith5LabelsHotPath | Cumulative | 437.58 ns | 2.072 ns | 1.837 ns | - |
| CounterWith6LabelsHotPath | Cumulative | 502.92 ns | 2.166 ns | 1.920 ns | - |
| CounterWith7LabelsHotPath | Cumulative | 553.88 ns | 2.734 ns | 2.424 ns | - |
| CounterHotPath | Delta | 36.47 ns | 0.438 ns | 0.388 ns | - |
| CounterWith1LabelsHotPath | Delta | 155.47 ns | 0.403 ns | 0.337 ns | - |
| CounterWith3LabelsHotPath | Delta | 293.14 ns | 3.957 ns | 3.507 ns | - |
| CounterWith5LabelsHotPath | Delta | 452.68 ns | 1.267 ns | 0.989 ns | - |
| CounterWith6LabelsHotPath | Delta | 503.88 ns | 6.454 ns | 5.389 ns | - |
| CounterWith7LabelsHotPath | Delta | 557.82 ns | 2.988 ns | 2.649 ns | - |

Contributor Author

utpilla commented Jan 22, 2022

Closing this PR in favor of #2805

@utpilla utpilla closed this Jan 22, 2022
@utpilla utpilla deleted the utpilla/Metric-AggregatorStore-Optimization branch November 23, 2023 03:35