Performance optimizations for the inmem sinks key flattening #161

mkeeler · 2024-03-05T22:44:29Z

Description

While profiling Consul, we noticed that key flattening in the inmem sink was consuming significant CPU, so I set out to optimize it and reduce that overhead.

The TL;DR of everything that follows: the changes introduced in this PR reduce CPU usage by 54-75% and memory allocations by 70-83% for key flattening in the inmem sink.

Details

I did these optimizations in 3 parts:

Eliminate fmt.Sprintf in flattenKeyLabels:

This drastically reduces allocations when there are lots of labels, and it cut CPU usage by 15-30% when labels were used. The more labels used, the larger the reduction.
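To illustrate the idea, here is a minimal sketch of swapping a per-label `fmt.Sprintf` for direct writes. It is not the code from this PR: the `Label` type, both function names, and the `;name=value` layout are assumptions made for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Label is a hypothetical stand-in for a metric label (name/value pair).
type Label struct {
	Name  string
	Value string
}

// labelsWithSprintf (hypothetical) formats each label through fmt.Sprintf,
// which builds a temporary string for every label before it is written.
func labelsWithSprintf(key string, labels []Label) string {
	var b strings.Builder
	b.WriteString(key)
	for _, l := range labels {
		b.WriteString(fmt.Sprintf(";%s=%s", l.Name, l.Value))
	}
	return b.String()
}

// labelsDirect (hypothetical) writes the same bytes piece by piece,
// so no intermediate formatted string is created per label.
func labelsDirect(key string, labels []Label) string {
	var b strings.Builder
	b.WriteString(key)
	for _, l := range labels {
		b.WriteByte(';')
		b.WriteString(l.Name)
		b.WriteByte('=')
		b.WriteString(l.Value)
	}
	return b.String()
}

func main() {
	labels := []Label{{"dc", "east"}, {"node", "a1"}}
	fmt.Println(labelsWithSprintf("consul.rpc.request", labels))
	fmt.Println(labelsDirect("consul.rpc.request", labels))
}
```

Both functions produce identical output; the difference is only in how many temporary strings are created along the way, which is why the gap grows with the number of labels.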

Simplify flattenKey:

Here I got rid of the temporary buffer and string replacer, which brought the code to the final state of this commit. This reduced CPU utilization for flattenKey by 50% and allocations by 75%. The impact on flattenKeyLabels was less pronounced, as most of its CPU time is actually spent in label processing. Even so, the reductions were 25-40% for CPU usage and 37-42% for allocations.
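A rough sketch of this kind of simplification follows. The function names are hypothetical, and mapping spaces to underscores is an assumption made for the example, so treat it as an illustration rather than the exact code in this commit.

```go
package main

import (
	"bytes"
	"fmt"
	"strings"
)

// spaceReplacer stands in for the string replacer mentioned above.
// Replacing spaces with underscores is an assumption of this sketch.
var spaceReplacer = strings.NewReplacer(" ", "_")

// flattenKeyBuffered (hypothetical "before" shape) joins the parts and then
// streams the result through the replacer into a temporary buffer.
func flattenKeyBuffered(parts []string) string {
	buf := &bytes.Buffer{}
	joined := strings.Join(parts, ".")
	spaceReplacer.WriteString(buf, joined)
	return buf.String()
}

// flattenKeySimple (hypothetical "after" shape) drops the buffer and the
// replacer. strings.Replace returns its input as-is when there is nothing
// to replace, so keys without spaces only pay for the join.
func flattenKeySimple(parts []string) string {
	return strings.Replace(strings.Join(parts, "."), " ", "_", -1)
}

func main() {
	parts := []string{"consul", "raft apply", "time"}
	fmt.Println(flattenKeyBuffered(parts))
	fmt.Println(flattenKeySimple(parts))
}
```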

Eliminate the space replacer and call strings.Replace just once:

Within the label processing loop we were using the space replacer to write modified bytes out to the buffer. I swapped this for direct buffer writes and a single call to strings.Replace at the end. This resulted in another 33-47% CPU reduction for the function and 20-50% fewer allocations.
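The change described above might look roughly like the following. This is a sketch under assumptions: the `Label` type, the function names, the `;name=value` layout, and the space-to-underscore mapping are all illustrative, and the key passed in is assumed to be already flattened.

```go
package main

import (
	"fmt"
	"strings"
)

// Label is a hypothetical stand-in for a metric label.
type Label struct {
	Name  string
	Value string
}

// spaceReplacer stands in for the space replacer mentioned above
// (space to underscore is an assumption of this sketch).
var spaceReplacer = strings.NewReplacer(" ", "_")

// labelsReplacePerWrite (hypothetical) runs the replacer on every label
// piece as it is written to the buffer.
func labelsReplacePerWrite(key string, labels []Label) string {
	var b strings.Builder
	b.WriteString(key)
	for _, l := range labels {
		b.WriteByte(';')
		spaceReplacer.WriteString(&b, l.Name)
		b.WriteByte('=')
		spaceReplacer.WriteString(&b, l.Value)
	}
	return b.String()
}

// labelsReplaceOnce (hypothetical) writes the raw bytes directly and fixes
// up spaces with one strings.Replace call over the finished string.
func labelsReplaceOnce(key string, labels []Label) string {
	var b strings.Builder
	b.WriteString(key)
	for _, l := range labels {
		b.WriteByte(';')
		b.WriteString(l.Name)
		b.WriteByte('=')
		b.WriteString(l.Value)
	}
	return strings.Replace(b.String(), " ", "_", -1)
}

func main() {
	labels := []Label{{"data center", "us east"}, {"node", "a1"}}
	fmt.Println(labelsReplacePerWrite("consul.rpc", labels))
	fmt.Println(labelsReplaceOnce("consul.rpc", labels))
}
```

One design note on the single-pass version: because the final `strings.Replace` scans the whole string, it also covers the key prefix, which is harmless when the key has already had its spaces replaced.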

Overall benchmark comparison:

goos: darwin
goarch: amd64
pkg: github.com/hashicorp/go-metrics
cpu: Intel(R) Core(TM) i9-9980HK CPU @ 2.40GHz
                                               │ /Users/mkeeler/go-metrics.old.txt │ /Users/mkeeler/go-metrics.no-space-replacer.txt │
                                               │              sec/op               │         sec/op           vs base                │
FlattenKey/three-segments-16                                          189.00n ± 8%               62.97n ± 4%  -66.68% (p=0.000 n=10)
FlattenKey/five-segments-16                                           198.10n ± 6%               75.89n ± 3%  -61.69% (p=0.000 n=10)
FlattenKey/ten-segments-16                                             270.6n ± 4%               123.7n ± 5%  -54.30% (p=0.000 n=10)
FlattenKeyLabels/three-segments-no-labels-16                           277.3n ± 6%               102.0n ± 4%  -63.23% (p=0.000 n=10)
FlattenKeyLabels/three-segments-one-label-16                           512.4n ± 3%               128.9n ± 4%  -74.84% (p=0.000 n=10)
FlattenKeyLabels/five-segments-three-labels-16                        1033.0n ± 6%               238.3n ± 1%  -76.94% (p=0.000 n=10)
FlattenKeyLabels/ten-segments-five-labels-16                          1465.0n ± 6%               360.9n ± 2%  -75.36% (p=0.000 n=10)
_GlobalMetrics_Direct/direct-16                                        20.87n ± 5%               20.30n ± 2%   -2.68% (p=0.015 n=10)
_GlobalMetrics_Direct/atomic.Value-16                                  21.73n ± 7%               24.00n ± 9%  +10.42% (p=0.023 n=10)
geomean                                                                215.2n                    88.28n       -58.97%

                                               │ /Users/mkeeler/go-metrics.old.txt │ /Users/mkeeler/go-metrics.no-space-replacer.txt │
                                               │               B/op                │         B/op           vs base                  │
FlattenKey/three-segments-16                                         144.00 ± 0%                16.00 ± 0%  -88.89% (p=0.000 n=10)
FlattenKey/five-segments-16                                          160.00 ± 0%                24.00 ± 0%  -85.00% (p=0.000 n=10)
FlattenKey/ten-segments-16                                           240.00 ± 0%                64.00 ± 0%  -73.33% (p=0.000 n=10)
FlattenKeyLabels/three-segments-no-labels-16                         224.00 ± 0%                32.00 ± 0%  -85.71% (p=0.000 n=10)
FlattenKeyLabels/three-segments-one-label-16                         312.00 ± 0%                40.00 ± 0%  -87.18% (p=0.000 n=10)
FlattenKeyLabels/five-segments-three-labels-16                        584.0 ± 0%                152.0 ± 0%  -73.97% (p=0.000 n=10)
FlattenKeyLabels/ten-segments-five-labels-16                          832.0 ± 0%                368.0 ± 0%  -55.77% (p=0.000 n=10)
_GlobalMetrics_Direct/direct-16                                       0.000 ± 0%                0.000 ± 0%        ~ (p=1.000 n=10) ¹
_GlobalMetrics_Direct/atomic.Value-16                                 0.000 ± 0%                0.000 ± 0%        ~ (p=1.000 n=10) ¹
geomean                                                                          ²                          -72.37%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

                                               │ /Users/mkeeler/go-metrics.old.txt │ /Users/mkeeler/go-metrics.no-space-replacer.txt │
                                               │             allocs/op             │       allocs/op        vs base                  │
FlattenKey/three-segments-16                                          4.000 ± 0%                1.000 ± 0%  -75.00% (p=0.000 n=10)
FlattenKey/five-segments-16                                           4.000 ± 0%                1.000 ± 0%  -75.00% (p=0.000 n=10)
FlattenKey/ten-segments-16                                            4.000 ± 0%                1.000 ± 0%  -75.00% (p=0.000 n=10)
FlattenKeyLabels/three-segments-no-labels-16                          7.000 ± 0%                2.000 ± 0%  -71.43% (p=0.000 n=10)
FlattenKeyLabels/three-segments-one-label-16                         11.000 ± 0%                2.000 ± 0%  -81.82% (p=0.000 n=10)
FlattenKeyLabels/five-segments-three-labels-16                       18.000 ± 0%                3.000 ± 0%  -83.33% (p=0.000 n=10)
FlattenKeyLabels/ten-segments-five-labels-16                         23.000 ± 0%                4.000 ± 0%  -82.61% (p=0.000 n=10)
_GlobalMetrics_Direct/direct-16                                       0.000 ± 0%                0.000 ± 0%        ~ (p=1.000 n=10) ¹
_GlobalMetrics_Direct/atomic.Value-16                                 0.000 ± 0%                0.000 ± 0%        ~ (p=1.000 n=10) ¹
geomean                                                                          ²                          -69.40%                ²
¹ all samples are equal
² summaries must be >0 to compute geomean

jrasell requested a review from a team December 18, 2024 10:38
