Performance optimizations for the inmem sinks key flattening #161
Description
While profiling Consul, we noticed the inmem sink's key flattening consuming significant CPU, so I set out to optimize key flattening and reduce that overhead.
The TL;DR of everything that follows is that the changes introduced in this PR reduce CPU usage by 54-75% and memory allocations by 70-83% for key flattening in the inmem sink.
Details
I did these optimizations in 3 parts:
1. Eliminate `fmt.Sprintf` in `flattenKeyLabels`: This drastically reduces allocations when there are lots of labels and gave a 15-30% CPU usage reduction when labels were used. The more labels used, the more drastic the reduction.
2. Simplify `flattenKey`: Here I got rid of the temporary buffer and string replacer and put the code into the final state of this commit. This reduced the CPU utilization for `flattenKey` by 50% and allocations by 75%. The impact on `flattenKeyLabels` was a little less pronounced, as most of the CPU is actually spent in label processing. Even so, the reductions were 25-40% for CPU usage and 37-42% for allocations.
3. Eliminate the space replacer and call `strings.Replace` just once: Within the label processing loop we were using the space replacer to write modified bytes out to the buffer. I instead swapped this for direct buffer writes and a single call to `strings.Replace` at the end. This resulted in another 33-47% CPU reduction for the function and 20-50% fewer allocations.
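For readers who want to see the shape of these three steps, here is a minimal standalone sketch. It is not the code from this PR: the `Label` type is a hypothetical stand-in, the functions are free functions rather than sink methods, and the use of `strings.Builder` for the direct writes is an assumption made for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Label is a hypothetical stand-in for the library's label type.
type Label struct {
	Name  string
	Value string
}

// flattenKey joins the key parts with "." and replaces spaces with "_".
// No temporary buffer or strings.Replacer is involved (step 2).
func flattenKey(parts []string) string {
	return strings.Replace(strings.Join(parts, "."), " ", "_", -1)
}

// flattenKeyLabels returns the flattened key with labels appended,
// plus the bare flattened key.
func flattenKeyLabels(parts []string, labels []Label) (string, string) {
	key := flattenKey(parts)

	var b strings.Builder // assumption: any byte buffer would do here
	b.WriteString(key)
	for _, l := range labels {
		// Direct writes instead of fmt.Sprintf(";%s=%s", ...) (step 1).
		b.WriteByte(';')
		b.WriteString(l.Name)
		b.WriteByte('=')
		b.WriteString(l.Value)
	}

	// One space replacement over the whole string, rather than one
	// replacer call per label inside the loop (step 3).
	return strings.Replace(b.String(), " ", "_", -1), key
}

func main() {
	full, key := flattenKeyLabels(
		[]string{"consul", "rpc request"},
		[]Label{{Name: "dc", Value: "us east"}},
	)
	fmt.Println(key)  // consul.rpc_request
	fmt.Println(full) // consul.rpc_request;dc=us_east
}
```

The key is already space-free when it is written to the builder, so the final `strings.Replace` only ever changes the label portion. Running it over the whole string simply keeps the loop body free of per-label replacement calls.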
Overall benchmark comparison: