Initial tag query performance improvements #848

shanson7 · 2018-02-09T21:36:15Z

The basic idea here is that the string representation of an id is already in use by the defById map. During the course of indexing and querying, we are repeatedly doing fmt calls to translate between the two types. I think the original motivation behind using Id over string was to reduce pointer usage for the GC but this results in many allocations during a tag query. Here is my comparison with the existing benchmark (1m benchtime):

Improvements across the board (especially in allocations) except for indexing, which is pretty close.

benchmark                                          old ns/op     new ns/op     delta
BenchmarkTagQueryFilterAndIntersect-8              74609352      55922356      -25.05%
BenchmarkTagQueryFilterAndIntersectOnlyRegex-8     146186085     117200507     -19.83%
BenchmarkTagQueryKeysByPrefixSimple-8              3825          3298          -13.78%
BenchmarkTagQueryKeysByPrefixExpressions-8         1082002       793130        -26.70%
BenchmarkIndexing-8                                11962         12970         +8.43%

benchmark                                          old allocs     new allocs     delta
BenchmarkTagQueryFilterAndIntersect-8              403615         3256           -99.19%
BenchmarkTagQueryFilterAndIntersectOnlyRegex-8     1005682        205082         -79.61%
BenchmarkTagQueryKeysByPrefixSimple-8              14             6              -57.14%
BenchmarkTagQueryKeysByPrefixExpressions-8         5389           1065           -80.24%
BenchmarkIndexing-8                                27             26             -3.70%

benchmark                                          old bytes     new bytes     delta
BenchmarkTagQueryFilterAndIntersect-8              9180205       383341        -95.82%
BenchmarkTagQueryFilterAndIntersectOnlyRegex-8     31116455      13526309      -56.53%
BenchmarkTagQueryKeysByPrefixSimple-8              528           352           -33.33%
BenchmarkTagQueryKeysByPrefixExpressions-8         332176        253633        -23.64%
BenchmarkIndexing-8                                2007          1911          -4.78%

Once a numeric Id representation is used all over metrictank, these two maps will still use the same key, but it will be even more efficient!

Use strings as keys until both non-string ids are used everywhere

DanCech · 2018-02-09T21:43:53Z

Very interesting, how does it impact memory usage for the map?

shanson7 · 2018-02-09T21:49:35Z

The strings all reference the strings that each def already has. So, the keys themselves are just the size of a golang string (pointer plus int) so they are smaller than the existing keys.

Edit: That being said, changing IDs to binary value types instead of strings/pointers is a worthwhile goal.

replay · 2018-02-12T05:12:46Z

Interesting. Actually at the time of writing that, as you already mentioned, my idea was that the next step would be to change everything that's still using strings into numeric types and then we won't need to convert between strings and numeric types anymore. I did not expect this temporary solution of converting forth and back is going to be that expensive, so I think your changes make sense that way.

Dieterbe · 2018-02-19T18:38:17Z

thanks @shanson7

shanson7 and others added 2 commits February 8, 2018 13:11

Use strings as keys until both non-string ids are used everywhere

94c54be

Merge pull request #19 from bloomberg/Round1_tagQueryOpts

0d9a276

Use strings as keys until both non-string ids are used everywhere

Remove unused metric

2ae724c

replay approved these changes Feb 12, 2018

View reviewed changes

Dieterbe merged commit 3c30169 into grafana:master Feb 19, 2018

Dieterbe mentioned this pull request Mar 18, 2018

Optimize kafka format #876

Merged

shanson7 deleted the feature_Round1TagQueryOpts branch August 9, 2018 15:28

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Initial tag query performance improvements #848

Initial tag query performance improvements #848

shanson7 commented Feb 9, 2018

DanCech commented Feb 9, 2018

shanson7 commented Feb 9, 2018 •

edited

Loading

replay commented Feb 12, 2018

Dieterbe commented Feb 19, 2018

Initial tag query performance improvements #848

Initial tag query performance improvements #848

Conversation

shanson7 commented Feb 9, 2018

DanCech commented Feb 9, 2018

shanson7 commented Feb 9, 2018 • edited Loading

replay commented Feb 12, 2018

Dieterbe commented Feb 19, 2018

shanson7 commented Feb 9, 2018 •

edited

Loading