Rewrite enricher to use map lookup #1523

replay · 2019-11-07T15:54:07Z

This replaces the enricher implementation to make enrichment faster. There are two main changes:

When enriching a metric, the enricher now only needs to look up the metric key in a map, which is much cheaper than running the metric through a filter for each meta record present. This map needs to be kept up to date, which means the enrichment work must happen whenever metrics or meta records get added.
All events that change the state of the enricher (add metric, del metric, add meta record, del meta record) get processed asynchronously via a queue. Furthermore, all add-metric events get buffered and executed in batches, because this further decreases the time it takes to enrich each of them. As a result, when a new metric gets added to the index, it can take a few seconds until its meta tags show up as well. This trade-off is necessary because otherwise the enrichment would slow down the ingest speed too much.
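The queueing and batching described above could look roughly like the following. This is only a minimal sketch, not the code from this PR: the `event` type, the `runQueue` function, and the `maxBatch` and `flushEvery` parameters are hypothetical names chosen for illustration. It also assumes that a pending batch is flushed before any non-add event is handled, so that event ordering is preserved.

```go
package main

import (
	"fmt"
	"time"
)

// eventKind and event are hypothetical types used only in this sketch.
type eventKind int

const (
	addMetric eventKind = iota
	delMetric
	addMetaRecord
	delMetaRecord
)

type event struct {
	kind eventKind
	key  string
}

// runQueue consumes events until the channel is closed. addMetric events
// are buffered and handed to handleBatch once maxBatch of them have
// accumulated or flushEvery has elapsed. Any other event first flushes
// the pending batch (to keep ordering) and is then passed to handleOther.
func runQueue(events <-chan event, maxBatch int, flushEvery time.Duration,
	handleBatch func([]string), handleOther func(event)) {

	buf := make([]string, 0, maxBatch)
	flush := func() {
		if len(buf) == 0 {
			return
		}
		// Hand out a copy so the buffer can be reused safely.
		handleBatch(append([]string(nil), buf...))
		buf = buf[:0]
	}

	ticker := time.NewTicker(flushEvery)
	defer ticker.Stop()

	for {
		select {
		case ev, ok := <-events:
			if !ok {
				flush()
				return
			}
			if ev.kind == addMetric {
				buf = append(buf, ev.key)
				if len(buf) >= maxBatch {
					flush()
				}
				continue
			}
			flush()
			handleOther(ev)
		case <-ticker.C:
			flush()
		}
	}
}

func main() {
	events := make(chan event, 8)
	for _, k := range []string{"a", "b", "c"} {
		events <- event{kind: addMetric, key: k}
	}
	events <- event{kind: delMetric, key: "a"}
	close(events)

	runQueue(events, 2, time.Second,
		func(batch []string) { fmt.Println("enrich batch:", batch) },
		func(ev event) { fmt.Println("other event for:", ev.key) })
}
```

The time-based flush is what bounds the delay before meta tags show up for a newly added metric when the buffer does not fill up.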

I will create a diagram to illustrate how it all works and post it here.

this replaces the enricher implementation to improve enrichment performance. it drops the enrichment cache, and it no longer filters metrics based on the meta records at the enrichment stage. instead it now builds a map in which it can look up metric keys and resolve them into meta records, from which it then gets the meta tags. this performs much better than the old implementation, especially in scenarios where there is a large number of meta records in the index.
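To illustrate the lookup path, here is a minimal sketch of a map from metric keys to meta records, and from meta records to meta tags. The `enricher` struct, its field names, and its methods are hypothetical and only meant to show the shape of the data structure; the real implementation may differ in types, locking, and key representation.

```go
package main

import (
	"fmt"
	"sort"
)

// recordID identifies a meta record (hypothetical type for this sketch).
type recordID uint32

// enricher resolves metric keys to meta tags via two map lookups.
type enricher struct {
	recordsByMetric map[string]map[recordID]struct{}
	tagsByRecord    map[recordID][]string
}

func newEnricher() *enricher {
	return &enricher{
		recordsByMetric: make(map[string]map[recordID]struct{}),
		tagsByRecord:    make(map[recordID][]string),
	}
}

// addMetaRecord stores the record's meta tags and associates the record
// with every metric key it was found to match.
func (e *enricher) addMetaRecord(id recordID, metaTags, metricKeys []string) {
	e.tagsByRecord[id] = metaTags
	for _, key := range metricKeys {
		if e.recordsByMetric[key] == nil {
			e.recordsByMetric[key] = make(map[recordID]struct{})
		}
		e.recordsByMetric[key][id] = struct{}{}
	}
}

// delMetaRecord removes the record and all associations pointing at it.
func (e *enricher) delMetaRecord(id recordID) {
	delete(e.tagsByRecord, id)
	for key, ids := range e.recordsByMetric {
		delete(ids, id)
		if len(ids) == 0 {
			delete(e.recordsByMetric, key)
		}
	}
}

// enrich returns the meta tags of a metric in sorted order. No filter is
// evaluated here, only map lookups.
func (e *enricher) enrich(metricKey string) []string {
	var tags []string
	for id := range e.recordsByMetric[metricKey] {
		tags = append(tags, e.tagsByRecord[id]...)
	}
	sort.Strings(tags)
	return tags
}

func main() {
	e := newEnricher()
	e.addMetaRecord(1, []string{"dc=eu"}, []string{"a", "b"})
	e.addMetaRecord(2, []string{"team=infra"}, []string{"a"})
	fmt.Println(e.enrich("a"))
}
```

The cost of `enrich` depends on how many meta records apply to the one metric, not on how many meta records exist in total, which is the property the description above relies on.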

processing added metrics in batches improves the addMetric performance when a large number of metrics gets added to the index concurrently. previously each of them would have been checked against the filter requirements of each existing meta record. instead of checking each new metric one by one against the criteria of each meta record, we now build a small temporary index out of all the added metrics in the buffer, then run each meta record as a query on that small index. this change improves the addMetric event processing performance by a large factor in situations where a lot of metrics get added at once.
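The batch step can be sketched as follows. This is a simplified illustration under stated assumptions, not the PR's code: the `metric`, `metaRecord`, and `tempIndex` types are hypothetical, and a meta record's query is reduced to a list of `tag=value` expressions that must all match, whereas real tag queries support more operators.

```go
package main

import "fmt"

// metric and metaRecord are simplified, hypothetical types.
type metric struct {
	key  string
	tags map[string]string
}

type metaRecord struct {
	id    uint32
	query []string // "tag=value" expressions, all of which must match
}

// tempIndex is a small inverted index: "tag=value" -> set of metric keys.
type tempIndex map[string]map[string]struct{}

// buildTempIndex indexes only the metrics of the current batch.
func buildTempIndex(batch []metric) tempIndex {
	idx := make(tempIndex)
	for _, m := range batch {
		for tag, value := range m.tags {
			expr := tag + "=" + value
			if idx[expr] == nil {
				idx[expr] = make(map[string]struct{})
			}
			idx[expr][m.key] = struct{}{}
		}
	}
	return idx
}

// run returns the keys of all metrics that match every expression.
func (idx tempIndex) run(query []string) []string {
	if len(query) == 0 {
		return nil
	}
	var matches []string
	for key := range idx[query[0]] {
		ok := true
		for _, expr := range query[1:] {
			if _, found := idx[expr][key]; !found {
				ok = false
				break
			}
		}
		if ok {
			matches = append(matches, key)
		}
	}
	return matches
}

// matchBatch runs each meta record once as a query against the temporary
// index and returns, per metric key, the IDs of the matching records.
func matchBatch(batch []metric, records []metaRecord) map[string][]uint32 {
	idx := buildTempIndex(batch)
	result := make(map[string][]uint32)
	for _, rec := range records {
		for _, key := range idx.run(rec.query) {
			result[key] = append(result[key], rec.id)
		}
	}
	return result
}

func main() {
	batch := []metric{
		{key: "m1", tags: map[string]string{"dc": "eu", "host": "a"}},
		{key: "m2", tags: map[string]string{"dc": "us", "host": "b"}},
	}
	records := []metaRecord{{id: 1, query: []string{"dc=eu"}}}
	fmt.Println(matchBatch(batch, records))
}
```

With this shape, each meta record is evaluated once per batch rather than once per added metric, which is where the saving for large concurrent additions comes from.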

replay · 2019-11-07T15:58:01Z

FYI I'm planning to create a follow-up PR which will not change any logic. It will only move code around and rename things, to create a separate package for everything related to tags and meta tags and to make names more explanatory. This is mostly just going to be housekeeping to avoid namespace pollution of the memory index package. I'll probably also reorganize some of the tests and benchmarks into separate files to make their distribution more logical.

replay · 2019-11-07T18:48:11Z

This shows how the enrichment works now. At the time a new metric gets added to the index we asynchronously look up all meta records that need to be associated with it, using a tag query on a temporarily instantiated index. At the time the metric gets queried we then only need to do a few key lookups.

replay added 13 commits November 7, 2019 10:58

add helpers to get meta tag data structures

ccc2e47

refactor method to check if record exists

bb33c93

add tests for enricher

6715d97

add enricher benchmark

5d57d52

make enricher queue size configurable

ec4b0b7

add enricher metrics and logs

e03446c

shorter enricher lock when adding metric

ebf2972

make benchmark report allocs

45181c8

add config parameters and comments

b46956b

remove lock that isn't necessary anymore

00d2ccc

update changelog

2fef238

fix typo

e249e9c

robert-milan approved these changes Nov 15, 2019

only check once if tag is metric tag

acd89e0

replay merged commit f833d45 into master Nov 15, 2019

replay deleted the rewrite_enricher branch November 15, 2019 15:14

replay mentioned this pull request Nov 28, 2019

prepare changelog for v0.13.1 #1555

Merged
