Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow caching regular expression matching in rules #518

Merged

Conversation

flavray
Copy link
Contributor

@flavray flavray commented Aug 11, 2020

This improves performance for some use cases.

The exporter is configured with a set of rules to match regular
expressions against bean names. Regular expressions can be CPU-intensive,
and matching the same bean names over the same patterns is
time-consuming when there are tens of thousands of bean names (e.g:
exposed by Kafka). Using a single ConcurrentHashMap to store the result
of pattern matching here.

The MatchedRule class was introduced here to "store" this result.
Each rule in the configuration leads to a MatchedRule which helps
build the set of metrics to expose to Prometheus.
The mapping between a bean name + attributes to a MatchedRule is
done once. Some (in Kafka, most) bean names may not match any rule,
so caching of MatchedRule.unmatched() helps prevent re-computation
of bean names to non-existent rules.

The logic to map a bean name to a Prometheus MetricFamilySample is
unchanged from the upstream, the MatchedRule was introduced only to
be able to cache the results.

Caching is disabled by default (behaviour unchanged), and can be enabled
by setting cacheRules: true in the configuration.


In our case, this lead to reducing collection time from ~70 sec to ~3 sec on
some of our applications (Kafka brokers). I opened a thread on the mailing list
a couple of weeks ago and haven't heard back, but definitely happy to discuss
and make changes if needed, if you think this is worth merging. 🙂

@flavray flavray force-pushed the u/flavr/cache-regular-expressions branch from 46dc03f to 427dcc8 Compare August 11, 2020 12:44
Copy link
Contributor

@brian-brazil brian-brazil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a quick peek, this has a memory leak as you never remove old cache entries.

Having the setting apply to the whole config also means that a single string mbean that you want to work with would prevent you from being able to use this, per-rule would make more sense.

@flavray flavray force-pushed the u/flavr/cache-regular-expressions branch from 04db9ae to 02c4605 Compare August 12, 2020 17:53
@flavray
Copy link
Contributor Author

flavray commented Aug 12, 2020

Good catch, I've added some logic to only retain beans that have just been scraped in the cache.

Regarding the per-rule caching, I think it makes sense to try to have per-rule caching. However I'm not able to find a decent way to support all of: per-rule caching, caching "unmatched rules" and allowing bean value in patterns.

If we allow patterns with the bean value in them, caching a miss for a given value will lead to false negatives; on the other hand, not caching unmatched rules will limit the performance of the cache. I could try to "detect" rules that have a bean value in them and treat them differently, but that seems really prone to errors too.

Copy link
Contributor

@brian-brazil brian-brazil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not caching unmatched rules will limit the performance of the cache.

This is what I'd suggest. If it matters to a user, the impact can be minimised by having those rules last.

@flavray flavray force-pushed the u/flavr/cache-regular-expressions branch 2 times, most recently from 19ce83f to 73ea1a6 Compare August 13, 2020 15:45
@flavray
Copy link
Contributor Author

flavray commented Aug 13, 2020

not caching unmatched rules will limit the performance of the cache.

This is what I'd suggest. If it matters to a user, the impact can be minimised by having those rules last.

Are you suggesting having a catch-all rule at the end of the list of rules, that would effectively cache unmatched bean names (at the expense of actually exposing them)? If so, that sounds reasonable to me. If not, happy to hear more about the idea!

In the meantime, I had switched to having a per-rule flag to cache regular expression matching. The code looks quite different from before (with the introduction of MatchedRulesCache) and I am not entirely conviced with the approach, still around dealing with rule matching against bean value, which has a different behaviour now (albeit limited to the given rule rather than every rule). Happy to have more feedback on this.


Commit message

Cache regular expression matching per rule

Instead of having a global cacheRules flag, have a per-rule cache
flag (default: false). Rules that do not have the cache flag set will have the default
behaviour.

In order to allow caching of rules with values that change over time
(e.g: gauges), a new per-rule matchBeanValue flag has also been added
(default: true). Disabling the flag will not add the bean value to the
expression when matching against the list of rules (slightly different
behaviour than before, but toggleable per rule) and allow caching of
just the bean name (no matter the value).

@brian-brazil
Copy link
Contributor

Are you suggesting having a catch-all rule at the end of the list of rules, that would effectively cache unmatched bean names (at the expense of actually exposing them)?

I'm not, that'd be potentially incorrect. I'm proposing that the setting be per-rule, and if this is a concern for the user that they put the non-cached rules last so that they're not applied to everything that has already been matched. There's also already a way to halt processing of a given attribute.

README.md Outdated Show resolved Hide resolved
@flavray flavray force-pushed the u/flavr/cache-regular-expressions branch 2 times, most recently from d4abe90 to 648c7b3 Compare August 14, 2020 10:43
@flavray
Copy link
Contributor Author

flavray commented Aug 14, 2020

I'm not, that'd be potentially incorrect. I'm proposing that the setting be per-rule, and if this is a concern for the user that they put the non-cached rules last so that they're not applied to everything that has already been matched. There's also already a way to halt processing of a given attribute.

Gotcha. I've had another go at this, and removed the matchBeanValue flag.

Copy link
Contributor

@brian-brazil brian-brazil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Keep in mind that scrapes can be concurrent.

README.md Outdated Show resolved Hide resolved
Flavien Raynaud added 12 commits August 19, 2020 14:49
This improves performance for some use cases.

The exporter is configured with a set of rules to match regular
expressions against bean names. Regular expressions can be CPU-intensive,
and matching the same bean names over the same patterns is time-consuming
when there are tens of thousands of bean names (e.g: exposed by Kafka).
Using a single `ConcurrentHashMap` to store the result of pattern
matching here.

The `MatchedRule` class was introduced here to "store" this result.
Each rule in the configuration leads to a `MatchedRule` which helps
build the set of metrics to expose to Prometheus.
The mapping between a bean name + attributes to a `MatchedRule` is
done once. Some (in Kafka, most) bean names may not match any rule,
so caching of `MatchedRule.unmatched()` helps prevent re-computation
of bean names to non-existent rules.

The logic to map a bean name to a Prometheus MetricFamilySample is
unchanged from the upstream, the `MatchedRule` was introduced only to
be able to cache the results.

Caching is disabled by default (behaviour unchanged), and can be enabled
by setting `cacheRules: true` in the configuration.

Signed-off-by: Flavien Raynaud <flavr@yelp.com>
Signed-off-by: Flavien Raynaud <flavr@yelp.com>
Instead of having a global `cacheRules` flag, have a per-rule `cache`
flag (default: false). Rules that do not have the `cache` flag set will have the default
behaviour.

In order to allow caching of rules with values that change over time
(e.g: gauges), a new per-rule `matchBeanValue` flag has also been added
(default: true). Disabling the flag will not add the bean value to the
expression when matching against the list of rules (slightly different
behaviour than before, but toggleable per rule) and allow caching of
just the bean name (no matter the value).

Signed-off-by: Flavien Raynaud <flavr@yelp.com>
Always match on bean value, but cache names without the bean values.

Signed-off-by: Flavien Raynaud <flavr@yelp.com>
It adds the result of the rule matching to the cache of matched rules (if enabled).

Signed-off-by: Flavien Raynaud <flavr@yelp.com>
Signed-off-by: Flavien Raynaud <flavr@yelp.com>
Signed-off-by: Flavien Raynaud <flavr@yelp.com>
Signed-off-by: Flavien Raynaud <flavr@yelp.com>
Signed-off-by: Flavien Raynaud <flavr@yelp.com>
Record count of cached beans by relying on the StalenessTracker to avoid
race conditions between scrapes.

Signed-off-by: Flavien Raynaud <flavr@yelp.com>
Signed-off-by: Flavien Raynaud <flavr@yelp.com>
Signed-off-by: Flavien Raynaud <flavr@yelp.com>
@flavray flavray force-pushed the u/flavr/cache-regular-expressions branch from b117b4e to 52bfec5 Compare August 19, 2020 13:49
Signed-off-by: Flavien Raynaud <flavr@yelp.com>
Flavien Raynaud added 3 commits August 21, 2020 11:00
… lock

Also populate the MatchedRulesCache with the set of rules from the
config.

Signed-off-by: Flavien Raynaud <flavr@yelp.com>
To avoid potential race conditions where a scrape could start with a
Config object and finish with another one.

Signed-off-by: Flavien Raynaud <flavr@yelp.com>
Signed-off-by: Flavien Raynaud <flavr@yelp.com>
It is easier to understand and will not yield a big performance hit.

Also make `maybeReloadConfig` return the new Config.

Signed-off-by: Flavien Raynaud <flavr@yelp.com>
Signed-off-by: Flavien Raynaud <flavr@yelp.com>
Signed-off-by: Flavien Raynaud <flavr@yelp.com>
@brian-brazil brian-brazil merged commit 4bfd346 into prometheus:master Aug 25, 2020
@brian-brazil
Copy link
Contributor

Thanks!

@flavray
Copy link
Contributor Author

flavray commented Aug 25, 2020

Awesome! Thank you for all the constructive feedback!

qinghui-xu pushed a commit to qinghui-xu/jmx_exporter that referenced this pull request Sep 18, 2020
* Allow caching regular expression matching in rules

This improves performance for some use cases.

The exporter is configured with a set of rules to match regular
expressions against bean names. Regular expressions can be CPU-intensive,
and matching the same bean names over the same patterns is time-consuming
when there are tens of thousands of bean names (e.g: exposed by Kafka).
Using a single `ConcurrentHashMap` to store the result of pattern
matching here.

The `MatchedRule` class was introduced here to "store" this result.
Each rule in the configuration leads to a `MatchedRule` which helps
build the set of metrics to expose to Prometheus.
The mapping between a bean name + attributes to a `MatchedRule` is
done once. Some (in Kafka, most) bean names may not match any rule,
so caching of `MatchedRule.unmatched()` helps prevent re-computation
of bean names to non-existent rules.

The logic to map a bean name to a Prometheus MetricFamilySample is
unchanged from the upstream, the `MatchedRule` was introduced only to
be able to cache the results.

Caching is disabled by default (behaviour unchanged), and can be enabled
by setting `cacheRules: true` in the configuration.

Signed-off-by: Flavien Raynaud <flavr@yelp.com>

* Cache regular expression matching per rule

Instead of having a global `cacheRules` flag, have a per-rule `cache`
flag (default: false). Rules that do not have the `cache` flag set will have the default
behaviour.

In order to allow caching of rules with values that change over time
(e.g: gauges), a new per-rule `matchBeanValue` flag has also been added
(default: true). Disabling the flag will not add the bean value to the
expression when matching against the list of rules (slightly different
behaviour than before, but toggleable per rule) and allow caching of
just the bean name (no matter the value).

Signed-off-by: Flavien Raynaud <flavr@yelp.com>
@amuraru
Copy link
Contributor

amuraru commented Sep 29, 2020

@qinghui-xu thanks for this improvement. Would it make sense to add an example of how to use cache functionality in https://github.com/prometheus/jmx_exporter/blob/master/example_configs/kafka-2_0_0.yml ?

@brian-brazil
Copy link
Contributor

PRs welcome to update the existing example configs to use this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants