Allow caching regular expression matching in rules (prometheus#518)
* Allow caching regular expression matching in rules

This improves performance for some use cases.

The exporter is configured with a set of rules to match regular
expressions against bean names. Regular expressions can be CPU-intensive,
and matching the same bean names against the same patterns is time-consuming
when there are tens of thousands of bean names (e.g., those exposed by Kafka).
This change uses a single `ConcurrentHashMap` to store the results of pattern
matching.

The `MatchedRule` class was introduced here to "store" this result.
Each rule in the configuration leads to a `MatchedRule` which helps
build the set of metrics to expose to Prometheus.
The mapping from a bean name and its attributes to a `MatchedRule` is
computed once. Some (in Kafka, most) bean names may not match any rule,
so caching `MatchedRule.unmatched()` avoids re-matching bean names
against rules they will never match.
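
As a rough illustration of the idea described above (not the exporter's actual `MatchedRulesCache`), the sketch below caches the outcome of a regex match per bean name in a `ConcurrentHashMap`, including the "no match" outcome. The class name `RegexMatchCacheSketch`, the `UNMATCHED` sentinel and the evaluation counter are assumptions made for this example only.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: one pattern, one cache keyed by bean name.
class RegexMatchCacheSketch {
    // Sentinel stored for names that were checked and did not match,
    // so the regex is not evaluated again for them.
    static final String UNMATCHED = "<unmatched>";

    private final Pattern pattern;
    private final String nameTemplate;
    private final Map<String, String> cache = new ConcurrentHashMap<String, String>();
    // Counts real regex evaluations, only to make the effect of the cache visible.
    final AtomicInteger regexEvaluations = new AtomicInteger();

    RegexMatchCacheSketch(String regex, String nameTemplate) {
        this.pattern = Pattern.compile(regex);
        this.nameTemplate = nameTemplate;
    }

    // Returns the metric name derived from the bean name, or UNMATCHED.
    String lookup(String beanName) {
        String cached = cache.get(beanName);
        if (cached != null) {
            return cached; // cache hit, for a match or a known mismatch
        }
        regexEvaluations.incrementAndGet();
        Matcher matcher = pattern.matcher(beanName);
        String result = matcher.matches() ? matcher.replaceAll(nameTemplate) : UNMATCHED;
        // Two threads may both compute the same result; the value is identical, so this is harmless.
        cache.put(beanName, result);
        return result;
    }
}
```

Looking up the same matching and non-matching names repeatedly runs the regex only once per distinct name; every later call is a map lookup.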

The logic to map a bean name to a Prometheus MetricFamilySample is
unchanged from upstream; `MatchedRule` was introduced only to make it
possible to cache the results.

Caching is disabled by default (behaviour unchanged), and can be enabled
by setting `cacheRules: true` in the configuration.

Signed-off-by: Flavien Raynaud <flavr@yelp.com>

* Cache regular expression matching per rule

Instead of having a global `cacheRules` flag, have a per-rule `cache`
flag (default: false). Rules that do not have the `cache` flag set will have the default
behaviour.

To allow caching of rules whose values change over time
(e.g., gauges), a new per-rule `matchBeanValue` flag has also been added
(default: true). When the flag is disabled, the bean value is not appended
to the expression matched against the list of rules (slightly different
behaviour from before, but toggleable per rule), which allows caching on
the bean name alone, whatever the value.
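
The effect on the cache key can be sketched as follows. In the `recordBean` diff further down, the bean value is replaced with a `<cache>` placeholder when the rule has `cache` enabled; the helper below mirrors that string construction in isolation. The class and method names (`MatchKeySketch`, `matchKey`) are hypothetical and exist only for this example.

```java
// Hypothetical helper mirroring how the match string is built in recordBean.
class MatchKeySketch {
    // Placeholder used instead of the live bean value when caching is enabled.
    static final String CACHE_PLACEHOLDER = "<cache>";

    static String matchKey(String beanName, String attrName, Object beanValue, boolean cacheEnabled) {
        // With caching on, the key must not depend on a value that changes between scrapes.
        Object valuePart = cacheEnabled ? CACHE_PLACEHOLDER : beanValue;
        return beanName + attrName + ": " + valuePart;
    }
}
```

With caching enabled, two scrapes that read different values for the same attribute produce the same key, so the cached rule is found again. With caching disabled, the value is part of the string and a pattern can still match on it.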

Signed-off-by: Flavien Raynaud <flavr@yelp.com>
flavray authored and qinghui-xu committed Sep 18, 2020
1 parent 767d205 commit 4ec9bb8
Showing 5 changed files with 297 additions and 60 deletions.
28 changes: 15 additions & 13 deletions README.md
@@ -51,30 +51,32 @@ rules:
valueFactor: 0.001
labels: {}
help: "Cassandra metric $1 $2"
cache: false
type: GAUGE
attrNameSnakeCase: false
```
Name | Description
---------|------------
startDelaySeconds | start delay before serving requests. Any requests within the delay period will result in an empty metrics set.
hostPort | The host and port to connect to via remote JMX. If neither this nor jmxUrl is specified, will talk to the local JVM.
username | The username to be used in remote JMX password authentication.
password | The password to be used in remote JMX password authentication.
jmxUrl | A full JMX URL to connect to. Should not be specified if hostPort is.
ssl | Whether JMX connection should be done over SSL. To configure certificates you have to set following system properties:<br/>`-Djavax.net.ssl.keyStore=/home/user/.keystore`<br/>`-Djavax.net.ssl.keyStorePassword=changeit`<br/>`-Djavax.net.ssl.trustStore=/home/user/.truststore`<br/>`-Djavax.net.ssl.trustStorePassword=changeit`
hostPort | The host and port to connect to via remote JMX. If neither this nor jmxUrl is specified, will talk to the local JVM.
username | The username to be used in remote JMX password authentication.
password | The password to be used in remote JMX password authentication.
jmxUrl | A full JMX URL to connect to. Should not be specified if hostPort is.
ssl | Whether JMX connection should be done over SSL. To configure certificates you have to set following system properties:<br/>`-Djavax.net.ssl.keyStore=/home/user/.keystore`<br/>`-Djavax.net.ssl.keyStorePassword=changeit`<br/>`-Djavax.net.ssl.trustStore=/home/user/.truststore`<br/>`-Djavax.net.ssl.trustStorePassword=changeit`
lowercaseOutputName | Lowercase the output metric name. Applies to default format and `name`. Defaults to false.
lowercaseOutputLabelNames | Lowercase the output metric label names. Applies to default format and `labels`. Defaults to false.
whitelistObjectNames | A list of [ObjectNames](http://docs.oracle.com/javase/6/docs/api/javax/management/ObjectName.html) to query. Defaults to all mBeans.
blacklistObjectNames | A list of [ObjectNames](http://docs.oracle.com/javase/6/docs/api/javax/management/ObjectName.html) to not query. Takes precedence over `whitelistObjectNames`. Defaults to none.
rules | A list of rules to apply in order, processing stops at the first matching rule. Attributes that aren't matched aren't collected. If not specified, defaults to collecting everything in the default format.
pattern | Regex pattern to match against each bean attribute. The pattern is not anchored. Capture groups can be used in other options. Defaults to matching everything.
rules | A list of rules to apply in order, processing stops at the first matching rule. Attributes that aren't matched aren't collected. If not specified, defaults to collecting everything in the default format.
pattern | Regex pattern to match against each bean attribute. The pattern is not anchored. Capture groups can be used in other options. Defaults to matching everything.
attrNameSnakeCase | Converts the attribute name to snake case. This is seen in the names matched by the pattern and the default format. For example, anAttrName to an\_attr\_name. Defaults to false.
name | The metric name to set. Capture groups from the `pattern` can be used. If not specified, the default format will be used. If it evaluates to empty, processing of this attribute stops with no output.
value | Value for the metric. Static values and capture groups from the `pattern` can be used. If not specified the scraped mBean value will be used.
valueFactor | Optional number that `value` (or the scraped mBean value if `value` is not specified) is multiplied by, mainly used to convert mBean values from milliseconds to seconds.
labels | A map of label name to label value pairs. Capture groups from `pattern` can be used in each. `name` must be set to use this. Empty names and values are ignored. If not specified and the default format is not being used, no labels are set.
help | Help text for the metric. Capture groups from `pattern` can be used. `name` must be set to use this. Defaults to the mBean attribute description and the full name of the attribute.
type | The type of the metric, can be `GAUGE`, `COUNTER` or `UNTYPED`. `name` must be set to use this. Defaults to `UNTYPED`.
name | The metric name to set. Capture groups from the `pattern` can be used. If not specified, the default format will be used. If it evaluates to empty, processing of this attribute stops with no output.
value | Value for the metric. Static values and capture groups from the `pattern` can be used. If not specified the scraped mBean value will be used.
valueFactor | Optional number that `value` (or the scraped mBean value if `value` is not specified) is multiplied by, mainly used to convert mBean values from milliseconds to seconds.
labels | A map of label name to label value pairs. Capture groups from `pattern` can be used in each. `name` must be set to use this. Empty names and values are ignored. If not specified and the default format is not being used, no labels are set.
help | Help text for the metric. Capture groups from `pattern` can be used. `name` must be set to use this. Defaults to the mBean attribute description and the full name of the attribute.
cache | Whether to cache bean name expressions to rule computation (match and mismatch). Not recommended for rules matching on bean value, as only the value from the first scrape will be cached and re-used. This can increase performance when collecting a lot of mbeans. Defaults to `false`.
type | The type of the metric, can be `GAUGE`, `COUNTER` or `UNTYPED`. `name` must be set to use this. Defaults to `UNTYPED`.

Metric names and label names are sanitized. All characters other than `[a-zA-Z0-9:_]` are replaced with underscores,
and adjacent underscores are collapsed. There are no limitations on label values or the help text.
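
The sanitization rule stated above can be sketched in a few lines of Java. This is an illustration written from the README wording only, not the exporter's own code, which may differ in detail; `NameSanitizerSketch` and `sanitize` are hypothetical names.

```java
import java.util.regex.Pattern;

// Hypothetical sketch of the sanitization described in the README text.
class NameSanitizerSketch {
    // Any character outside [a-zA-Z0-9:_] is considered unsafe.
    private static final Pattern UNSAFE_CHARS = Pattern.compile("[^a-zA-Z0-9:_]");
    // Two or more underscores in a row.
    private static final Pattern REPEATED_UNDERSCORES = Pattern.compile("__+");

    static String sanitize(String name) {
        // Step 1: replace each unsafe character with an underscore.
        String replaced = UNSAFE_CHARS.matcher(name).replaceAll("_");
        // Step 2: collapse runs of underscores into a single one.
        return REPEATED_UNDERSCORES.matcher(replaced).replaceAll("_");
    }
}
```

For example, `sanitize("a..b--c")` first becomes `a__b__c` and then `a_b_c`, while a name such as `jvm:heap_used` is returned unchanged.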
154 changes: 115 additions & 39 deletions collector/src/main/java/io/prometheus/jmx/JmxCollector.java
@@ -4,8 +4,6 @@
import io.prometheus.client.Counter;
import org.yaml.snakeyaml.Yaml;

import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
@@ -24,6 +22,8 @@
import java.util.logging.Logger;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.management.MalformedObjectNameException;
import javax.management.ObjectName;

import static java.lang.String.format;

@@ -38,13 +38,14 @@ public class JmxCollector extends Collector implements Collector.Describable {

private static final Logger LOGGER = Logger.getLogger(JmxCollector.class.getName());

private static class Rule {
static class Rule {
Pattern pattern;
String name;
String value;
Double valueFactor = 1.0;
String help;
boolean attrNameSnakeCase;
boolean cache = false;
Type type = Type.UNTYPED;
ArrayList<String> labelNames;
ArrayList<String> labelValues;
@@ -62,6 +63,8 @@ private static class Config {
List<ObjectName> blacklistObjectNames = new ArrayList<ObjectName>();
List<Rule> rules = new ArrayList<Rule>();
long lastUpdate = 0L;

MatchedRulesCache rulesCache;
}

private Config config;
@@ -77,7 +80,7 @@ public JmxCollector(File in) throws IOException, MalformedObjectNameException {
}

public JmxCollector(String yamlConfig) throws MalformedObjectNameException {
config = loadConfig((Map<String, Object>)new Yaml().load(yamlConfig));
config = loadConfig((Map<String, Object>)new Yaml().load(yamlConfig));
}

public JmxCollector(InputStream inputStream) throws MalformedObjectNameException {
@@ -106,7 +109,19 @@ private void reloadConfig() {
}
}

private Config loadConfig(Map<String, Object> yamlConfig) throws MalformedObjectNameException {
private synchronized Config getLatestConfig() {
if (configFile != null) {
long mtime = configFile.lastModified();
if (mtime > config.lastUpdate) {
LOGGER.fine("Configuration file changed, reloading...");
reloadConfig();
}
}

return config;
}

private Config loadConfig(Map<String, Object> yamlConfig) throws MalformedObjectNameException {
Config cfg = new Config();

if (yamlConfig == null) { // Yaml config empty, set config to empty map.
@@ -165,7 +180,7 @@ private Config loadConfig(Map<String, Object> yamlConfig) throws MalformedObject
}
}

if (yamlConfig.containsKey("rules")) {
if (yamlConfig.containsKey("rules")) {
List<Map<String,Object>> configRules = (List<Map<String,Object>>) yamlConfig.get("rules");
for (Map<String, Object> ruleObject : configRules) {
Map<String, Object> yamlRule = ruleObject;
@@ -191,6 +206,9 @@ private Config loadConfig(Map<String, Object> yamlConfig) throws MalformedObject
if (yamlRule.containsKey("attrNameSnakeCase")) {
rule.attrNameSnakeCase = (Boolean)yamlRule.get("attrNameSnakeCase");
}
if (yamlRule.containsKey("cache")) {
rule.cache = (Boolean)yamlRule.get("cache");
}
if (yamlRule.containsKey("type")) {
rule.type = Type.valueOf((String)yamlRule.get("type"));
}
@@ -220,6 +238,8 @@ private Config loadConfig(Map<String, Object> yamlConfig) throws MalformedObject
cfg.rules.add(new Rule());
}

cfg.rulesCache = new MatchedRulesCache(cfg.rules);

return cfg;

}
@@ -287,9 +307,15 @@ class Receiver implements JmxScraper.MBeanReceiver {
Map<String, MetricFamilySamples> metricFamilySamplesMap =
new HashMap<String, MetricFamilySamples>();

private static final char SEP = '_';
Config config;
MatchedRulesCache.StalenessTracker stalenessTracker;

private static final char SEP = '_';

Receiver(Config config, MatchedRulesCache.StalenessTracker stalenessTracker) {
this.config = config;
this.stalenessTracker = stalenessTracker;
}

// [] and () are special in regexes, so switch to <>.
private String angleBrackets(String s) {
@@ -307,13 +333,24 @@ void addSample(MetricFamilySamples.Sample sample, Type type, String help) {
mfs.samples.add(sample);
}

private void defaultExport(
// Add the matched rule to the cached rules and tag it as not stale
// if the rule is configured to be cached
private void addToCache(final Rule rule, final String cacheKey, final MatchedRule matchedRule) {
if (rule.cache) {
config.rulesCache.put(rule, cacheKey, matchedRule);
stalenessTracker.add(rule, cacheKey);
}
}

private MatchedRule defaultExport(
String matchName,
String domain,
LinkedHashMap<String, String> beanProperties,
LinkedList<String> attrKeys,
String attrName,
String help,
Object value,
Double value,
double valueFactor,
Type type) {
StringBuilder name = new StringBuilder();
name.append(domain);
@@ -350,8 +387,7 @@ private void defaultExport(
}
}

addSample(new MetricFamilySamples.Sample(fullname, labelNames, labelValues, ((Number)value).doubleValue()),
type, help);
return new MatchedRule(fullname, matchName, type, help, labelNames, labelValues, value, valueFactor);
}

public void recordBean(
@@ -368,40 +404,55 @@ public void recordBean(
String help = attrDescription + " (" + beanName + attrName + ")";
String attrNameSnakeCase = toSnakeAndLowerCase(attrName);

MatchedRule matchedRule = MatchedRule.unmatched();

for (Rule rule : config.rules) {
// Rules with bean values cannot be properly cached (only the value from the first scrape will be cached).
// If caching for the rule is enabled, replace the value with a dummy <cache> to avoid caching different values at different times.
Object matchBeanValue = rule.cache ? "<cache>" : beanValue;

String matchName = beanName + (rule.attrNameSnakeCase ? attrNameSnakeCase : attrName) + ": " + matchBeanValue;

if (rule.cache) {
MatchedRule cachedRule = config.rulesCache.get(rule, matchName);
if (cachedRule != null) {
stalenessTracker.add(rule, matchName);
if (cachedRule.isMatched()) {
matchedRule = cachedRule;
break;
}

// The bean was cached earlier, but did not match the current rule.
// Skip it to avoid matching against the same pattern again
continue;
}
}

Matcher matcher = null;
String matchName = beanName + (rule.attrNameSnakeCase ? attrNameSnakeCase : attrName);
if (rule.pattern != null) {
matcher = rule.pattern.matcher(matchName + ": " + beanValue);
matcher = rule.pattern.matcher(matchName);
if (!matcher.matches()) {
addToCache(rule, matchName, MatchedRule.unmatched());
continue;
}
}

Number value;
Double value = null;
if (rule.value != null && !rule.value.isEmpty()) {
String val = matcher.replaceAll(rule.value);

try {
beanValue = Double.valueOf(val);
value = Double.valueOf(val);
} catch (NumberFormatException e) {
LOGGER.fine("Unable to parse configured value '" + val + "' to number for bean: " + beanName + attrName + ": " + beanValue);
return;
}
}
if (beanValue instanceof Number) {
value = ((Number)beanValue).doubleValue() * rule.valueFactor;
} else if (beanValue instanceof Boolean) {
value = (Boolean)beanValue ? 1 : 0;
} else {
LOGGER.fine("Ignoring unsupported bean: " + beanName + attrName + ": " + beanValue);
return;
}

// If there's no name provided, use default export format.
if (rule.name == null) {
defaultExport(domain, beanProperties, attrKeys, rule.attrNameSnakeCase ? attrNameSnakeCase : attrName, help, value, rule.type);
return;
matchedRule = defaultExport(matchName, domain, beanProperties, attrKeys, rule.attrNameSnakeCase ? attrNameSnakeCase : attrName, help, value, rule.valueFactor, rule.type);
addToCache(rule, matchName, matchedRule);
break;
}

// Matcher is set below here due to validation in the constructor.
@@ -437,30 +488,48 @@ public void recordBean(
}
} catch (Exception e) {
throw new RuntimeException(
format("Matcher '%s' unable to use: '%s' value: '%s'", matcher, unsafeLabelName, labelValReplacement), e);
format("Matcher '%s' unable to use: '%s' value: '%s'", matcher, unsafeLabelName, labelValReplacement), e);
}
}
}

// Add to samples.
LOGGER.fine("add metric sample: " + name + " " + labelNames + " " + labelValues + " " + value.doubleValue());
addSample(new MetricFamilySamples.Sample(name, labelNames, labelValues, value.doubleValue()), rule.type, help);
matchedRule = new MatchedRule(name, matchName, rule.type, help, labelNames, labelValues, value, rule.valueFactor);
addToCache(rule, matchName, matchedRule);
break;
}

if (matchedRule.isUnmatched()) {
return;
}
}

}
Number value;
if (matchedRule.value != null) {
beanValue = matchedRule.value;
}

public List<MetricFamilySamples> collect() {
if (configFile != null) {
long mtime = configFile.lastModified();
if (mtime > config.lastUpdate) {
LOGGER.fine("Configuration file changed, reloading...");
reloadConfig();
if (beanValue instanceof Number) {
value = ((Number) beanValue).doubleValue() * matchedRule.valueFactor;
} else if (beanValue instanceof Boolean) {
value = (Boolean) beanValue ? 1 : 0;
} else {
LOGGER.fine("Ignoring unsupported bean: " + beanName + attrName + ": " + beanValue);
return;
}

// Add to samples.
LOGGER.fine("add metric sample: " + matchedRule.name + " " + matchedRule.labelNames + " " + matchedRule.labelValues + " " + value.doubleValue());
addSample(new MetricFamilySamples.Sample(matchedRule.name, matchedRule.labelNames, matchedRule.labelValues, value.doubleValue()), matchedRule.type, help);
}

Receiver receiver = new Receiver();
}

public List<MetricFamilySamples> collect() {
// Take a reference to the current config and collect with this one
// (to avoid race conditions in case another thread reloads the config in the meantime)
Config config = getLatestConfig();

MatchedRulesCache.StalenessTracker stalenessTracker = new MatchedRulesCache.StalenessTracker();
Receiver receiver = new Receiver(config, stalenessTracker);
JmxScraper scraper = new JmxScraper(config.jmxUrl, config.username, config.password, config.ssl,
config.whitelistObjectNames, config.blacklistObjectNames, receiver, jmxMBeanPropertyCache);
long start = System.nanoTime();
@@ -477,6 +546,8 @@ public List<MetricFamilySamples> collect() {
e.printStackTrace(new PrintWriter(sw));
LOGGER.severe("JMX scrape failed: " + sw.toString());
}
config.rulesCache.evictStaleEntries(stalenessTracker);

List<MetricFamilySamples> mfsList = new ArrayList<MetricFamilySamples>();
mfsList.addAll(receiver.metricFamilySamplesMap.values());
List<MetricFamilySamples.Sample> samples = new ArrayList<MetricFamilySamples.Sample>();
@@ -488,13 +559,18 @@ public List<MetricFamilySamples> collect() {
samples.add(new MetricFamilySamples.Sample(
"jmx_scrape_error", new ArrayList<String>(), new ArrayList<String>(), error));
mfsList.add(new MetricFamilySamples("jmx_scrape_error", Type.GAUGE, "Non-zero if this scrape failed.", samples));
samples = new ArrayList<MetricFamilySamples.Sample>();
samples.add(new MetricFamilySamples.Sample(
"jmx_scrape_cached_beans", new ArrayList<String>(), new ArrayList<String>(), stalenessTracker.cachedCount()));
mfsList.add(new MetricFamilySamples("jmx_scrape_cached_beans", Type.GAUGE, "Number of beans with their matching rule cached", samples));
return mfsList;
}

public List<MetricFamilySamples> describe() {
List<MetricFamilySamples> sampleFamilies = new ArrayList<MetricFamilySamples>();
sampleFamilies.add(new MetricFamilySamples("jmx_scrape_duration_seconds", Type.GAUGE, "Time this JMX scrape took, in seconds.", new ArrayList<MetricFamilySamples.Sample>()));
sampleFamilies.add(new MetricFamilySamples("jmx_scrape_error", Type.GAUGE, "Non-zero if this scrape failed.", new ArrayList<MetricFamilySamples.Sample>()));
sampleFamilies.add(new MetricFamilySamples("jmx_scrape_cached_beans", Type.GAUGE, "Number of beans with their matching rule cached", new ArrayList<MetricFamilySamples.Sample>()));
return sampleFamilies;
}
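
The `collect()` changes above create a fresh `StalenessTracker` for each scrape and call `evictStaleEntries` afterwards, but the source of `MatchedRulesCache` is not shown on this page. The sketch below is therefore only a guess at the general technique: record which keys were seen during a scrape, then drop every cached entry that was not seen. All names (`StaleEvictionSketch`, `SeenKeys`, `evictStale`) are hypothetical, and the real cache is keyed per rule, which this simplified version ignores.

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch of per-scrape staleness tracking; not the real MatchedRulesCache.
class StaleEvictionSketch {
    private final Map<String, String> cache = new ConcurrentHashMap<String, String>();

    void put(String key, String value) { cache.put(key, value); }
    String get(String key) { return cache.get(key); }
    int size() { return cache.size(); }

    // Collects the cache keys touched during a single scrape.
    static class SeenKeys {
        private final Set<String> seen = new HashSet<String>();
        void add(String key) { seen.add(key); }
        boolean contains(String key) { return seen.contains(key); }
        int count() { return seen.size(); }
    }

    // Removes every entry whose key was not seen in the scrape that just finished,
    // so beans that disappeared from JMX do not stay cached forever.
    void evictStale(SeenKeys seenThisScrape) {
        Iterator<String> keys = cache.keySet().iterator();
        while (keys.hasNext()) {
            if (!seenThisScrape.contains(keys.next())) {
                keys.remove();
            }
        }
    }
}
```

Creating one tracker per scrape, as `collect()` does, keeps concurrent scrapes from interfering with each other's view of which entries are still live.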
