Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a separate flag for 'start' parameter #136

Merged
merged 2 commits into from
Dec 4, 2018

Conversation

nilebox
Copy link
Contributor

@nilebox nilebox commented Nov 20, 2018

Make a separate flag metrics-max-age for duration to be used for start parameter sent to Prometheus, determining the set of available metrics appeared during a specified period. This will allow us to update a list of available metrics frequently (e.g. 30s or less) without running into flakiness of appearing/disappearing metrics.

Fixes #135

@nilebox nilebox force-pushed the start-flag branch 2 times, most recently from 5ba0ad0 to 73cc357 Compare November 20, 2018 14:08
@DirectXMan12
Copy link
Contributor

Can you add something like the description of #135 to the description of this PR? Someone doing git log should be able to easily see what this is about

@nilebox
Copy link
Contributor Author

nilebox commented Nov 28, 2018

@DirectXMan12 done, please review.

Copy link
Contributor

@DirectXMan12 DirectXMan12 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally on board with this, couple comments inline

@@ -53,6 +53,8 @@ type PrometheusAdapter struct {
AdapterConfigFile string
// MetricsRelistInterval is the interval at which to relist the set of available metrics
MetricsRelistInterval time.Duration
// MetricsStartDuration is the period to query available metrics for
MetricsStartDuration time.Duration
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this name is not immediately obvious. Perhaps max-metrics-age or something

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to metrics-max-age to be consistent with existing metrics-relist-interval flag name.

@@ -83,6 +85,8 @@ func (cmd *PrometheusAdapter) addFlags() {
"and custom metrics API resources")
cmd.Flags().DurationVar(&cmd.MetricsRelistInterval, "metrics-relist-interval", cmd.MetricsRelistInterval, ""+
"interval at which to re-list the set of all available metrics from Prometheus")
cmd.Flags().DurationVar(&cmd.MetricsStartDuration, "metrics-start-duration", cmd.MetricsStartDuration, ""+
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we default this to something sane based the metrics-relist-interval so that we don't confuse people who are expecting the existing behavior? e.g. if someone sets metrics-relist-interval to 20 minutes, they probably don't expect the max metrics age to be 10 minutes, and if someone sets it to 30 seconds, they probably also don't expect the age to be 10 minutes. Maybe 2 * relist interval by default or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed that I can make it longer than metrics-relist-interval by default, i.e. 20m. That would suggest that max age should be at least twice as high as relist interval.

Why setting max age to 10m with relist interval 30s is bad though? I think it's perfectly fine actually :) e.g. even if Prometheus is configured to be super slow and refreshes metrics once every 10m, I still don't want to have a potential delay of 9m:59s once a new metric is discovered by Prometheus, so I will set it to 30s.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DirectXMan12 I can probably add a validation code that would require the max-age >= 2 * relist-interval?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no validation needed. My main concern was that it means metrics that are no longer valid will show up in discovery for a lot longer. If you have things coming and going quickly, this could lead to a lot more metrics to process during discovery than you intend. It's not a big deal here, though, and we can always follow up later.

@@ -105,6 +109,10 @@ func (cmd *PrometheusAdapter) makeProvider(promClient prom.Client, stopCh <-chan
return nil, nil
}

if cmd.MetricsMaxAge < cmd.MetricsRelistInterval {
return nil, fmt.Errorf("max age must not be less than relist interval")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DirectXMan12 added a validation, but without the 2 * multiplier, as I think it's a bit too opinionated.

e.g. if someone sets a relist interval to 5s, while Prometheus itself (internally) has an interval of 1m, even setting max age to 10s won't help, so users need to understand what they are doing.

@nilebox
Copy link
Contributor Author

nilebox commented Nov 28, 2018

This PR should be good to merge now.

@DirectXMan12
Copy link
Contributor

LGTM 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants