Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Errrs when using custom metrics without MetricTemplate #1609

Closed
timsanwald opened this issue Mar 6, 2024 · 3 comments · Fixed by #1611
Closed

Errrs when using custom metrics without MetricTemplate #1609

timsanwald opened this issue Mar 6, 2024 · 3 comments · Fixed by #1611
Labels
kind/bug Something isn't working

Comments

@timsanwald
Copy link

Describe the bug

Custom metrics directly in Canary without the use of MetricTemplate is not working anymore.
If theres not a single use of the predefined metrics ("request-duration" and "request-success-rate")

We are getting the error introduced with the following commit 16f8e15

Why is it required to use the MetricTemplate function now, instead of directly addressing the e.g. query? I couldn't find any changelog noting the breaking change there at least.

spec:
analysis:
interval: 60s
iterations: 5
metrics:
- name: canary-health-metrics
query: >-
sum(logback_events_total{level="error", job="some-app"}) <=
bool 0
thresholdRange:
min: 1

{"level":"error","ts":"2024-03-06T14:36:57.883Z","caller":"controller/events.go:39","msg":"Metric query failed for no usable metrics template were configured","canary":"redacted.namespace","stacktrace":"github.com/fluxcd/flagger/pkg/controller.(*Controller).recordEventErrorf\n\t/workspace/pkg/controller/events.go:39\ngit.luolix.top/fluxcd/flagger/pkg/controller.(*Controller).runMetricChecks\n\t/workspace/pkg/controller/scheduler_metrics.go:310\ngit.luolix.top/fluxcd/flagger/pkg/controller.(*Controller).runAnalysis\n\t/workspace/pkg/controller/scheduler.go:744\ngit.luolix.top/fluxcd/flagger/pkg/controller.(*Controller).advanceCanary\n\t/workspace/pkg/controller/scheduler.go:433\ngit.luolix.top/fluxcd/flagger/pkg/controller.CanaryJob.Start.func1\n\t/workspace/pkg/controller/job.go:39"}

To Reproduce

Didn't had time until now to reproduce, just my observation now.

Expected behavior

A clear and concise description of what you expected to happen.

Additional context

  • Flagger version: 1.36.0
  • Kubernetes version: 1.27.9
  • Service Mesh provider: istio
  • Ingress provider: istio
@stefanprodan
Copy link
Member

stefanprodan commented Mar 6, 2024

We shouldn't error out if the query field is present.

@LiZhenCheng9527
Copy link
Contributor

LiZhenCheng9527 commented Mar 7, 2024

else if metric.Name != "request-success-rate" && metric.Name != "request-duration" {
	c.recordEventErrorf(canary, "Metric query failed for no usable metrics template were configured")
	return false
}

} else if metric.Name != "request-success-rate" && metric.Name != "request-duration" {

It's because I only did TemplateRef and BuiltinMetric checks and didn't consider Metric.query. Because the description of query in the canary API is ready to be deprecated.

Do I need to add the relevant checks now? Or is it better to encourage users to use TemplateRef and abandon query?
@stefanprodan

@stefanprodan
Copy link
Member

@LiZhenCheng9527 we can't drop deprecated fields unless we bump the API version. To fix the breaking change now, we should look for query and if it exists, we shouldn't error out.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants