Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

topdown: Pass metrics to built-ins and record http.send latency #2629

Merged
merged 1 commit into from
Aug 14, 2020

Conversation

tsandall
Copy link
Member

Previously the built-in functions had no way to record metrics for
performance monitoring or other purposes. With this change, built-in
functions can manipulate the evaluation metrics. Initially, only the
http.send function has been updated to report latency.

Fixes #2034

Signed-off-by: Torin Sandall torinsandall@gmail.com

patrick-east
patrick-east previously approved these changes Aug 13, 2020
topdown/http.go Outdated
var (
allowedKeys = ast.NewSet()
requiredKeys = ast.NewSet(ast.StringTerm("method"), ast.StringTerm("url"))
httpSendLatencyMetricKey = "rego_" + ast.HTTPSend.Name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to define a shared builtin key prefix or something? Could maybe have a helper on the BuiltinContext or something.

🤔 That being said maybe we do it when we have >1, for a single metric it doesn't really matter too much. I just want to make sure we keep them uniform since they are (sort of) user facing names.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed to rego_builtin_http.send.

@@ -64,6 +67,8 @@ const httpSendBuiltinCacheKey httpSendKey = "HTTP_SEND_CACHE_KEY"

func builtinHTTPSend(bctx BuiltinContext, args []*ast.Term, iter func(*ast.Term) error) error {

bctx.Metrics.Timer(httpSendLatencyMetricKey).Start()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts around having a metric for cache hits too? As-is on subsequent calls the time would still be accumulated in here too, but it might be hard for someone to tell how much time the actual http call took versus http call plus some (potentially unknown?) number of cache hits that are accumulated in this metric.

I wonder too if we would want a mechanism to allow for providing more details about what the metric is, like.. if there are two http.send calls in some policy for different URLs or something it may not make sense to add their times together under a single timer. The tricky thing is that we dont always want to shove like the full call context into the name of the metric.. but having some way to map it back to it might be nice. Not really sure I have a great suggestion though. Brainstorming a little bit.. maybe we have an ID or something to map the metric to trace node? I guess maybe it would require having some like metadata or something to tag on the metric.

Again though.. maybe we cross that bridge when we get there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts around having a metric for cache hits too?

I was assuming that cache hit latency would be negligible compared to cache miss latency. From my point of view, the current implementation will be useful for telling when the query is CPU vs. IO bound. That's one of the main issues I've encountered with policies that use http.send--when you run into performance problems, you want to divide-and-conquer and quickly identifying whether it was an http.send seems useful. I don't think this is going to useful for profiling individual http.send callls. LMKWYT.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works for me 👍

Previously the built-in functions had no way to record metrics for
performance monitoring or other purposes. With this change, built-in
functions can manipulate the evaluation metrics. Initially, only the
http.send function has been updated to report latency.

Fixes open-policy-agent#2034

Signed-off-by: Torin Sandall <torinsandall@gmail.com>
@tsandall tsandall merged commit 6507fa4 into open-policy-agent:master Aug 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve query eval metrics to report http.send latency
3 participants