-
Notifications
You must be signed in to change notification settings - Fork 68
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: Add new metric request_throughput
#619
base: main
Are you sure you want to change the base?
Conversation
… throughput of slow queries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for opening!
The only thing I'm not sure about is the cutoff in milliseconds. I don't think that's what the original issue discussed, can you double-check?
It would also be nice to see some tests, the logic is getting a bit more complicated and shouldn't stay untested.
slow_request_server_throughput
slow_request_throughput
Extract header parsing into a dedicated function.
…tMetricHistogram()`
slow_request_throughput
request_throughput
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mostly LGTM, although looking at the Server-Timing spec, i think we're missing some cases in how it can be formatted.
{"WithoutSleep", false, "unit=0, other_unit=2", false}, | ||
{"WithoutSleep", true, "", false}, | ||
{"WithoutSleep", false, "", false}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you update the names of these tests?
expected int64 | ||
err bool | ||
}{ | ||
{"key0=0, key1=1", "key0", 0, false}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you add keys to each struct initialization?
{"key0=0, key1=1", "key0", 0, false}, | |
{header: "key0=0, key1=1", key: "key0", expected: 0, err: false}, |
maybe putting them on separate lines might make this more readable
{"key0=0, key1=1", "key0", 0, false}, | |
{ | |
header: "key0=0, key1=1", | |
key: "key0", | |
expected: 0, | |
err: false, | |
}, |
if value != tt.expected { | ||
t.Errorf("expected value: %d, got: %d", tt.expected, value) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we normally use testify for these kind of comparions:
if value != tt.expected { | |
t.Errorf("expected value: %d, got: %d", tt.expected, value) | |
} | |
assert.Equal(t, tt.expected, value) |
it reduces the boilerplate
@@ -209,6 +216,8 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) { | |||
f.StringVar(&cfg.LogRequestExcludeHeadersList, "server.log-request-headers-exclude-list", "", "Comma separated list of headers to exclude from loggin. Only used if server.log-request-headers is true.") | |||
f.BoolVar(&cfg.LogRequestAtInfoLevel, "server.log-request-at-info-level-enabled", false, "Optionally log requests at info level instead of debug level. Applies to request headers as well if server.log-request-headers is enabled.") | |||
f.BoolVar(&cfg.ProxyProtocolEnabled, "server.proxy-protocol-enabled", false, "Enables PROXY protocol.") | |||
f.DurationVar(&cfg.Throughput.RequestCutoff, "server.throughput.request-cutoff", 0, "Duration after which a request will be observed. For requests taking longer than this duration to finish, the throughput will be calculated. If set to 0, the throughput will not be calculated.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when reading this in the context of all the config in server.Config
I am not sure what request cutoff does. Can you clarify how it interacts with the Server-Timing
header, the unit, and the request_throughput
histogram?
@@ -209,6 +216,8 @@ func (cfg *Config) RegisterFlags(f *flag.FlagSet) { | |||
f.StringVar(&cfg.LogRequestExcludeHeadersList, "server.log-request-headers-exclude-list", "", "Comma separated list of headers to exclude from loggin. Only used if server.log-request-headers is true.") | |||
f.BoolVar(&cfg.LogRequestAtInfoLevel, "server.log-request-at-info-level-enabled", false, "Optionally log requests at info level instead of debug level. Applies to request headers as well if server.log-request-headers is enabled.") | |||
f.BoolVar(&cfg.ProxyProtocolEnabled, "server.proxy-protocol-enabled", false, "Enables PROXY protocol.") | |||
f.DurationVar(&cfg.Throughput.RequestCutoff, "server.throughput.request-cutoff", 0, "Duration after which a request will be observed. For requests taking longer than this duration to finish, the throughput will be calculated. If set to 0, the throughput will not be calculated.") | |||
f.StringVar(&cfg.Throughput.Unit, "server.throughput.unit", "total_samples", "Unit of the server throughput metric, for example 'processed_bytes' or 'total_samples'. Observed values will be gathered from Server-Timing header with defined key. If set, it is appended to the request_server_throughput metric name.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one more thing - can you given an example value for the header and also mention that it's expected it is an integer value? Server-Timing: total_sample=20
@@ -73,5 +75,15 @@ func NewServerMetrics(cfg Config) *Metrics { | |||
Name: "inflight_requests", | |||
Help: "Current number of inflight requests.", | |||
}, []string{"method", "route"}), | |||
RequestThroughput: reg.NewHistogramVec(prometheus.HistogramOpts{ | |||
Namespace: cfg.MetricsNamespace, | |||
Name: "slow_request_throughput_" + cfg.Throughput.Unit, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you suggested we change this to request_throughput_samples_total
, are you still making the change?
}{ | ||
{"key0=0, key1=1", "key0", 0, false}, | ||
{"key0=0, key1=1", "key1", 1, false}, | ||
{"key0=0, key1=1", "key2", 0, true}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i checked the format for Server-Timing
according to mozilla https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Server-Timing
can you add a few more test cases to make sure we're handling all of them correctly? Maybe one with dur
, one with a ;
, one with repeated keys.
Name: "request_duration_seconds", | ||
Help: "Time (in seconds) spent serving HTTP requests.", | ||
Buckets: instrument.DefBuckets, | ||
NativeHistogramBucketFactor: metricsNativeHistogramFactor, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need the native histogram config for all of these histograms? it's only test and we only export the text-based format, which doesn't include native histograms.
I'm trying to keep the function as simple as possible.
What this PR does:
This PR adds two configuration parameters
server.throughput-config.slow-request-cutoff
andserver.throughput-config.unit
, exposes a new metricslow_request_server_throughput
and calculates throughput inunits/s
for the metric using information from headerServer-Timing
.If
server.throughput-config.slow-request-cutoff
is0
no throughput will be calculated.Implemented code is really similar to this branch by @krajorama. But adds some flexibility to measure throughput based on different signals. It will default to
total_samples
as processed samples are easier to explain to users. Discussed with @dimitarvdimitrov.Checklist
CHANGELOG.md
updated - the order of entries should be[CHANGE]
,[FEATURE]
,[ENHANCEMENT]
,[BUGFIX]