-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add command name to count metrics #437
Conversation
private final MeterRegistry meterRegistry; | ||
|
||
/** The tag for error being true, created only once. */ | ||
private final Tag errorTag = Tag.of("error", "true"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These can probably be static
? (unlike injected things like meterRegistry)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add it. The class is marked application scope so only one instance is maintained.
// Add metrics | ||
Tags tags = Tags.of(apiTag, commandTag, statusCodeTag, errorTag); | ||
// record | ||
meterRegistry.counter("http_server_requests_custom_seconds_count", tags).increment(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if there's a good place to define these constants? If not, inlining is fine.
(looking down looks like we have MetricsConstants
added so maybe add there)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will add it to the Constants
* @param responseContext {@link ContainerResponseContext} | ||
*/ | ||
@ServerResponseFilter | ||
public void record( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to use different name even if record
is not technically a reserved word in Java?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will change it to recordMetric
// reset the stream to fetch from beginning of stream | ||
inputStream.reset(); | ||
// get body from requestContext | ||
final Iterator<String> fieldIterator = objectMapper.readTree(inputStream).fieldNames(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is rather expensive, parsing the whole JSON contents. Is there no other way? It also requires deeper knowledge of payload than should be needed in a filter.
I guess feasibility depends on whether this is called for each and every command: if it was only for errors it'd be less problematic.
But I guess our choice to NOT use URL REST-style is making this way more challenging than it should be...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, we can actually avoid most of decoding, although do need to stream through content more than once.
Since we only need to see:
{ "command-name"
it is possible to create JsonParser
using ObjectMapper
and use nextToken()
to look only at first 2 tokens: they must be:
JsonToken.START_OBJECT
JsonToken.FIELD_NAME
and after (2), can call jsonParser.currentName()
to get command name we need.
After that JsonParser
should be close()
d (frees/recycles buffers).
This would remove most of processing overhead, as long as request payload is buffered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will make the change
@ServerResponseFilter | ||
public void record( | ||
ContainerRequestContext requestContext, ContainerResponseContext responseContext) { | ||
String commandName = getCommandName(requestContext.getEntityStream()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it guaranteed this stream can always reset()
reliably? And does it need to be reset after reading or before?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is done post command is run, so after reset is not needed. I will add a null check, if Stream object present reset won't fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some concerns about having to decode JSON for every command additional time, but that is probably necessary due to design of API. So I guess I am ok with that. There are some additional suggestions (wrt constant made static
), but more generally I think I'd like someone else (Ivan) to have a look.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to propose different solution, which has pros and cons, but definitely avoids doing that gymnastics in the filter, which imo can have negative performance for the requests..
Proposal
How about we don't count record on the HTTP level, but rather record the performance of the CommandProcessor
. There we could easily use MeterRegistry
directly and measure time for the command execution. We could include tags that we want and that we can take from the exceptions, context, tenant info provider, etc.
The implementation would be way easier and we would read information directly from the model objects, thus we would have way better performance.
The other libraries often have such metrics as well, and do report on in-flight messages, error rates, etc, so it wouldn't be anything unusual. For example, I know Axon has something similar for their command gateway.
The implementation can be very simple:
public <T extends Command> Uni<CommandResult> processCommand(
CommandContext commandContext, T command) {
Timer timer = meterRegistry.timer(...);
return commandResolverService
.resolverForCommand(command)
// resolver can be null, not handled in CommandResolverService for now
.flatMap(
resolver -> {
// if we have resolver, resolve operation and execute
Operation operation = resolver.resolveCommand(commandContext, command);
return operation.execute(queryExecutor);
})
.onItemOrFailure((item, ex) -> {
// capture metrics & handle result or exception
});
}
Pros over current solution
- can measure time and not only count
- does not manipulation of the response input stream
- can include tenant if available (can the current solution as well)
- can include detailed information on the error, for example error code or error class
- better performance
Cons over current solution
- no information on
4xx
response as they never reach the command gateway - measures only time in command processor (current solution does not measure time, but still we would not get complete HTTP time, but that's fine imo)
I would like to sync about this and thus putting Request changes until we agree on the approach..
@ivansenic I like this idea: access from model does give more information. Wrt 4xx responses; perhaps there could be different mechanism to deal with those? And if we only processed stream content (... if we must) for 4xx, it'd be less of performance concern as well. Or, alternatively, just have non-command-specific 4xx counts: not optimal but at least could alert. |
The 4xx would still be captured in the http metrics out of the box, so we would have info about those as well, just without the command name. @maheshrajamani and I agreed to go and record in the command processor and we agreed to pick up error codes, exception classes and even the info if error is caused by user or not. |
Excellent! Sounds like a plan. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I provided a lot of good suggestions, please have a look..
src/main/java/io/stargate/sgv2/jsonapi/api/v1/metrics/MetricsConstants.java
Outdated
Show resolved
Hide resolved
src/main/java/io/stargate/sgv2/jsonapi/api/v1/metrics/MetricsConstants.java
Outdated
Show resolved
Hide resolved
src/main/java/io/stargate/sgv2/jsonapi/api/v1/metrics/MetricsConstants.java
Outdated
Show resolved
Hide resolved
src/main/java/io/stargate/sgv2/jsonapi/api/v1/metrics/MetricsConstants.java
Outdated
Show resolved
Hide resolved
src/main/java/io/stargate/sgv2/jsonapi/api/v1/metrics/MetricsConstants.java
Outdated
Show resolved
Hide resolved
src/main/java/io/stargate/sgv2/jsonapi/service/processor/CommandProcessor.java
Outdated
Show resolved
Hide resolved
src/main/java/io/stargate/sgv2/jsonapi/service/processor/CommandProcessor.java
Outdated
Show resolved
Hide resolved
src/main/java/io/stargate/sgv2/jsonapi/service/processor/CommandProcessor.java
Outdated
Show resolved
Hide resolved
src/main/java/io/stargate/sgv2/jsonapi/service/processor/CommandProcessor.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please check what comments do make and adapt.. Approving to remove the need for wait..
What this PR does:
Add command name to count metrics
Which issue(s) this PR fixes:
Fixes #411
Checklist