Add command name to count metrics #437

maheshrajamani · 2023-05-23T18:31:21Z

What this PR does:
Add command name to count metrics

Which issue(s) this PR fixes:
Fixes #411

Checklist

Changes manually tested
Automated Tests added/updated
Documentation added/updated
CLA Signed: DataStax CLA

tatu-at-datastax · 2023-05-23T21:19:17Z

src/main/java/io/stargate/sgv2/jsonapi/api/security/ErrorChallengeSender.java

+  private final MeterRegistry meterRegistry;
+
+  /** The tag for error being true, created only once. */
+  private final Tag errorTag = Tag.of("error", "true");


These can probably be static? (unlike injected things like meterRegistry)

Will do. The class is application-scoped, so only one instance is ever created.

tatu-at-datastax · 2023-05-23T22:17:57Z

src/main/java/io/stargate/sgv2/jsonapi/api/security/ErrorChallengeSender.java

+      // Add metrics
+      Tags tags = Tags.of(apiTag, commandTag, statusCodeTag, errorTag);
+      // record
+      meterRegistry.counter("http_server_requests_custom_seconds_count", tags).increment();


Not sure if there's a good place to define these constants? If not, inlining is fine.
(Looking further down, it seems MetricsConstants was added, so maybe put them there.)

Will add it to MetricsConstants.

tatu-at-datastax · 2023-05-23T22:20:12Z

src/main/java/io/stargate/sgv2/jsonapi/api/v1/metrics/RequestMetricsFilter.java

+   * @param responseContext {@link ContainerResponseContext}
+   */
+  @ServerResponseFilter
+  public void record(


Would it make sense to use a different name, even though record is not technically a reserved word in Java?

Will change it to recordMetric

tatu-at-datastax · 2023-05-23T22:25:28Z

src/main/java/io/stargate/sgv2/jsonapi/api/v1/metrics/RequestMetricsFilter.java

+      // reset the stream to fetch from beginning of stream
+      inputStream.reset();
+      // get body from requestContext
+      final Iterator<String> fieldIterator = objectMapper.readTree(inputStream).fieldNames();


This is rather expensive: it parses the whole JSON content. Is there no other way? It also requires deeper knowledge of the payload than a filter should need.

I guess feasibility depends on whether this is called for each and every command: if it were only for errors, it would be less problematic.
But I guess our choice NOT to use REST-style URLs is making this far more challenging than it should be...

Ok, we can actually avoid most of the decoding, although we do need to stream through the content more than once.

Since we only need to see:

{ "command-name"

it is possible to create a JsonParser using ObjectMapper and use nextToken() to look only at the first 2 tokens. They must be:

1. JsonToken.START_OBJECT
2. JsonToken.FIELD_NAME

After (2), we can call jsonParser.currentName() to get the command name we need.
After that, the JsonParser should be close()d (this frees/recycles buffers).

This would remove most of the processing overhead, as long as the request payload is buffered.

Will make the change
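
The "look only at the first two tokens" idea can be illustrated without any JSON library. The following is a minimal, dependency-free sketch and a stand-in for the JsonParser approach described above, not the code from this PR: the class and method names are hypothetical, and it deliberately gives up (returns empty) when the field name contains escape sequences.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

/** Hypothetical helper: finds the first field name of a JSON object without parsing the rest. */
public final class CommandNamePeek {

  private CommandNamePeek() {}

  /** Scans only up to the end of the first field name; the remaining payload is never parsed. */
  public static Optional<String> firstFieldName(InputStream in) throws IOException {
    Reader reader = new InputStreamReader(in, StandardCharsets.UTF_8);
    int c = skipWhitespace(reader);
    if (c != '{') {
      return Optional.empty(); // first token must be the start of an object
    }
    c = skipWhitespace(reader);
    if (c != '"') {
      return Optional.empty(); // second token must be a field name
    }
    StringBuilder name = new StringBuilder();
    while ((c = reader.read()) != -1) {
      if (c == '"') {
        return Optional.of(name.toString());
      }
      if (c == '\\') {
        return Optional.empty(); // escaped names are out of scope for this sketch
      }
      name.append((char) c);
    }
    return Optional.empty(); // input ended inside the field name
  }

  /** Convenience overload for in-memory payloads. */
  public static Optional<String> firstFieldName(String json) {
    try {
      return firstFieldName(new ByteArrayInputStream(json.getBytes(StandardCharsets.UTF_8)));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Returns the first non-whitespace character, or -1 at end of input. */
  private static int skipWhitespace(Reader reader) throws IOException {
    int c;
    do {
      c = reader.read();
    } while (c == ' ' || c == '\t' || c == '\n' || c == '\r');
    return c;
  }
}
```

For a payload such as { "findOne": { "filter": {} } }, the scan stops right after findOne. A real implementation would more likely use the streaming parser suggested in the review, which also handles escapes and encodings properly.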

tatu-at-datastax · 2023-05-23T22:27:44Z

src/main/java/io/stargate/sgv2/jsonapi/api/v1/metrics/RequestMetricsFilter.java

+  @ServerResponseFilter
+  public void record(
+      ContainerRequestContext requestContext, ContainerResponseContext responseContext) {
+    String commandName = getCommandName(requestContext.getEntityStream());


Is it guaranteed that this stream can always reset() reliably? And does it need to be reset after reading, or before?

This runs after the command has executed, so a reset after reading is not needed. I will add a null check; if the stream object is present, reset() won't fail.
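
As background to the reset() question: in plain java.io, reset() is only dependable when markSupported() returns true and mark() was called first. The sketch below shows that contract using only the standard library. The class and method names are hypothetical, and whether the stream returned by getEntityStream() supports mark/reset depends on the runtime, which this thread does not settle.

```java
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper showing the mark/reset contract of java.io.InputStream. */
public final class RewindableBody {

  private RewindableBody() {}

  /** Wraps the stream in a BufferedInputStream when it does not support mark/reset itself. */
  public static InputStream rewindable(InputStream in) {
    return in.markSupported() ? in : new BufferedInputStream(in);
  }

  /**
   * Reads up to limit bytes and then rewinds, so a later reader still sees the whole body.
   * The bytes are decoded as UTF-8; a multi-byte character cut at the limit would be garbled.
   */
  public static String peek(InputStream in, int limit) {
    if (!in.markSupported()) {
      throw new IllegalArgumentException("stream does not support mark/reset");
    }
    try {
      in.mark(limit); // remember the current position for up to limit bytes
      byte[] head = in.readNBytes(limit);
      in.reset(); // rewind to the marked position
      return new String(head, StandardCharsets.UTF_8);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The design point is that the mark is set before reading and the reset happens immediately after, so the stream is left where it was found regardless of who reads it next.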

tatu-at-datastax

I have some concerns about having to decode the JSON an additional time for every command, but that is probably necessary given the design of the API, so I guess I am ok with that. There are some additional suggestions (e.g. making the constant static), but more generally I'd like someone else (Ivan) to have a look.

ivansenic

I would like to propose a different solution, which has pros and cons, but definitely avoids doing those gymnastics in the filter, which imo can hurt request performance.

Proposal

How about we don't record at the HTTP level, but rather record the performance of the CommandProcessor? There we could easily use MeterRegistry directly and measure the time taken by command execution. We could include whatever tags we want, taken from the exceptions, the context, the tenant info provider, etc.

The implementation would be much easier, and since we would read information directly from the model objects, we would get much better performance.

Other libraries often have such metrics as well, and report on in-flight messages, error rates, etc., so it wouldn't be anything unusual. For example, I know Axon has something similar for their command gateway.

The implementation can be very simple:

  public <T extends Command> Uni<CommandResult> processCommand(
      CommandContext commandContext, T command) {
    Timer timer = meterRegistry.timer(...);

    return commandResolverService
        .resolverForCommand(command)

        // resolver can be null, not handled in CommandResolverService for now
        .flatMap(
            resolver -> {
              // if we have resolver, resolve operation and execute
              Operation operation = resolver.resolveCommand(commandContext, command);
              return operation.execute(queryExecutor);
            })

        .onItemOrFailure((item, ex) -> {
          // capture metrics & handle result or exception
        });
  }
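
To make the "capture metrics" step more concrete, here is a minimal, dependency-free sketch of timing a command and tagging the measurement with the command name and an error flag. It is an illustration under stated assumptions, not the code merged in this PR: the class and method names are hypothetical, a ConcurrentHashMap stands in for MeterRegistry, and it is synchronous rather than reactive.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical sketch: records count and total time per (command, error) pair. */
public final class MeteredCall {

  /** Stand-in for a timer: number of calls and their accumulated duration. */
  private static final class Stats {
    final LongAdder count = new LongAdder();
    final LongAdder totalNanos = new LongAdder();
  }

  // Stand-in for a metrics registry, keyed by the tag values.
  private final Map<String, Stats> stats = new ConcurrentHashMap<>();

  /** Runs the work, then records its duration whether it succeeded or threw. */
  public <T> T timed(String commandName, Supplier<T> work) {
    long start = System.nanoTime();
    boolean error = true; // flipped only when the work returns normally
    try {
      T result = work.get();
      error = false;
      return result;
    } finally {
      Stats s = stats.computeIfAbsent(key(commandName, error), k -> new Stats());
      s.count.increment();
      s.totalNanos.add(System.nanoTime() - start);
    }
  }

  /** Number of recorded calls for the given command name and error flag. */
  public long count(String commandName, boolean error) {
    Stats s = stats.get(key(commandName, error));
    return s == null ? 0 : s.count.sum();
  }

  /** Accumulated duration in nanoseconds for the given command name and error flag. */
  public long totalNanos(String commandName, boolean error) {
    Stats s = stats.get(key(commandName, error));
    return s == null ? 0 : s.totalNanos.sum();
  }

  private static String key(String commandName, boolean error) {
    return commandName + "|error=" + error;
  }
}
```

Because the command name comes from the caller (in the proposal, from the already-deserialized command object), nothing has to be re-read from the request body.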

Pros over current solution

can measure time and not only count
does not manipulate the input stream
can include the tenant if available (the current solution can as well)
can include detailed information on the error, for example the error code or error class
better performance

Cons over current solution

no information on 4xx responses, as they never reach the command gateway
measures only the time spent in the command processor (the current solution does not measure time at all; we still would not get the complete HTTP time, but that's fine imo)

I would like to sync on this, so I am requesting changes until we agree on the approach.

tatu-at-datastax · 2023-05-24T15:38:40Z

@ivansenic I like this idea: access from the model does give more information.

Wrt 4xx responses: perhaps there could be a different mechanism to deal with those? If we only processed the stream content (... if we must) for 4xx, it would be less of a performance concern as well. Or, alternatively, just have non-command-specific 4xx counts: not optimal, but it could at least trigger alerts.

ivansenic · 2023-05-24T15:56:23Z

The 4xx responses would still be captured in the HTTP metrics out of the box, so we would have info about those as well, just without the command name. @maheshrajamani and I agreed to record in the command processor, and to pick up error codes, exception classes, and even whether the error was caused by the user.

tatu-at-datastax · 2023-05-24T17:36:13Z

The 4xx responses would still be captured in the HTTP metrics out of the box, so we would have info about those as well, just without the command name. @maheshrajamani and I agreed to record in the command processor, and to pick up error codes, exception classes, and even whether the error was caused by the user.

Excellent! Sounds like a plan.

ivansenic

I think I provided a lot of good suggestions; please have a look.

src/main/java/io/stargate/sgv2/jsonapi/api/v1/metrics/MetricsConstants.java

src/main/java/io/stargate/sgv2/jsonapi/service/processor/CommandProcessor.java

src/test/java/io/stargate/sgv2/jsonapi/api/v1/CountIntegrationTest.java

ivansenic

Please check which comments make sense and adapt accordingly. Approving so that you don't need to wait.

src/main/java/io/stargate/sgv2/jsonapi/api/v1/CollectionResource.java

src/main/java/io/stargate/sgv2/jsonapi/api/v1/metrics/JsonApiMetricsConfig.java

src/main/java/io/stargate/sgv2/jsonapi/service/processor/MeteredCommandProcessor.java

src/test/java/io/stargate/sgv2/jsonapi/service/processor/MeteredCommandProcessorTest.java

maheshrajamani added 5 commits May 22, 2023 13:18

Add count metrics bt command and flag for error

9bb798e

Merge branch 'add-error-metrics-by-command' into add-metrics-by-command

4fd4ec0

Custom metrics for count by command

9d87299

Change the return flag to true

decd0dc

Fixed the command name

9668f43

maheshrajamani requested a review from a team as a code owner May 23, 2023 18:31

IT fix to check for 401 for token missing

bab977e

tatu-at-datastax reviewed May 23, 2023

ivansenic suggested changes May 24, 2023

maheshrajamani added 3 commits May 24, 2023 18:10

Changes for Metrics to capture from CommandProcessor

21315bb

IT fix

627f7f5

Fixed the FindNamespacesCommand name in the test case

8b996ac

ivansenic suggested changes May 25, 2023

maheshrajamani added 5 commits May 25, 2023 10:22

Changes to adapt tag name from config and added MeteredCommandProcessor

652f2bb

Added unit test case for MeteredCommandProcessor

eb22e76

Added unit test case for MeteredCommandProcessor

ff66f5e

Merge branch 'main' into add-metrics-by-command

bd0cd8e

Changes for merge conflict

c8ac337

maheshrajamani requested a review from ivansenic May 25, 2023 16:37

ivansenic approved these changes May 29, 2023

maheshrajamani added 2 commits May 31, 2023 11:48

Changes as per review comments

3c5ee8d

Updated CONFIGURATION.md

8fa6bd8

maheshrajamani merged commit e579f53 into main May 31, 2023

maheshrajamani deleted the add-metrics-by-command branch May 31, 2023 16:56
