[usePrometheus] Add resolver success and error metrics #2161

RMHonor · 2024-03-01T09:49:10Z

Is your feature request related to a problem? Please describe.

We'd like to capture success and failure Prometheus metrics for specific resolvers to build SLOs against. At the moment, the only data captured for resolvers is the duration.

Describe the solution you'd like

Two new counter metrics, graphql_envelop_resolver_success_total and graphql_envelop_resolver_error_total, which capture the total number of successful and unsuccessful resolver executions. They would follow the same labelling as the existing resolver execution time metric (i.e. resolver="{Type}.{field}").

Describe alternatives you've considered

It would be possible to handle this using the operation metrics, and capture success/failure of the whole operation by operationName, but this has some issues:

- Unbounded cardinality: particularly in the case of a public graph, there could be thousands of different operation names.
- Naming collisions: multiple operations may share the same name, which would skew metrics.
- We may only care about a nested resolver: if multiple individuals contribute to the same document, different individuals may care about different portions of the operation and want to track specific paths.

Additional context

Using the useOnResolve plugin, it should be relatively straightforward to add these counters by checking whether the resolver result is an instance of Error.
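
A rough sketch of the counting logic, to illustrate the idea rather than the final implementation. Assumptions: `LabelledCounter` is a small in-memory stand-in for a real Prometheus counter (for example one from prom-client), and `recordResolverResult` is a hypothetical helper that would be called with the resolved value once a resolver finishes, for instance from a useOnResolve completion callback.

```typescript
// In-memory stand-in for a counter with a single "resolver" label.
// Assumption: a real plugin would use a Prometheus client counter instead.
class LabelledCounter {
  private readonly values = new Map<string, number>();

  constructor(readonly name: string) {}

  inc(resolver: string): void {
    this.values.set(resolver, (this.values.get(resolver) ?? 0) + 1);
  }

  get(resolver: string): number {
    return this.values.get(resolver) ?? 0;
  }
}

const resolverSuccessTotal = new LabelledCounter('graphql_envelop_resolver_success_total');
const resolverErrorTotal = new LabelledCounter('graphql_envelop_resolver_error_total');

// Hypothetical helper: increments the error counter when the resolver result
// is an Error instance, and the success counter otherwise.
function recordResolverResult(typeName: string, fieldName: string, result: unknown): void {
  const label = `${typeName}.${fieldName}`; // same "{Type}.{field}" labelling as the duration metric
  if (result instanceof Error) {
    resolverErrorTotal.inc(label);
  } else {
    resolverSuccessTotal.inc(label);
  }
}

// Example: one successful and one failed execution of Query.user.
recordResolverResult('Query', 'user', { id: '1' });
recordResolverResult('Query', 'user', new Error('not found'));
```

Keeping the label limited to `{Type}.{field}` bounds the number of series by the size of the schema, which avoids the cardinality problem of labelling by operation name.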


EmrysMyrddin · 2024-03-01T13:26:39Z

Hey! Thank you for the proposal. It seems like a good idea; perhaps we should just have options to enable/disable these metrics, so users can avoid collecting too much data. Perhaps even an allow/disallow list of types and fields?

We will probably not have the bandwidth to implement this in the short term, but PRs are welcome, and we are here if anyone has questions about how to do it :-)

RMHonor · 2024-03-01T13:37:58Z

Happy to PR!

There is already an option to enable/disable the existing resolver metric; maybe it could be turned into an object to be more granular? There's also the resolverWhitelist option, which only traces the resolvers specified, so the new counters could just piggyback off that existing option.
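
To illustrate the piggybacking idea, here is a minimal sketch under stated assumptions: `shouldCountResolver` is a hypothetical helper, the whitelist is treated as a plain list of exact `{Type}.{field}` names, and an undefined whitelist is taken to mean "count every resolver". The matching rules of the actual option may differ.

```typescript
// Hypothetical helper: decides whether a resolver should be counted, based on
// an optional whitelist of exact "{Type}.{field}" names.
function shouldCountResolver(
  typeName: string,
  fieldName: string,
  whitelist?: readonly string[],
): boolean {
  // Assumption: no whitelist means every resolver is counted.
  if (whitelist === undefined) {
    return true;
  }
  return whitelist.includes(`${typeName}.${fieldName}`);
}

// Example: only Query.user is whitelisted.
const exampleWhitelist = ['Query.user'];
const countsUser = shouldCountResolver('Query', 'user', exampleWhitelist);
const countsPosts = shouldCountResolver('Query', 'posts', exampleWhitelist);
```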

EmrysMyrddin · 2024-03-07T15:57:29Z

Oh, yeah, keep it simple for now, it was just an idea :-) Let's wait to see if this is something actually needed by some users :-)

You can go with just the existing option for a first version!

EmrysMyrddin added the kind/enhancement New feature or request label Mar 1, 2024
