Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add OpenTelemetry metrics reporting #107

Merged
merged 17 commits into from
Jun 28, 2024
Merged

feat: add OpenTelemetry metrics reporting #107

merged 17 commits into from
Jun 28, 2024

Conversation

evansims
Copy link
Member

@evansims evansims commented Jun 27, 2024

Description

Introduces OpenTelemetry metrics reporting into the SDK for specific actions. We're intentionally starting with only a few metric events that we can then grow over time.

We don't have any formal documentation for this currently. We'll write this as we roll it out across all SDKs, so here is a summary of the events and the associated attributes:

Metric Name Type Description
fga-client.request.duration Histogram The total request time for FGA requests
fga-client.query.duration Histogram The amount of time the FGA server took to process the request
fga-client.credentials.request Counter The total number of times a new token was requested when using ClientCredentials
Attribute Name Type Description
fga-client.response.model_id string The authorization model ID that the FGA server used
fga-client.request.method string The FGA method/action that was performed
fga-client.request.store_id string The store ID that was sent as part of the request
fga-client.request.model_id string The authorization model ID that was sent as part of the request, if any
fga-client.request.client_id string The client ID associated with the request, if any
fga-client.user string The user that is associated with the action of the request for check and list users
http.status_code int The status code of the response
http.method string The HTTP method for the request
http.host string Host identifier of the origin the request was sent to

We're implementing this using opentelemetry-api, which means that all actions will be a no-op unless an opentelemetry-sdk (or equivalent) instance is configured within an application. This is currently in draft as we work through the implementation across SDKs, but we're looking for any feedback folks may have!

An example implementation is provided and can be found under opentelemetry in the example directory of this repository.

References

Closes #93

Review Checklist

  • I have clicked on "allow edits by maintainers".
  • I have added documentation for new/changed functionality in this PR or in a PR to openfga.dev [Provide a link to any relevant PRs in the references section above]
  • The correct base branch is being used, if not main
  • I have added tests to validate that the change in functionality is working as expected

@evansims evansims added the enhancement New feature or request label Jun 27, 2024
@codecov-commenter
Copy link

codecov-commenter commented Jun 27, 2024

Codecov Report

Attention: Patch coverage is 86.41975% with 33 lines in your changes missing coverage. Please review.

Project coverage is 68.74%. Comparing base (9056bb8) to head (67ce370).

Files Patch % Lines
openfga_sdk/api/open_fga_api.py 82.50% 7 Missing ⚠️
openfga_sdk/sync/open_fga_api.py 82.50% 7 Missing ⚠️
openfga_sdk/telemetry/metrics.py 82.50% 7 Missing ⚠️
openfga_sdk/help.py 0.00% 6 Missing ⚠️
openfga_sdk/telemetry/attributes.py 89.47% 4 Missing ⚠️
openfga_sdk/api_client.py 95.65% 1 Missing ⚠️
openfga_sdk/sync/api_client.py 95.45% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #107      +/-   ##
==========================================
+ Coverage   68.29%   68.74%   +0.45%     
==========================================
  Files         113      119       +6     
  Lines        8894     9129     +235     
==========================================
+ Hits         6074     6276     +202     
- Misses       2820     2853      +33     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@evansims evansims marked this pull request as ready for review June 28, 2024 04:35
@evansims evansims requested review from a team as code owners June 28, 2024 04:35
@evansims evansims changed the title feat: add opentelemetry metrics reporting feat: add OpenTelemetry metrics reporting Jun 28, 2024
@evansims
Copy link
Member Author

One important thing to note, I've opted not to use the opentelemetry-semantic-conventions package here for a number of reasons.

  1. I noticed its definitions were incomplete and missing a number of basic attribute names we were expecting to be present.
  2. Several names had differing values from other languages' versions of the package.

It didn't make sense to implement the package in such an unfinished state here.

ewanharris
ewanharris previously approved these changes Jun 28, 2024
Copy link
Member

@ewanharris ewanharris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me!

I really like the abstraction you built around the telemetry, definitely will look to do that in JS as we roll out the metrics/span work across more of the API

@evansims evansims added this pull request to the merge queue Jun 28, 2024
Merged via the queue into main with commit 99ab813 Jun 28, 2024
17 checks passed
@evansims evansims deleted the feat/opentelemetry branch June 28, 2024 16:17
github-merge-queue bot pushed a commit to openfga/dotnet-sdk that referenced this pull request Sep 10, 2024
<!-- Thanks for opening a PR! Here are some quick tips:
If this is your first time contributing, [read our Contributing
Guidelines](https://github.com/openfga/.github/blob/main/CONTRIBUTING.md)
to learn how to create an acceptable PR for this repo.
By submitting a PR to this repository, you agree to the terms within the
[OpenFGA Code of
Conduct](https://github.com/openfga/.github/blob/main/CODE_OF_CONDUCT.md)

If your PR is under active development, please submit it as a "draft".
Once it's ready, open it up for review.
-->

<!-- Provide a brief summary of the changes -->

## Description

Introduces [OpenTelemetry](https://opentelemetry.io/) metrics reporting
into the SDK for specific actions. We're intentionally starting with
only a few metric events that we can then grow over time.

We don't have any formal documentation for this currently. We'll write
this as we roll it out across all SDKs, so here is a summary of the
events and the associated attributes:

## Metrics

### Supported Metrics

| Metric Name | Type | Enabled by Default | Description |

|---------------------------------|-----------|--------------------|--------------------------------------------------------------------------------------|
| `fga-client.request.duration` | Histogram | Yes | The total request
time for FGA requests |
| `fga-client.query.duration` | Histogram | Yes | The amount of time the
FGA server took to internally process nd evaluate the request |
|` fga-client.credentials.request`| Counter | Yes | The total number of
times a new token was requested when using ClientCredentials |
| `fga-client.request.count` | Counter | No | The total number of
requests made to the FGA server |

### Supported attributes

| Attribute Name | Type | Enabled by Default | Description |

|--------------------------------|----------|--------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `fga-client.response.model_id` | `string` | Yes | The authorization
model ID that the FGA server used |
| `fga-client.request.method` | `string` | Yes | The FGA method/action
that was performed (e.g. `Check`, `ListObjects`, ...) in TitleCase |
| `fga-client.request.store_id` | `string` | Yes | The store ID that was
sent as part of the request |
| `fga-client.request.model_id` | `string` | Yes | The authorization
model ID that was sent as part of the request, if any |
| `fga-client.request.client_id` | `string` | Yes | The client ID
associated with the request, if any |
| `fga-client.user` | `string` | No | The user that is associated with
the action of the request for check and list objects |
| `http.request.resend_count` | `int` | Yes | The number of retries
attempted (Only sent if the request was retried. Count of `1` means the
request was retried once in addition to the original request) |
| `http.response.status_code` | `int` | Yes | The status code of the
response |
| `http.request.method` | `string` | No | The HTTP method for the
request |
| `http.host` | `string` | Yes | Host identifier of the origin the
request was sent to |
| `url.scheme` | `string` | No | HTTP Scheme of the request
(`http`/`https`) |
| `url.full` | `string` | No | Full URL of the request |
| `user_agent.original` | `string` | Yes | User Agent used in the query
|

## Testing

Spin up Prometheus, build the project and then run the
OpenTelemetryExample app.

You should see metrics in Prometheus such as:
```
fga_client_query_duration_bucket{exported_instance="28790af8-2116-4e47-baaf-319f9f23f518", exported_job=""openfga-otel-dotnet-example"", fga_client_request_client_id="3DSvMhkJwjoSBMTliPEdPjybbVewfVes", fga_client_request_method="Check", fga_client_request_model_id="01J41WCA3KBA9SP2M350FPDEJ0", fga_client_request_store_id="01J40DW2J9NNYNNVEC430S3Z97", fga_client_response_model_id="01J41WCA3KBA9SP2M350FPDEJ0", fga_client_user="user:anne", http_client_request_duration="40", http_host="api.us1.fga.dev", http_request_method="POST", http_response_status_code="200", http_server_request_duration="2", instance="otel-collector:8889", job="otel-collector", label1="value1", le="5", url_full="https://api.us1.fga.dev/stores/01J40DW2J9NNYNNVEC430S3Z97/check", url_scheme="https", user_agent_original="openfga-sdk dotnet/0.4.0"}
---
fga_client_query_duration_bucket{exported_instance="fbee572d-0bff-4957-95a2-c532fe7312c1", exported_job=""openfga-otel-dotnet-example"", fga_client_request_client_id="3DSvMhkJwjoSBMTliPEdPjybbVewfVes", fga_client_request_method="WriteAuthorizationModel", fga_client_request_store_id="01J40DW2J9NNYNNVEC430S3Z97", http_client_request_duration="139", http_host="api.us1.fga.dev", http_request_method="POST", http_request_resend_count="1", http_response_status_code="201", http_server_request_duration="69", instance="otel-collector:8889", job="otel-collector", label1="value1", le="75", url_full="https://api.us1.fga.dev/stores/01J40DW2J9NNYNNVEC430S3Z97/authorization-models", url_scheme="https", user_agent_original="openfga-sdk dotnet/0.4.0"}
---
fga_client_query_duration_bucket{exported_instance="fbee572d-0bff-4957-95a2-c532fe7312c1", exported_job=""openfga-otel-dotnet-example"", fga_client_request_client_id="3DSvMhkJwjoSBMTliPEdPjybbVewfVes", fga_client_request_method="Write", fga_client_request_model_id="01J41WCA3KBA9SP2M350FPDEJ0", fga_client_request_store_id="01J40DW2J9NNYNNVEC430S3Z97", fga_client_response_model_id="01J41WCA3KBA9SP2M350FPDEJ0", http_client_request_duration="68", http_host="api.us1.fga.dev", http_request_method="POST", http_request_resend_count="1", http_response_status_code="200", http_server_request_duration="30", instance="otel-collector:8889", job="otel-collector", label1="value1", le="5000", url_full="https://api.us1.fga.dev/stores/01J40DW2J9NNYNNVEC430S3Z97/write", url_scheme="https", user_agent_original="openfga-sdk dotnet/0.4.0"}
```

Also for client credential exchange, you should see metrics such as:
```
{__name__="fga_client_credentials_request", exported_instance="28790af8-2116-4e47-baaf-319f9f23f518", exported_job=""openfga-otel-dotnet-example"", fga_client_request_client_id="3DSvMhkJwjoSBMTliPEdPjybbVewfVes", fga_client_request_method="ClientCredentialsExchange", http_client_request_duration="1461", http_host="fga.us.auth0.com", http_request_method="POST", http_response_status_code="200", instance="otel-collector:8889", job="otel-collector", label1="value1", url_full="https://fga.us.auth0.com/oauth/token", url_scheme="https", user_agent_original="openfga-sdk dotnet/0.4.0"}
```

## References
<!-- Provide a list of any applicable references here (GitHub Issue,
[OpenFGA RFC](https://github.com/openfga/rfcs), other PRs, etc..) -->

Closes #68

Follows the implementations in JS
(openfga/js-sdk#117), Python
(openfga/python-sdk#107) and Java
(openfga/java-sdk#94)

## Review Checklist
- [x] I have clicked on ["allow edits by
maintainers"](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork).
- [ ] I have added documentation for new/changed functionality in this
PR or in a PR to [openfga.dev](https://github.com/openfga/openfga.dev)
[Provide a link to any relevant PRs in the references section above]
- [x] The correct base branch is being used, if not `main`
- [ ] I have added tests to validate that the change in functionality is
working as expected
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Export metrics
3 participants