feat!: Use OTel to export metrics (metric name changes) (#419)
### Related Issues

In #186, there was a discussion about whether to go with OpenTelemetry or a direct Prometheus implementation. The agreement was to expose a scraping endpoint, i.e., `pull mode`, and not to support `push` mode. This PR replaces the existing direct Prometheus implementation with the vendor-agnostic OpenTelemetry one while maintaining the same feature set.

<details>
<summary>Example of exposed metrics:</summary>

```
# HELP go_gc_duration_seconds A summary of the pause duration of garbage collection cycles.
# TYPE go_gc_duration_seconds summary
go_gc_duration_seconds{quantile="0"} 5.1e-05
go_gc_duration_seconds{quantile="0.25"} 0.000174874
go_gc_duration_seconds{quantile="0.5"} 0.000559251
go_gc_duration_seconds{quantile="0.75"} 0.000875708
go_gc_duration_seconds{quantile="1"} 0.001527791
go_gc_duration_seconds_sum 0.010100666
go_gc_duration_seconds_count 18
# HELP go_goroutines Number of goroutines that currently exist.
# TYPE go_goroutines gauge
go_goroutines 21
# HELP go_info Information about the Go environment.
# TYPE go_info gauge
go_info{version="go1.19.3"} 1
# HELP go_memstats_alloc_bytes Number of bytes allocated and still in use.
# TYPE go_memstats_alloc_bytes gauge
go_memstats_alloc_bytes 2.6836392e+07
# HELP go_memstats_alloc_bytes_total Total number of bytes allocated, even if freed.
# TYPE go_memstats_alloc_bytes_total counter
go_memstats_alloc_bytes_total 2.76925672e+08
# HELP go_memstats_buck_hash_sys_bytes Number of bytes used by the profiling bucket hash table.
# TYPE go_memstats_buck_hash_sys_bytes gauge
go_memstats_buck_hash_sys_bytes 6351
# HELP go_memstats_frees_total Total number of frees.
# TYPE go_memstats_frees_total counter
go_memstats_frees_total 2.563256e+06
# HELP go_memstats_gc_sys_bytes Number of bytes used for garbage collection system metadata.
# TYPE go_memstats_gc_sys_bytes gauge
go_memstats_gc_sys_bytes 7.975824e+06
# HELP go_memstats_heap_alloc_bytes Number of heap bytes allocated and still in use.
# TYPE go_memstats_heap_alloc_bytes gauge
go_memstats_heap_alloc_bytes 2.6836392e+07
# HELP go_memstats_heap_idle_bytes Number of heap bytes waiting to be used.
# TYPE go_memstats_heap_idle_bytes gauge
go_memstats_heap_idle_bytes 4.0656896e+07
# HELP go_memstats_heap_inuse_bytes Number of heap bytes that are in use.
# TYPE go_memstats_heap_inuse_bytes gauge
go_memstats_heap_inuse_bytes 3.2415744e+07
# HELP go_memstats_heap_objects Number of allocated objects.
# TYPE go_memstats_heap_objects gauge
go_memstats_heap_objects 77579
# HELP go_memstats_heap_released_bytes Number of heap bytes released to OS.
# TYPE go_memstats_heap_released_bytes gauge
go_memstats_heap_released_bytes 3.0916608e+07
# HELP go_memstats_heap_sys_bytes Number of heap bytes obtained from system.
# TYPE go_memstats_heap_sys_bytes gauge
go_memstats_heap_sys_bytes 7.307264e+07
# HELP go_memstats_last_gc_time_seconds Number of seconds since 1970 of last garbage collection.
# TYPE go_memstats_last_gc_time_seconds gauge
go_memstats_last_gc_time_seconds 1.676843405619668e+09
# HELP go_memstats_lookups_total Total number of pointer lookups.
# TYPE go_memstats_lookups_total counter
go_memstats_lookups_total 0
# HELP go_memstats_mallocs_total Total number of mallocs.
# TYPE go_memstats_mallocs_total counter
go_memstats_mallocs_total 2.640835e+06
# HELP go_memstats_mcache_inuse_bytes Number of bytes in use by mcache structures.
# TYPE go_memstats_mcache_inuse_bytes gauge
go_memstats_mcache_inuse_bytes 12000
# HELP go_memstats_mcache_sys_bytes Number of bytes used for mcache structures obtained from system.
# TYPE go_memstats_mcache_sys_bytes gauge
go_memstats_mcache_sys_bytes 15600
# HELP go_memstats_mspan_inuse_bytes Number of bytes in use by mspan structures.
# TYPE go_memstats_mspan_inuse_bytes gauge
go_memstats_mspan_inuse_bytes 300832
# HELP go_memstats_mspan_sys_bytes Number of bytes used for mspan structures obtained from system.
# TYPE go_memstats_mspan_sys_bytes gauge
go_memstats_mspan_sys_bytes 701760
# HELP go_memstats_next_gc_bytes Number of heap bytes when next garbage collection will take place.
# TYPE go_memstats_next_gc_bytes gauge
go_memstats_next_gc_bytes 3.620952e+07
# HELP go_memstats_other_sys_bytes Number of bytes used for other system allocations.
# TYPE go_memstats_other_sys_bytes gauge
go_memstats_other_sys_bytes 1.972625e+06
# HELP go_memstats_stack_inuse_bytes Number of bytes in use by the stack allocator.
# TYPE go_memstats_stack_inuse_bytes gauge
go_memstats_stack_inuse_bytes 2.424832e+06
# HELP go_memstats_stack_sys_bytes Number of bytes obtained from system for stack allocator.
# TYPE go_memstats_stack_sys_bytes gauge
go_memstats_stack_sys_bytes 2.424832e+06
# HELP go_memstats_sys_bytes Number of bytes obtained from system.
# TYPE go_memstats_sys_bytes gauge
go_memstats_sys_bytes 8.6169632e+07
# HELP go_threads Number of OS threads created.
# TYPE go_threads gauge
go_threads 18
# HELP otel_scope_info Instrumentation Scope metadata
# TYPE otel_scope_info gauge
otel_scope_info{otel_scope_name="openfeature/flagd",otel_scope_version=""} 1
# HELP promhttp_metric_handler_requests_in_flight Current number of scrapes being served.
# TYPE promhttp_metric_handler_requests_in_flight gauge
promhttp_metric_handler_requests_in_flight 1
# HELP promhttp_metric_handler_requests_total Total number of scrapes by HTTP status code.
# TYPE promhttp_metric_handler_requests_total counter
promhttp_metric_handler_requests_total{code="200"} 78
promhttp_metric_handler_requests_total{code="500"} 0
promhttp_metric_handler_requests_total{code="503"} 0
# HELP request_duration_seconds The latency of the HTTP requests
# TYPE request_duration_seconds histogram
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="0"} 0
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="5"} 22661
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="10"} 22661
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="25"} 22661
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="50"} 22661
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="75"} 22661
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="100"} 22661
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="250"} 22661
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="500"} 22661
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="750"} 22661
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="1000"} 22661
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="2500"} 22661
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="5000"} 22661
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="7500"} 22661
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="10000"} 22661
request_duration_seconds_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="+Inf"} 22661
request_duration_seconds_sum{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name=""} 33.89927686600012
request_duration_seconds_count{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name=""} 22661
# HELP requests_inflight The number of inflight requests being handled at the same time
# TYPE requests_inflight gauge
requests_inflight{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name=""} 3
# HELP response_size_bytes_bytes The size of the HTTP responses
# TYPE response_size_bytes_bytes histogram
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="0"} 0
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="5"} 0
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="10"} 0
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="25"} 0
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="50"} 1
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="75"} 22661
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="100"} 22661
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="250"} 22661
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="500"} 22661
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="750"} 22661
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="1000"} 22661
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="2500"} 22661
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="5000"} 22661
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="7500"} 22661
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="10000"} 22661
response_size_bytes_bytes_bucket{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name="",le="+Inf"} 22661
response_size_bytes_bytes_sum{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name=""} 1.654229e+06
response_size_bytes_bytes_count{http_method="POST",http_status_code="200",http_url="/schema.v1.Service/ResolveBoolean",otel_scope_name="openfeature/flagd",otel_scope_version="",service_name=""} 22661
# HELP target_info Target metadata
# TYPE target_info gauge
target_info{service_name="unknown_service:___go_build_github_com_open_feature_flagd",telemetry_sdk_language="go",telemetry_sdk_name="opentelemetry",telemetry_sdk_version="1.13.0"} 1
```

</details>

### Notes

Using OpenTelemetry has several benefits. The most prominent is that, if we introduce span support in flagd, we get out-of-the-box support for [exemplars](https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/data-model.md#exemplars) to pinpoint slow requests.

### Follow-up Tasks

- Introduce a flag to specify the OpenTelemetry collector URL ❓

---------

Signed-off-by: Giovanni Liva <giovanni.liva@dynatrace.com>
Signed-off-by: Todd Baert <toddbaert@gmail.com>
Co-authored-by: Todd Baert <toddbaert@gmail.com>
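
As a reading aid for the example output in the description above, the sketch below shows one way to derive a histogram's mean from its `_sum` and `_count` samples in the scraped text. It is not part of this change: the helper names `sampleValue` and `histogramMean` are hypothetical, the label sets are shortened for brevity, only the standard library is used, and it assumes one series per metric with no trailing timestamps.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// sample holds the _sum and _count samples copied from the example output,
// with the label sets shortened (an illustrative simplification).
const sample = `
request_duration_seconds_sum{http_method="POST"} 33.89927686600012
request_duration_seconds_count{http_method="POST"} 22661
response_size_bytes_bytes_sum{http_method="POST"} 1.654229e+06
response_size_bytes_bytes_count{http_method="POST"} 22661
`

// sampleValue returns the value of the first sample whose metric name is
// exactly name. Comment lines (# HELP, # TYPE) and blank lines are skipped.
func sampleValue(exposition, name string) (float64, bool) {
	sc := bufio.NewScanner(strings.NewReader(exposition))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// The metric name ends at the label block or at the first space.
		end := strings.IndexAny(line, "{ ")
		if end < 0 || line[:end] != name {
			continue
		}
		// Assumption: no timestamp follows the value, so the value is the
		// last whitespace-separated field.
		fields := strings.Fields(line)
		v, err := strconv.ParseFloat(fields[len(fields)-1], 64)
		if err != nil {
			return 0, false
		}
		return v, true
	}
	return 0, false
}

// histogramMean divides <base>_sum by <base>_count. It reports false when
// either sample is missing or the count is zero.
func histogramMean(exposition, base string) (float64, bool) {
	sum, okSum := sampleValue(exposition, base+"_sum")
	count, okCount := sampleValue(exposition, base+"_count")
	if !okSum || !okCount || count == 0 {
		return 0, false
	}
	return sum / count, true
}

func main() {
	for _, base := range []string{"request_duration_seconds", "response_size_bytes_bytes"} {
		if mean, ok := histogramMean(sample, base); ok {
			fmt.Printf("%s mean: %g\n", base, mean)
		}
	}
}
```

For the response size histogram, the resulting mean is roughly 73 bytes, which agrees with the bucket counts: all but one observation fall between the `le="50"` and `le="75"` boundaries.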