Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prometheus metric exporter #378

Merged

Conversation

hectorhdzg
Copy link
Member

@hectorhdzg hectorhdzg commented Jan 24, 2020

This implementation is based on OpenCensus Prometheus exporter

I added a lot of code from #341 to unblock testing and lint for this, once Metrics code is updated this PR only need to merge with latest changes. Please review Prometheus related changes and add comments in PR 341 PR for Metric related stuff.

There are still TODOs regarding supporting histograms and adding tests for Gauges and Measures, this is currently blocked by missing Aggregators in opentelemetry SDK

Fixes #157

@hectorhdzg hectorhdzg requested review from a team, c24t and lzchen January 24, 2020 22:07
@codecov-io
Copy link

codecov-io commented Jan 25, 2020

Codecov Report

Merging #378 into master will increase coverage by 0.14%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #378      +/-   ##
==========================================
+ Coverage   88.11%   88.25%   +0.14%     
==========================================
  Files          41       41              
  Lines        2078     2078              
  Branches      238      238              
==========================================
+ Hits         1831     1834       +3     
+ Misses        173      172       -1     
+ Partials       74       72       -2
Impacted Files Coverage Δ
...emetry-sdk/src/opentelemetry/sdk/trace/__init__.py 91.2% <100%> (+0.36%) ⬆️
...elemetry-api/src/opentelemetry/context/__init__.py 67.39% <0%> (+4.34%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 35b19b3...bfd815b. Read the comment docs.

Copy link
Contributor

@ocelotl ocelotl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few comments/questions.

Copy link
Member

@mauriciovasquezbernal mauriciovasquezbernal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the PR looks good, I have a question about the controller type that is used.

This PR uses a PushController, it periodically collects the metrics and exports them. The exporter saves those metrics until it is pulled from a client (through the http endpoint that is exposed by start_http_server). I'm wondering if that fact that the same metric could be shown multiple times when the endpoint is called (the controller made multiple iterations), or could not be shown at all (the controller didn't make any iteration) is a problem.

I feel this kind of exporter fits much better with a PullController, would it make sense to implement it before this PR?

@hectorhdzg
Copy link
Member Author

@mauriciovasquezbernal @ocelotl I tried to address all comments please let me know if you have more feedback, we can update examples to use new PullController when available

Copy link
Member

@mauriciovasquezbernal mauriciovasquezbernal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some more comments, more of them are just nits. Only one really important about the structure used to pass store metrics records.


def test_gauge_to_prometheus(self):
# TODO: Add unit test once GaugeAggregator is available
self.assertEqual(True, True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even better, remove the test function and just leave the TODO comment. Pytest still collects, runs and reports this test as it was meaningful when it is not.


def test_measure_to_prometheus(self):
# TODO: Add unit test once Measure Aggregators are available
self.assertEqual(True, True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

Copy link
Member

@mauriciovasquezbernal mauriciovasquezbernal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried the example and it didn't work for me. Besides that LGTM!.

@hectorhdzg
Copy link
Member Author

@mauriciovasquezbernal are you getting any error when running the example?, did you run pip install -e .\opentelemetry-ext-prometheus?

@mauriciovasquezbernal
Copy link
Member

@hectorhdzg I'm sorry for not being clear. It was not working for me because the check_point vs checkpoint, now that you have renamed it, it should be fine.

Copy link
Member

@mauriciovasquezbernal mauriciovasquezbernal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm approving it with the PushController because we don't have defined an architecture for the PullController yet. Probably we want to iterate over this implementation once we have the PullController implemented.

def collect(self):
"""Collect fetches the metrics from OpenTelemetry
and delivers them as Prometheus Metrics.
Collect is invoked every time a prometheus.Gatherer is run
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a link to some API that shows how this works?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The custom collector will use same logic as regular Prometheus Collector, pulling mechanism is configured by time intervals but is also triggered by other input like the user refreshing Prometheus visualization UI

https://github.com/prometheus/client_python#custom-collectors
https://github.com/prometheus/client_python/blob/master/prometheus_client/registry.py

for label_key in metric_record.metric.label_keys:
label_keys.append(self._sanitize(label_key))
metric_name = ""
if self._prefix != "":
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can do if self._prefix: I believe

@lzchen
Copy link
Contributor

lzchen commented Feb 14, 2020

Looks good! Just some questions about how the label keys and label values behavior is handled.

Copy link
Contributor

@lzchen lzchen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@hectorhdzg
Copy link
Member Author

@ocelotl let me know if you have more feedback for this one

Copy link
Contributor

@ocelotl ocelotl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM :+1+

@hectorhdzg hectorhdzg added the PR:please merge This PR is ready to be merged by a Maintainer (has enough valid approvals, successful build, etc.) label Feb 19, 2020
Copy link
Member

@toumorokoshi toumorokoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry to come in so late into a PR, especially one that's ready to merge. But I think the design on how prometheus metrics are exposed requires at the minimum a conversation.

I think we should really split out the starting of the prometheus_client's exposure of metrics from the linking to a Meter object. The current pattern of always choosing to use prometheus_client.start_http_server makes some really valuable use cases very hard to accomplish:

  1. I already have a prometheus http server / wsgi server started, and I want to add opentelemetry metrics to it.
  2. I cannot choose to mount my prometheus route into my wsgi / asgi app
  3. cannot support any future run modes of prometheus_client.

Why not just require that someone reads through prometheus_client, sets up their prometheus endpoint, then attaches via PushController a PrometheusExporter to Meter? It doesn't add that many lines of code, and separates the route startup from the adding of metrics to a registry.

Aside from that, one other question:

Performance of Prometheus in multi-process environments?

I'm curious for those who are familiar with OpenCensus: what was the popularity of the prometheus integration? For me, I'm not sure if it'll handle high throughput applications with it's use of a file to perform IPC around metrics:

There is also a general weakness in the prometheus client where it uses a file to perform IPC, and some features (such as registries) do not behave like normal:

https://github.com/prometheus/client_python#multiprocess-mode-gunicorn

https://github.com/prometheus/client_python/blob/master/prometheus_client/multiprocess.py

@toumorokoshi toumorokoshi removed the PR:please merge This PR is ready to be merged by a Maintainer (has enough valid approvals, successful build, etc.) label Feb 20, 2020
@hectorhdzg
Copy link
Member Author

I removed start_http server call from exporter, users can use any kind of Prometheus client and we just attach our Collector to the Prometheus registry, added extra call in README and sample so people can see their data when testing the exporter. @toumorokoshi, @codeboten , @lzchen and @ocelotl please let me know if this ok for everyone.

Copy link
Member

@toumorokoshi toumorokoshi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! looks good. I'll leave it open for a little bit longer in case anyone wants to comment on the new pattern, but otherwise we'll merge tomorrow.

@toumorokoshi toumorokoshi added the PR:please merge This PR is ready to be merged by a Maintainer (has enough valid approvals, successful build, etc.) label Feb 21, 2020
@toumorokoshi toumorokoshi merged commit 26e0576 into open-telemetry:master Feb 21, 2020
@c24t c24t removed the PR:please merge This PR is ready to be merged by a Maintainer (has enough valid approvals, successful build, etc.) label Feb 21, 2020
@hectorhdzg hectorhdzg deleted the hectorhdzg/prometheus_exporter branch March 24, 2020 22:35
srikanthccv pushed a commit to srikanthccv/opentelemetry-python that referenced this pull request Nov 1, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add Prometheus exporter
7 participants