This repository has been archived by the owner on Nov 1, 2023. It is now read-only.

Prometheus interaction #7

Merged
merged 8 commits into from
Aug 3, 2023

Conversation

vyzigold
Contributor

This PR adds initial functionality to interact with prometheus.

There are 2 parts to the client. The first adds a library for querying Prometheus from Python, which can be used by a future aodh Prometheus evaluator. The second part uses this library to interact with Prometheus from the CLI. Examples of both can be seen in the README. Kudos to @paramite for prometheus_client.py.
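To illustrate the library half, here is a minimal sketch of an instant query against the Prometheus HTTP API (`/api/v1/query`). It is not the code from prometheus_client.py; `build_query_url` and `query` are hypothetical names, and plain HTTP without authentication is an assumption.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_query_url(host, port, promql):
    """Build an instant-query URL for the Prometheus HTTP API."""
    return f"http://{host}:{port}/api/v1/query?" + urlencode({"query": promql})


def query(host, port, promql, timeout=5):
    """Run an instant query and return the "result" list of the response."""
    with urlopen(build_query_url(host, port, promql), timeout=timeout) as resp:
        body = json.load(resp)
    # Prometheus reports failures in the JSON body as status == "error".
    if body.get("status") != "success":
        raise RuntimeError(body.get("error", "query failed"))
    return body["data"]["result"]
```

A CLI command would then be a thin wrapper that calls `query()` and formats the returned list.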

There are a few questions regarding the future development:

  • Which cli commands (if any) do we want to support?
    So far I've added the list, show and query commands. I think it might be useful to replicate at least some of the basic openstack metric commands. Only the query command uses the PrometheusAPIClient right now. If we want to keep list and show, they'll need to be slightly modified. (list and show use the prometheus-api-client library, which we decided not to use because of packaging difficulties.)

  • Where do we want to enforce the RBAC?
    There is a PrometheusRBAC class prepared in prometheus_client.py, but I think it might be better to decouple OpenStack-specific logic from prometheus_client.py. We could keep that code in QueryManager.query, where it is right now (although it would need to be a bit more robust).

  • Where do we get the host and port for prometheus?
    It seems to me that the other OpenStack plugins just ask Keystone for this information, but Keystone doesn't know anything about Prometheus. It could be specified as a parameter by whoever creates the client from the Python library. In the case of the CLI, we could add --host and --port args. Right now it's hardcoded to 127.0.0.1:9090.

@vyzigold vyzigold requested a review from paramite July 26, 2023 09:16
observabilityclient/v1/client.py (outdated, resolved)
setup.cfg (outdated, resolved)
observabilityclient/utils/metric_utils.py (outdated, resolved)
@paramite
Member

This is definitely a great start. We can discuss the RBAC, but keeping the API client free of OpenStack-specific logic sounds fine to me.

@paramite paramite requested review from jlarriba and yadneshk July 26, 2023 16:56
README.md (outdated, resolved)
README.md (resolved)
README.md (resolved)
observabilityclient/plugin.py (resolved)
observabilityclient/v1/client.py (outdated, resolved)
@vyzigold
Contributor Author

As discussed in our meeting, I will:

  • Rename the command to "openstack metric" to provide continuity with the gnocchi client.
  • Add CLI commands. To start with, I'll add create, delete, list and show, and I'll keep the query command.
  • Implement the RBAC injection (I have a new idea for how to do it). I'll keep it outside of PrometheusAPIClient to keep PrometheusAPIClient free of OpenStack specifics.
  • Get the Prometheus URL configuration from /etc/openstack or from env variables.
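The last point can be sketched as a small resolver that layers the sources. This is only an illustration: the variable names `PROMETHEUS_HOST` and `PROMETHEUS_PORT`, the function name, and the precedence (environment over file over defaults) are all assumptions, not something this PR confirms.

```python
import os

# Fallback matching the address that was hardcoded earlier in this PR.
DEFAULTS = {"host": "127.0.0.1", "port": "9090"}


def resolve_prometheus_config(file_values=None, environ=None):
    """Merge defaults, config-file values and environment variables.

    Assumed precedence: environment > config file > defaults.
    """
    environ = os.environ if environ is None else environ
    config = dict(DEFAULTS)
    config.update(file_values or {})
    for key in DEFAULTS:
        # Hypothetical variable names: PROMETHEUS_HOST, PROMETHEUS_PORT.
        env_value = environ.get("PROMETHEUS_" + key.upper())
        if env_value:
            config[key] = env_value
    return config
```

Passing `environ` explicitly keeps the function easy to test without touching the real process environment.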

@vyzigold vyzigold marked this pull request as draft July 27, 2023 08:59
vyzigold added 2 commits July 31, 2023 09:21
This commit adds:
    - commands:
        delete
        clear-tombstones
        snapshot
    - Better rbac injection as well as a possibility
      to disable rbac.
    - Configuration of prometheus_client through
      env variables and /etc/openstack/prometheus.yaml

It also does some further cleanup. It makes the list and
show commands use our prometheus client in prometheus_client.py
@vyzigold
Contributor Author

I incorporated the changes we talked about. The changes include mainly:

- commands:
    delete
    clear-tombstones
    snapshot
- Better rbac injection as well as a possibility
  to disable rbac.
- Configuration of prometheus_client through
  env variables and /etc/openstack/prometheus.yaml
- Renaming of the command to openstack metric *

It also does some further cleanup. It makes the list and
show commands use our prometheus client in prometheus_client.py
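For orientation, the three new commands line up with Prometheus's TSDB admin API, which the server only exposes when started with `--web.enable-admin-api`. The sketch below only builds the URLs; the command-to-endpoint mapping is an assumption about this PR, and `ADMIN_ENDPOINTS` and `admin_url` are hypothetical names. The requests themselves are sent with POST.

```python
from urllib.parse import urlencode

# Assumed mapping from CLI command to Prometheus TSDB admin endpoint.
ADMIN_ENDPOINTS = {
    "delete": "/api/v1/admin/tsdb/delete_series",
    "clear-tombstones": "/api/v1/admin/tsdb/clean_tombstones",
    "snapshot": "/api/v1/admin/tsdb/snapshot",
}


def admin_url(base_url, command, matches=()):
    """Return the URL to POST to for an admin command.

    `matches` holds series selectors and is only meaningful for "delete",
    where each one is sent as a repeated match[] parameter.
    """
    path = ADMIN_ENDPOINTS[command]
    params = urlencode([("match[]", m) for m in matches])
    return base_url.rstrip("/") + path + ("?" + params if params else "")
```

Deleted series are only removed from disk after tombstones are cleaned, which is why delete and clear-tombstones are separate commands.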

For the RBAC, I'm adding a label {project_id='some_id'} after each metric name in each query. PromQL is quite specific about where labels can be placed, so before injecting the RBAC label into the user's query, I query Prometheus for all of the metric names, so that I can find them inside the query. If we don't want to do this additional query, we would probably need to either restrict what kind of user queries we support or implement a proper PromQL parser (one can be seen here).
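A rough sketch of the injection described above, assuming the list of metric names has already been fetched from Prometheus. This is not the code in this PR; `inject_project_label` is a hypothetical name. It deliberately shows the limits of the regex approach: a metric name that appears inside a quoted label value would also be rewritten, which is one reason a real PromQL parser is the more robust option.

```python
import re


def inject_project_label(query, metric_names, project_id):
    """Add a project_id matcher to every known metric name in a query."""
    if not metric_names:
        return query
    label = f"project_id='{project_id}'"
    # Longest names first, so a name is never shadowed by one of its prefixes.
    names = "|".join(
        re.escape(n) for n in sorted(metric_names, key=len, reverse=True)
    )
    # A whole-word metric name, optionally followed by "{" (and maybe "}").
    pattern = re.compile(rf"(?<![\w:])({names})(?![\w:])(\s*\{{\s*(\}})?)?")

    def add_label(match):
        name, selector, empty = match.group(1), match.group(2), match.group(3)
        if selector is None or empty:
            # No selector, or an empty "{}": emit a complete selector.
            return f"{name}{{{label}}}"
        # Existing selector: put our matcher in front of the user's matchers.
        return f"{name}{{{label}, "

    return pattern.sub(add_label, query)
```

For example, `rate(ceilometer_cpu[5m])` with project `abc` becomes `rate(ceilometer_cpu{project_id='abc'}[5m])`.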

During development, I ran into a few things on which I'd like your opinion.

  1. Can I use the pyyaml library used in metric_utils.py?
  2. In metric_utils.py, I hardcoded the path to a file, which should contain information about how to connect to prometheus to "/etc/openstack/prometheus.yaml". Are we ok with it staying hardcoded like that?
  3. I added the option to disable rbac to all of the commands/functions. Should everybody be able to use that option freely or should we restrict it somehow?
  4. What do we do about the "query" and "show" commands? Originally I meant the query command to accept any PromQL query and display the result. This was supposed to be the command that would later be used for autoscaling. The show command was supposed to do something similar to the gnocchi plugin's show command: display the value of some metric. So the user is meant to pass a metric name as an argument and it'll display all current values of that metric. The thing is that right now both commands do really similar things. "query" is basically just an extension of "show". If you take any "show" command and replace "show" with "query", you'll get exactly the same result. I'd say we could just delete the current implementation of the "show" command and replace it with "query".
  5. Do we want to somehow restrict access to the admin endpoints? I know anybody can just curl prometheus directly, but maybe we might not want to make it so easy for anybody to just delete metrics.
  6. Inspired by the prometheus-api-client library I added an ability to specify additional labels separately to a query. The command looks something like this: openstack metric query somequery --label="job='prometheus'". Does this seem useful or just confusing?

@vyzigold vyzigold marked this pull request as ready for review July 31, 2023 14:49
@vyzigold
Contributor Author

vyzigold commented Aug 1, 2023

I also wanted to add a "create" command similar to the one in the gnocchi client, but there isn't an endpoint in Prometheus for creating metrics.

Member

@paramite paramite left a comment

A few points for discussion, but otherwise this is legit.

observabilityclient/v1/python_api.py (outdated, resolved)
observabilityclient/v1/python_api.py (outdated, resolved)
README.md (resolved)
observabilityclient/prometheus_client.py (outdated, resolved)

def _enrich_labels(self, labels, disable_rbac):
    if not self.rbac_init_successful and not disable_rbac:
        raise ObservabilityRbacError("Unauthorized. Couldn't "
Member

Wouldn't it be better to fail in constructor rather than later in the process when enriching is used?

Contributor Author

This is what I initially implemented, but then I realized that RBAC can be disabled for each function/command. So even if the constructor can't find the project_id, you should still be able to run openstack metric list --disable-rbac. That's why I moved the exception from the constructor to here.

Contributor Author

Now that I know that even the openstack command is able to log, I'll at least log a warning in the constructor.
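The outcome of this thread (warn in the constructor, fail only when RBAC is actually needed) could look roughly like the sketch below. The class name `Rbac`, the public method name and the error message are hypothetical; only `rbac_init_successful`, `disable_rbac` and `ObservabilityRbacError` come from the snippet under review.

```python
import logging

LOG = logging.getLogger(__name__)


class ObservabilityRbacError(Exception):
    """Stand-in for the exception named in the reviewed snippet."""


class Rbac:
    def __init__(self, project_id=None):
        self.project_id = project_id
        self.rbac_init_successful = project_id is not None
        if not self.rbac_init_successful:
            # Do not raise here: callers may still pass disable_rbac=True.
            LOG.warning("Could not determine project_id; RBAC is unavailable.")

    def enrich_labels(self, labels, disable_rbac=False):
        """Return labels with project_id added, unless RBAC is disabled."""
        if disable_rbac:
            return dict(labels)
        if not self.rbac_init_successful:
            # Hypothetical message; the original text is cut off in the diff.
            raise ObservabilityRbacError("RBAC could not be initialised.")
        return {**labels, "project_id": self.project_id}
```

With this split, a client without a project_id still works for calls that disable RBAC, while every other call fails loudly.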

return labels

# TODO aren't the additional labels just making
# the code confusing? Are they useful?
Member

My 2cents: Maybe and No. User can construct whatever query with various labels in it, so I personally don't see a point of having additional parameters for additional parameters.

Contributor Author

I agree, I removed the code

@paramite
Member

paramite commented Aug 1, 2023

  • Can I use the pyyaml library used in metric_utils.py?

It seems to me that this library is not available downstream (at least it's not obvious from the list of Brew builds). Checking the OSP deployment @lnatapov has on seal31 confirms that we don't have that package (no python-pyyaml, python3-pyyaml or PyYAML). So if you can't achieve the same with a standard YAML lib, we would need to package and ship it ourselves.

  • In metric_utils.py, I hardcoded the path to a file, which should contain information about how to connect to prometheus to "/etc/openstack/prometheus.yaml". Are we ok with it staying hardcoded like that?

Hardcoding paths for configuration is fine, but I would add more options, ideally consistent with the rest of the client: https://github.com/openstack/python-openstackclient/blob/7ea78b6ef65481c8e97bac959b4f11e3ecae8a3e/doc/source/cli/man/openstack.rst#config-files
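The linked section describes where the OpenStack client looks for its own config files: the current directory, ~/.config/openstack and /etc/openstack. A sketch of applying the same search order to prometheus.yaml follows; the function names are hypothetical, and reusing that exact order here is an assumption.

```python
import os


def candidate_config_paths(filename="prometheus.yaml", cwd=None, home=None):
    """List candidate config locations, most specific first."""
    cwd = os.getcwd() if cwd is None else cwd
    home = os.path.expanduser("~") if home is None else home
    return [
        os.path.join(cwd, filename),
        os.path.join(home, ".config", "openstack", filename),
        os.path.join("/etc/openstack", filename),
    ]


def find_config(paths):
    """Return the first path that exists as a file, or None."""
    for path in paths:
        if os.path.isfile(path):
            return path
    return None
```

`find_config(candidate_config_paths())` then yields the file to load, with the system-wide /etc/openstack copy acting as the last resort.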

  • I added the option to disable rbac to all of the commands/functions. Should everybody be able to use that option freely or should we restrict it somehow?

That is a good question. If we are sure enough that the query enriching mechanism allows most queries (which it seems to me it does), I would say that only admin users should be able to fetch metrics of any project.

  • What do we do about the "query" and "show" commands? Originally I meant the query command to accept any PromQL query and display the result. This was supposed to be the command that would later be used for autoscaling. The show command was supposed to do something similar to the gnocchi plugin's show command: display the value of some metric. So the user is meant to pass a metric name as an argument and it'll display all current values of that metric. The thing is that right now both commands do really similar things. "query" is basically just an extension of "show". If you take any "show" command and replace "show" with "query", you'll get exactly the same result. I'd say we could just delete the current implementation of the "show" command and replace it with "query".

If we want the output of the show command to be consistent with the past, then I would keep it as it is now (i.e. accepting just a metric name). Displaying something like max_over_time(ceilometer_image_size{resource='abcd'}[15m]), for example, should be the purpose of the query command, which will indeed be used for autoscaling.

  • Do we want to somehow restrict access to the admin endpoints? I know anybody can just curl prometheus directly, but maybe we might not want to make it so easy for anybody to just delete metrics.

Restricting them to admin users, maybe, yes.

  • Inspired by the prometheus-api-client library I added an ability to specify additional labels separately to a query. The command looks something like this: openstack metric query somequery --label="job='prometheus'". Does this seem useful or just confusing?

As I wrote in comments, this seems unnecessary to me.

@vyzigold
Contributor Author

vyzigold commented Aug 1, 2023

I don't think there is a standard yaml lib in python (if there is, please correct me). But I don't think we need to package it just because of this. If the config file doesn't get much more complex, it should be pretty easy to just parse it by ourselves. Using regexes, it's probably just a few lines of code.
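As a sense of scale for the "few lines of code" idea, here is a sketch that reads a flat file of `key: value` lines. The layout of prometheus.yaml is an assumption, as are the names `LINE` and `parse_flat_config`; this is not a YAML parser and ignores nesting, lists and multi-line values.

```python
import re

# key: value, with an optional trailing comment introduced by whitespace + "#".
LINE = re.compile(r"^\s*([A-Za-z_][\w-]*)\s*:\s*(.*?)(?:\s+#.*)?\s*$")


def parse_flat_config(text):
    """Parse flat "key: value" lines into a dict of strings."""
    config = {}
    for line in text.splitlines():
        # Skip blank lines and full-line comments.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        match = LINE.match(line)
        if match:
            key, value = match.groups()
            # Drop optional surrounding quotes; values stay strings.
            config[key] = value.strip("'\"")
    return config
```

Anything beyond this flat shape is where a real YAML library starts to pay for itself.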

@paramite
Member

paramite commented Aug 2, 2023

I don't think there is a standard yaml lib in python (if there is, please correct me).

The YAML library is distributed in two forms. One is a C binding to libyaml, which is usually part of the Python distribution, and the other is a pure Python implementation, which we would have to package. Not that I knew that yesterday, but I've never had to install an additional package to be able to import yaml, and that is what I meant by a 'standard' library. I thought that the PyYAML you mentioned yesterday could do something extra. So yeah, you can use PyYAML ;).

@vyzigold
Contributor Author

vyzigold commented Aug 3, 2023

I've implemented Martin's comments.

  • I added proper logging to PrometheusAPIClient
  • I removed all the pieces of code related to the "additional labels"
  • I modified the show query as suggested by Martin
  • I added support for multiple locations of prometheus.yaml

I also discovered that label values can contain any Unicode character, which would break the current RBAC implementation if "}" is part of a label value. A fix for that will follow shortly.

@vyzigold
Contributor Author

vyzigold commented Aug 3, 2023

I added support for Unicode label values, which didn't work before. The RBAC code is getting quite complex. What I tried seemed to work. I'll come up with some extensive unit tests to make sure it really does what it should.
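One way to cope with "}" inside label values is to scan for the closing brace while skipping quoted strings. The sketch below is not the fix that was merged; `find_selector_end` is a hypothetical helper. It relies on PromQL string literals being delimited by double quotes, single quotes or backticks, with backslash escapes in the first two only.

```python
def find_selector_end(query, start):
    """Return the index of the "}" closing the selector opened at query[start]."""
    if query[start] != "{":
        raise ValueError("start must point at '{'")
    quote = None  # the quote character of the string we are inside, if any
    i = start + 1
    while i < len(query):
        char = query[i]
        if quote:
            if char == "\\" and quote != "`":
                i += 1  # skip the escaped character
            elif char == quote:
                quote = None
        elif char in "\"'`":
            quote = char
        elif char == "}":
            return i
        i += 1
    raise ValueError("unterminated label selector")
```

Because the scan works on characters rather than bytes, arbitrary Unicode in label values needs no special handling.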

I'm leaving for a 2-week vacation. I feel like I've implemented most of what Martin noted, and there aren't any other reviews here right now. My suggestion is to merge this (unless somebody sees something awful here), so that Martin can start using it in his aodh evaluator. Please leave notes/reviews here, or somewhere else. I plan to take a look at them when I'm back and open another PR to address them.


@jlarriba jlarriba left a comment

My review centered on using a better command than "observabilityclient" in the CLI. As we agreed in the meeting, reusing "metrics" is a perfect fit.

@vyzigold vyzigold merged commit a580772 into master Aug 3, 2023
@vyzigold vyzigold deleted the prometheus_interaction branch August 3, 2023 13:30
4 participants