This repository has been archived by the owner on Nov 1, 2023. It is now read-only.

Prometheus interaction (#7)
* Remove old observability client

* Add initial functionality for prometheus querying

* Fix a copy-paste error in get_client()

* Add additional functionality.

This commit adds:
    - commands:
        delete
        clean-tombstones
        snapshot
    - Better rbac injection as well as a possibility
      to disable rbac.
    - Configuration of prometheus_client through
      env variables and /etc/openstack/prometheus.yaml

* Make README up to date

* Implement Martin's PR comments

* Implement better support for label values in rbac

* PEP8
vyzigold authored Aug 3, 2023
1 parent 3f8acf0 commit a580772
Showing 15 changed files with 814 additions and 627 deletions.
72 changes: 25 additions & 47 deletions README.md
@@ -1,8 +1,7 @@
 # python-observabilityclient

 observabilityclient is an OpenStackClient (OSC) plugin implementation that
-implements commands for management of OpenStack observability components such
-as Prometheus, collectd and Ceilometer.
+implements commands for management of Prometheus.

 ## Development

@@ -17,58 +16,37 @@ su - stack
 git clone https://github.com/infrawatch/python-observabilityclient
 cd python-observabilityclient
 sudo python setup.py install --prefix=/usr
-# clone and install observability playbooks and roles
-git clone https://github.com/infrawatch/osp-observability-ansible
-sudo mkdir /usr/share/osp-observability
-sudo ln -s `pwd`/osp-observability-ansible/playbooks /usr/share/osp-observability/playbooks
-sudo ln -s `pwd`/osp-observability-ansible/roles/spawn_container /usr/share/ansible/roles/spawn_container
-sudo ln -s `pwd`/osp-observability-ansible/roles/osp_observability /usr/share/ansible/roles/osp_observability
 ```

-### Enable collectd write_prometheus
-Create a THT environment file to enable the write_prometheus plugin for the collectd service. Then redeploy your overcloud and include this new file:
-
-```
-mkdir -p ~/templates/observability
-cat <EOF >> templates/observability/collectd-write-prometheus.yaml
-resource_registry:
-  OS::TripleO::Services::Collectd: /usr/share/openstack-tripleo-heat-templates/deployment/metrics/collectd-container-puppet.yaml
+## Usage

-# TEST
-# parameter_merge_strategies:
-# CollectdExtraPlugins: merge
+Use `openstack metric query somequery` to query for metrics in prometheus.

-parameter_defaults:
-  CollectdExtraPlugins:
-    - write_prometheus
-EOF
+To use the python api do the following:
 ```
+from observabilityclient import client
-### Discover endpoints
-After deployment of your cloud you can discover endpoints available for scraping:

-```
-source stackrc
-openstack observability discover --stack-name=standalone
+c = client.Client(
+    '1', keystone_client.get_session(conf),
+    adapter_options={
+        'interface': conf.service_credentials.interface,
+        'region_name': conf.service_credentials.region_name})
+c.query.query("somequery")
 ```

-### Deploy prometheus:
-Create a config file and run the setup command
+## List of commands

-```
-$ cat test_params.yaml
-prometheus_remote_write:
-  stf:
-    url: https://default-prometheus-proxy-service-telemetry.apps.FAKE.ocp.cluster/api/v1/write
-    basic_user: internal
-    basic_pass: Pl4iNt3xTp4a55
-    ca_cert: |
-      -----BEGIN CERTIFICATE-----
-      ABCDEFGHIJKLMNOPQRSTUVWXYZ
-      -----END CERTIFICATE-----
-  not-stf:
-    url: http://prometheus-rw.example.com/api/v1/write
+openstack metric list - lists all metrics
+openstack metric show - shows current values of a metric
+openstack metric query - queries prometheus and outputs the result
+openstack metric delete - deletes some metrics
+openstack metric snapshot - takes a snapshot of the current data
+openstack metric clean-tombstones - cleans the tsdb tombstones

-$ openstack observability setup prometheus_agent --config ./test_params.yaml
-```
+## List of functions provided by the python library
+c.query.list - lists all metrics
+c.query.show - shows current values of a metric
+c.query.query - queries prometheus and outputs the result
+c.query.delete - deletes some metrics
+c.query.snapshot - takes a snapshot of the current data
+c.query.clean_tombstones - cleans the tsdb tombstones
22 changes: 22 additions & 0 deletions observabilityclient/client.py
@@ -0,0 +1,22 @@
+# Copyright 2023 Red Hat, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+
+import sys
+
+
+def Client(version, *args, **kwargs):
+    module = 'observabilityclient.v%s.client' % version
+    __import__(module)
+    client_class = getattr(sys.modules[module], 'Client')
+    return client_class(*args, **kwargs)
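The factory in `client.py` selects a versioned module by name at call time. A minimal, self-contained sketch of the same dispatch idea follows, using `importlib.import_module` in place of `__import__` plus `sys.modules`. The `demo_pkg` package, the `DemoClient` class and the `versioned_client` helper are hypothetical stand-ins, registered in memory only so that the snippet runs without observabilityclient installed.

```python
import importlib
import sys
import types


def versioned_client(version, *args, **kwargs):
    """Import 'demo_pkg.v<version>.client' and instantiate its Client class."""
    module = importlib.import_module(f"demo_pkg.v{version}.client")
    return module.Client(*args, **kwargs)


class DemoClient:
    """Hypothetical client class standing in for a real v1 client."""

    def __init__(self, session=None):
        self.session = session


# Register a throwaway in-memory package hierarchy (an assumption made for
# this sketch; a real project would ship these modules on disk).
for name in ("demo_pkg", "demo_pkg.v1", "demo_pkg.v1.client"):
    sys.modules[name] = types.ModuleType(name)
sys.modules["demo_pkg.v1.client"].Client = DemoClient

client = versioned_client("1", session="fake-session")
```

In this sketch, asking for a version that has no matching module raises `ImportError`, so an unsupported version fails at the import step rather than later.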
23 changes: 20 additions & 3 deletions observabilityclient/plugin.py
@@ -1,3 +1,16 @@
+# Copyright 2023 Red Hat, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+
 """OpenStackClient Plugin interface"""

@@ -8,7 +21,7 @@
 API_NAME = 'observabilityclient'
 API_VERSION_OPTION = 'os_observabilityclient_api_version'
 API_VERSIONS = {
-    '1': 'observabilityclient.plugin',
+    '1': 'observabilityclient.v1.client.Client',
 }


@@ -20,12 +33,16 @@ def make_client(instance):
     :param ClientManager instance: The ClientManager that owns the new client
     """
-    plugin_client = utils.get_client_class(
+    observability_client = utils.get_client_class(
         API_NAME,
         instance._api_version[API_NAME],
         API_VERSIONS)

-    client = plugin_client()
+    client = observability_client(session=instance.session,
+                                  adapter_options={
+                                      'interface': instance.interface,
+                                      'region_name': instance.region_name
+                                  })
     return client


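The `API_VERSIONS` change matters because the map value has to name an importable class, not just a module. The sketch below illustrates that general idea: look up a dotted path by version, import the module part, and fetch the class. It is not osc_lib's implementation. `resolve_class` and `VERSION_MAP` are hypothetical names, and `collections.OrderedDict` is used as a stand-in target so the snippet runs with the standard library alone.

```python
import importlib

# Stand-in map (assumption): a real plugin would point at its own client
# class, e.g. a dotted path ending in ".Client".
VERSION_MAP = {"1": "collections.OrderedDict"}


def resolve_class(version, version_map):
    """Look up the dotted class path for a version and import that class."""
    try:
        dotted_path = version_map[str(version)]
    except KeyError:
        supported = ", ".join(sorted(version_map))
        raise ValueError(
            f"unsupported version {version!r}; supported: {supported}"
        ) from None
    # Split "package.module.ClassName" into module path and class name.
    module_name, _, class_name = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


client_class = resolve_class("1", VERSION_MAP)
# Keyword arguments are passed through to the resolved class, much as
# make_client passes session and adapter_options.
instance = client_class(session="fake-session")
```

A map value that names only a module, with no class at the end, would resolve to something that cannot be instantiated with `session` and `adapter_options`, which is presumably why the entry was changed.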
200 changes: 200 additions & 0 deletions observabilityclient/prometheus_client.py
@@ -0,0 +1,200 @@
+# Copyright 2023 Red Hat, Inc.
+#
+# Licensed under the Apache License, Version 2.0 (the "License"); you may
+# not use this file except in compliance with the License. You may obtain
+# a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+# WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+# License for the specific language governing permissions and limitations
+# under the License.
+
+import logging
+import requests
+
+
+LOG = logging.getLogger(__name__)
+
+
+class PrometheusAPIClientError(Exception):
+    def __init__(self, response):
+        self.resp = response
+
+    def __str__(self) -> str:
+        if self.resp.status_code != requests.codes.ok:
+            if self.resp.status_code != 204:
+                decoded = self.resp.json()
+                if 'error' in decoded:
+                    return f'[{self.resp.status_code}] {decoded["error"]}'
+            return f'[{self.resp.status_code}] {self.resp.reason}'
+        else:
+            decoded = self.resp.json()
+            return f'[{decoded["status"]}]'
+
+    def __repr__(self) -> str:
+        if self.resp.status_code != requests.codes.ok:
+            if self.resp.status_code != 204:
+                decoded = self.resp.json()
+                if 'error' in decoded:
+                    return f'[{self.resp.status_code}] {decoded["error"]}'
+            return f'[{self.resp.status_code}] {self.resp.reason}'
+        else:
+            decoded = self.resp.json()
+            return f'[{decoded["status"]}]'
+
+
+class PrometheusMetric:
+    def __init__(self, input):
+        self.timestamp = input['value'][0]
+        self.labels = input['metric']
+        self.value = input['value'][1]
+
+
+class PrometheusAPIClient:
+    def __init__(self, host):
+        self._host = host
+        self._session = requests.Session()
+        self._session.verify = False
+
+    def set_ca_cert(self, ca_cert):
+        self._session.verify = ca_cert
+
+    def set_client_cert(self, client_cert, client_key):
+        # requests expects the certificate and key as one (cert, key) tuple
+        self._session.cert = (client_cert, client_key)
+
+    def set_basic_auth(self, auth_user, auth_password):
+        self._session.auth = (auth_user, auth_password)
+
+    def _get(self, endpoint, params=None):
+        url = (f"{'https' if self._session.verify else 'http'}://"
+               f"{self._host}/api/v1/{endpoint}")
+        resp = self._session.get(url, params=params,
+                                 headers={'Accept': 'application/json'})
+        if resp.status_code != requests.codes.ok:
+            raise PrometheusAPIClientError(resp)
+        decoded = resp.json()
+        if decoded['status'] != 'success':
+            raise PrometheusAPIClientError(resp)
+
+        return decoded
+
+    def _post(self, endpoint, params=None):
+        url = (f"{'https' if self._session.verify else 'http'}://"
+               f"{self._host}/api/v1/{endpoint}")
+        resp = self._session.post(url, params=params,
+                                  headers={'Accept': 'application/json'})
+        if resp.status_code != requests.codes.ok:
+            raise PrometheusAPIClientError(resp)
+        decoded = resp.json()
+        if 'status' in decoded and decoded['status'] != 'success':
+            raise PrometheusAPIClientError(resp)
+        return decoded
+
+    def query(self, query):
+        """Sends custom queries to Prometheus
+        :param query: the query to send
+        :type query: str
+        """
+
+        LOG.debug(f"Querying prometheus with query: {query}")
+        decoded = self._get("query", dict(query=query))
+
+        if decoded['data']['resultType'] == 'vector':
+            result = [PrometheusMetric(i) for i in decoded['data']['result']]
+        else:
+            result = [PrometheusMetric(decoded)]
+        return result
+
+    def series(self, matches):
+        """Queries the /series/ endpoint of prometheus
+        :param matches: List of matches to send as parameters
+        :type matches: [str]
+        """
+
+        LOG.debug(f"Querying prometheus for series with matches: {matches}")
+        decoded = self._get("series", {"match[]": matches})
+
+        return decoded['data']
+
+    def labels(self):
+        """Queries the /labels/ endpoint of prometheus, returns list of labels
+        There isn't a way to tell prometheus to restrict
+        which labels to return. It's not possible to enforce
+        rbac with this for example.
+        """
+
+        LOG.debug("Querying prometheus for labels")
+        decoded = self._get("labels")
+
+        return decoded['data']
+
+    def label_values(self, label):
+        """Queries prometheus for values of a specified label.
+        :param label: Name of label for which to return values
+        :type label: str
+        """
+
+        LOG.debug(f"Querying prometheus for the values of label: {label}")
+        decoded = self._get(f"label/{label}/values")
+
+        return decoded['data']
+
+    # ---------
+    # admin api
+    # ---------
+
+    def delete(self, matches, start=None, end=None):
+        """Deletes some metrics from prometheus
+        :param matches: List of matches, that specify which metrics to delete
+        :type matches: [str]
+        :param start: Timestamp from which to start deleting.
+                      None for as early as possible.
+        :type start: timestamp
+        :param end: Timestamp until which to delete.
+                    None for as late as possible.
+        :type end: timestamp
+        """
+        # NOTE Prometheus doesn't seem to return anything except
+        #      a 204 status code. There doesn't seem to be a
+        #      way to know if anything actually got deleted.
+        #      It does however return a 500 code and an error msg
+        #      if the admin APIs are disabled.
+
+        LOG.debug(f"Deleting metrics from prometheus matching: {matches}")
+        try:
+            self._post("admin/tsdb/delete_series", {"match[]": matches,
+                                                    "start": start,
+                                                    "end": end})
+        except PrometheusAPIClientError as exc:
+            # The 204 is allowed here. 204 is "No Content",
+            # which is expected on a successful call
+            if exc.resp.status_code != 204:
+                raise exc
+
+    def clean_tombstones(self):
+        """Asks prometheus to clean tombstones"""
+
+        LOG.debug("Cleaning tombstones from prometheus")
+        try:
+            self._post("admin/tsdb/clean_tombstones")
+        except PrometheusAPIClientError as exc:
+            # The 204 is allowed here. 204 is "No Content",
+            # which is expected on a successful call
+            if exc.resp.status_code != 204:
+                raise exc
+
+    def snapshot(self):
+        """Creates a snapshot and returns the file name containing the data"""
+
+        LOG.debug("Taking prometheus data snapshot")
+        ret = self._post("admin/tsdb/snapshot")
+        return ret["data"]["name"]
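The `query` method turns the JSON body of an instant query into metric objects. The sketch below shows that mapping on a hand-written sample payload, without any network access. `Metric`, `parse_instant_query` and `SAMPLE` are illustrative names, not part of this commit. The payload follows the documented Prometheus response shape for `vector` results. The handling of `scalar` and `string` results (a bare `[timestamp, value]` pair wrapped with empty labels) is an assumption about reasonable behavior, not what the committed code does.

```python
from dataclasses import dataclass
import json


@dataclass
class Metric:
    timestamp: float
    labels: dict
    value: str


def parse_instant_query(body):
    """Turn a decoded /api/v1/query response into a list of Metric objects."""
    data = body["data"]
    if data["resultType"] == "vector":
        # Each vector element carries its labels and one [timestamp, value].
        return [Metric(r["value"][0], r["metric"], r["value"][1])
                for r in data["result"]]
    if data["resultType"] in ("scalar", "string"):
        # Scalars and strings are a bare [timestamp, value] pair, no labels.
        timestamp, value = data["result"]
        return [Metric(timestamp, {}, value)]
    raise ValueError(f"unsupported resultType: {data['resultType']}")


# Hand-written sample payload (illustrative values only).
SAMPLE = json.loads("""
{
  "status": "success",
  "data": {
    "resultType": "vector",
    "result": [
      {"metric": {"__name__": "up", "job": "node"},
       "value": [1691049600.0, "1"]}
    ]
  }
}
""")

metrics = parse_instant_query(SAMPLE)
```

Sample values arrive as strings in the JSON body, so callers that need numbers would convert `value` themselves.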