[#171281178] Document metric exporting to prometheus #343

Merged
merged 6 commits into from
Sep 15, 2020

@46bit 46bit commented Sep 14, 2020

What

Does three main things:

  1. Separates the monitoring documentation into different pages for apps and services
  2. Adds instructions on how to get backing service metrics into Prometheus
  3. Removes metric descriptions that now exist inline with the metric graphs in paas-admin

See the commit messages for more details.

How to review

  1. Preview the content locally (see README) and check that it renders as expected.
  2. Manually check that any removed or renamed anchor tags are not in use (linkchecker doesn't do this).

Who can review

Not @46bit

@46bit 46bit force-pushed the document-metric-exporting-to-prometheus branch from 5f6d8f9 to da7f9c7 Compare September 15, 2020 10:31
Miki Mokrysz added 3 commits September 15, 2020 11:47
This commit adds a new entry to the Tech Docs sidebar entitled
"Monitoring backing services". This page describes how to access
metrics and logs for tenants to understand the state of their backing
services.

This content is all moved over from the pre-existing "Monitoring apps"
documentation. While there may be some broken anchor links, I am not
clear that our Tech Docs provide any options for addressing that. To
try and mitigate user confusion I've ensured that both sections of
documentation link to each other.

I have done this because the "Monitoring apps" section was unwieldy
and disordered. It contains a description of every backing service
metric shown by `paas-admin`, and that's way too much content to have
inline. For now I'm not sure where else to put it.

In the next commit I will be adding new content describing how to get
backing service metrics exported to Prometheus. This split should give
that content a good place to live.

This commit adds a short section of documentation:

  - Inform users that they can export backing service metrics to
    Prometheus (or by implication, other systems compatible with the
    Prometheus exposition format)

  - Explain that we already export Postgres and MySQL metrics via
    https://github.com/alphagov/paas-prometheus-exporter

  - Explain we can provide Elasticsearch and Redis metrics on request.
    Elasticsearch has to be set up manually [1]. Redis is already
    deployed [2], but it costs money that we can't recharge yet. To
    control costs we want to keep an eye on who is using it.

  - Mention that in theory Prometheus can be run on PaaS, and how, but
    emphasise that we don't know if it's completely solved. This can
    be expanded later if/when we can provide more info.

[1] https://team-manual.cloud.service.gov.uk/support/shipping_elasticsearch_metrics_to_tenants/
[2] https://github.com/alphagov/paas-prometheus-endpoints/tree/main/src/redis
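
For readers unfamiliar with the Prometheus exposition format mentioned in the commit message above, here is a minimal sketch of what that text format looks like and how a consumer might read it. The metric names, label names, and the `parse_exposition` helper are hypothetical and chosen for illustration; they are not taken from paas-prometheus-exporter, and the parser is deliberately simplified (it ignores timestamps and escaped quotes in label values).

```python
import re

# Sample in the Prometheus text exposition format. The metric and label
# names here are hypothetical, not taken from paas-prometheus-exporter.
SAMPLE = """\
# HELP example_connections Open connections to the backing service.
# TYPE example_connections gauge
example_connections{service="my-postgres",space="prod"} 12
example_cpu_percent{service="my-postgres",space="prod"} 3.5
"""

# metric_name, optional {labels}, then the sample value.
LINE_RE = re.compile(r'^([a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
# A single label pair such as service="my-postgres".
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_exposition(text):
    """Return a list of (name, labels, value) tuples.

    Simplified: skips comment lines, ignores timestamps, and does not
    handle escaped quotes inside label values.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(LABEL_RE.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_exposition(SAMPLE):
        print(name, labels, value)
```

In practice a Prometheus server does this parsing itself when it scrapes an endpoint; the sketch is only meant to show why any system that understands this line-based format can consume the same metrics.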
The list of service metrics available in paas-admin hadn't been
updated to include Elasticsearch. This commit adds it to the list.

I'm not even going to try and add a "Metrics definitions" section.
That info massively pollutes the backing service monitoring page
already. It doesn't belong there.
@46bit 46bit force-pushed the document-metric-exporting-to-prometheus branch from da7f9c7 to 9369706 Compare September 15, 2020 10:49
We now show this information inside paas-admin, above the metric
graphs. It could be worth having this information visible, but it
enormously pollutes the metrics pages of our tech docs.

All the definitions were pretty similar to the ones in paas-admin.
@46bit 46bit force-pushed the document-metric-exporting-to-prometheus branch from 9369706 to 0207270 Compare September 15, 2020 10:54

46bit commented Sep 15, 2020

Now that I've removed the metrics definitions, the "Monitoring backing services" page is quite short. I suggest we keep it that way, as the "Monitoring apps" section is huge. When the metrics endpoints for Redis/etc become generally available there'll be more information to put on the page.

I think we need to be able to have separate pages where necessary--for instance, instructions for setting up Logit. These long sets of instructions pollute the pages and make it impossible to find anything by scrolling.

Miki Mokrysz added 2 commits September 15, 2020 12:11
Now that there's a Prometheus section below, this heading was no
longer clear enough to explain what it was for.
@46bit 46bit force-pushed the document-metric-exporting-to-prometheus branch from 1b4ac72 to d032cbc Compare September 15, 2020 11:20

46bit commented Sep 15, 2020

The Monitoring Apps docs could use some love, but I'm not going there right now. I've filed #344 for one of the most obvious issues with them.

@46bit 46bit marked this pull request as ready for review September 15, 2020 11:25
@46bit 46bit requested a review from seaemsi September 15, 2020 11:26
@46bit 46bit merged commit 75c6f14 into master Sep 15, 2020
@46bit 46bit deleted the document-metric-exporting-to-prometheus branch September 15, 2020 16:12
2 participants