Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add beacon checks metrics #57

Merged
merged 3 commits into from
Dec 19, 2023
Merged

Conversation

voyvodov
Copy link
Contributor

This introduce metrics for Salt status beacons
(https://docs.saltproject.io/en/latest/ref/beacons/all/salt.beacons.status.html) Will expose metrics for last beacon event per minion and also the total number of observed minions (minions not reported for 4 hours by default are removed every hour).
This way, user can not only monitor jobs success, but also if all expected minions are reporting back to Salt master.
The exporter will watch the PKI directory of the master for changed files and this way will know how many minions are expected to respond back. On (re)start it will read the minions' keys for initial status of observable minions.

This introduce metrics for Salt status beacons
(https://docs.saltproject.io/en/latest/ref/beacons/all/salt.beacons.status.html)
Will expose metrics for last beacon event per minion and also the total number of
observed minions (minions not reported for 4 hours by default
are removed every hour).
This way, user can not only monitor jobs success, but also if all expected
minions are reporting back to Salt master.
The exporter will watch the PKI directory of the master for changed files
and this way will know how many minions are expected to respond back.
On (re)start it will read the minions' keys for initial status of
observable minions.
@voyvodov
Copy link
Contributor Author

If the idea is okey will also update the docs with information how this can be used and what is required from Salt side to be done to work correctly

@kpetremann kpetremann self-requested a review December 13, 2023 12:32
@kpetremann kpetremann added the enhancement New feature or request label Dec 13, 2023
@kpetremann
Copy link
Owner

Hi @voyvodov,

Thanks for the PR and the idea. I'll have a look!

@kpetremann
Copy link
Owner

I like the idea :)
I'll start the review of the code as soon as possible.

minions not reported for 4 hours by default are removed every hour

It should be configurable by the user.

The exporter will watch the PKI directory of the master for changed files and this way will know how many minions are expected to respond back.

Could also be used to expose the number of registered minion.

This way, user can not only monitor jobs success

nitpick on this: there is a metric to get the total of minion responses, whether it is successful or not. But beacon is a better way 👍

@voyvodov
Copy link
Contributor Author

voyvodov commented Dec 14, 2023

Ohh... I put wrong description.

In fact it will not remove them until they are removed fron salt master.
Initially it was using beacon to detect minions and background thread to cleanup (from which is the description). After a while I change it to get the public keys in the PKI directory of the accepted minions and a thread watching for changes (e.g. accepted or deleted minions).
So, exporter will just report last "heartbeat". It's up to the user what value will choose to alert in Grafana or Prometheus.

There is a metric which report the total of all registered minions (salt_health_minions_total)

@kpetremann
Copy link
Owner

@voyvodov LGTM thanks for the contrib!

If the idea is okey will also update the docs with information how this can be used and what is required from Salt side to be done to work correctly

do you want to update this PR with the documentation or do a dedicated PR for this?

@voyvodov
Copy link
Contributor Author

@voyvodov LGTM thanks for the contrib!

If the idea is okey will also update the docs with information how this can be used and what is required from Salt side to be done to work correctly

do you want to update this PR with the documentation or do a dedicated PR for this?

If you're okey I can do it in seperate PR in next few days

@kpetremann
Copy link
Owner

ok no problemo :) let's merge it!

I'll release a new version of the exporter when we will have the documentation up to date.

@kpetremann kpetremann merged commit ccd23c4 into kpetremann:main Dec 19, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants