Store config hash in a metric so that it's visible at all times #967

Closed
kentquirk opened this issue Jan 9, 2024 · 0 comments

kentquirk commented Jan 9, 2024

Is your feature request related to a problem? Please describe.

A customer asked about how configuration changes are recorded / made visible to users. Configuration changes are logged at Info level: "configuration change was detected and the configuration was reloaded".

Customer: Could we get that somehow exposed via prometheus and/or otel metrics? It’s kinda odd to be flying blind here: We don’t run the refinery on info in prod (default is warn) and the endpoint is also not publicly exposed/easy to query for most folks. So I don’t know which version of the config the refinery has loaded and whether it picked up any changes.

me: The easy answer here would be a new metric gauge called something like config_checksum where we convert the config hash into a number and store it in the metric. It would change whenever the config file contents changed, and would be identical across all the items in your fleet if they all had the same config. Would that meet your needs?

Customer: yep - that’d be perfect 👍

Describe the solution you'd like

  • Config hashes are MD5 hashes (so that people can calculate them on the command line with the md5 tool to verify). Probably the right strategy is to take the last 4 hex digits of the MD5, convert them to a decimal number, and store that in a gauge metric. So if the MD5 of the config is 7f1237f7db723f4e874a7a8269081a77, we would convert 1a77 from hex to decimal, and the value of the metric would be 6775.
  • We also actually need two metrics -- config_checksum and rules_checksum.
  • Just for belt-and-suspenders, we should also increase the log message's priority to warn, and include both the full hash value and this decimal checksum in the log message. This will allow people to correlate the logs and metrics.
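
The conversion described in the list above can be sketched in Go roughly as follows. This is only an illustration of the arithmetic, not Refinery's actual implementation: the helper names `configHash` and `checksumFromHash` are hypothetical, and the sketch assumes the hash is the hex-encoded MD5 of the raw config file contents.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
	"strconv"
)

// configHash returns the hex-encoded MD5 of the raw config bytes.
// Hypothetical helper; assumes the hash covers the file contents as-is.
func configHash(contents []byte) string {
	sum := md5.Sum(contents)
	return hex.EncodeToString(sum[:])
}

// checksumFromHash parses the last 4 hex digits of a hash as a number
// (0 to 65535) that can be stored in a gauge. Hypothetical helper.
func checksumFromHash(hash string) (uint64, error) {
	if len(hash) < 4 {
		return 0, fmt.Errorf("hash %q is shorter than 4 hex digits", hash)
	}
	return strconv.ParseUint(hash[len(hash)-4:], 16, 16)
}

func main() {
	// The example hash from this issue: 0x1a77 is 6775 in decimal.
	hash := "7f1237f7db723f4e874a7a8269081a77"
	n, err := checksumFromHash(hash)
	if err != nil {
		panic(err)
	}
	fmt.Printf("hash=%s checksum=%d\n", hash, n)

	// Hashing made-up config contents end to end.
	h := configHash([]byte("General:\n  ConfigurationVersion: 2\n"))
	c, err := checksumFromHash(h)
	if err != nil {
		panic(err)
	}
	fmt.Printf("hash=%s checksum=%d\n", h, c)
}
```

Because only 16 bits of the hash are kept, two different configs will share a checksum about once in 65,536 cases, which is why the full hash in the log message is still useful for confirmation.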

Describe alternatives you've considered

We could just improve the log, but this is easy and fits with the way a lot of people like to manage their clusters.

@kentquirk kentquirk added the type: enhancement New feature or request label Jan 9, 2024
@kentquirk kentquirk added this to the v2.4 milestone Jan 9, 2024
@fchikwekwe fchikwekwe self-assigned this Jan 31, 2024
@fchikwekwe fchikwekwe removed their assignment Feb 20, 2024
@fchikwekwe fchikwekwe modified the milestones: v2.4, v2.5 Feb 21, 2024
@MikeGoldsmith MikeGoldsmith modified the milestones: vNEXT, 2.6 Mar 13, 2024
@VinozzZ VinozzZ modified the milestones: v2.6, v2.7 Jun 14, 2024
@VinozzZ VinozzZ self-assigned this Jun 18, 2024
VinozzZ added a commit that referenced this issue Jun 20, 2024
## Which problem is this PR solving?

To provide better visibility into the configuration currently used by
Refinery, this PR introduces two metrics, `config_hash` and
`rule_config_hash`, for keeping track of the configuration.

## Short description of the changes
- change config change log from `info` level to `warn`
- include full config hash value in config change log
- store the decimal value of the last 4 digits of the config hash as
metrics


#967
@VinozzZ VinozzZ closed this as completed Jun 20, 2024