controller: disable prometheus metric processor memory #511
Issue number:
Closes #441
Description of changes:
Per #441 and testing, the gauges reported by the controller's `/metrics` endpoint would often report stale data, double-counting hosts because it showed the historic state alongside the current state. This is because `with_memory()` was enabled, which explicitly retains previous metrics when only new subsets are reported. This change disables metric processor memory to make it truly stateless, which is how we were computing the metrics anyway.
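
To illustrate the double-counting, here is a minimal sketch of a processor that either retains or discards series between collection cycles. It is a toy model, not the OpenTelemetry or Prometheus exporter API: `Processor`, `export`, `counts`, and `total_hosts` are hypothetical names invented for this example, and only the two state names come from this PR.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a metric processor (not a real library type).
struct Processor {
    with_memory: bool,
    retained: BTreeMap<String, u64>,
}

impl Processor {
    fn new(with_memory: bool) -> Self {
        Processor { with_memory, retained: BTreeMap::new() }
    }

    /// Record one collection cycle and return what would be exported.
    fn export(&mut self, current: &BTreeMap<String, u64>) -> BTreeMap<String, u64> {
        if !self.with_memory {
            // Stateless: forget everything reported in earlier cycles.
            self.retained.clear();
        }
        for (state, count) in current {
            self.retained.insert(state.clone(), *count);
        }
        self.retained.clone()
    }
}

/// Build a state -> host count map from literal pairs.
fn counts(pairs: &[(&str, u64)]) -> BTreeMap<String, u64> {
    pairs.iter().map(|(s, c)| (s.to_string(), *c)).collect()
}

/// Sum of all exported gauges, i.e. how many hosts a dashboard would see.
fn total_hosts(exported: &BTreeMap<String, u64>) -> u64 {
    exported.values().sum()
}

fn main() {
    // A single host moves from one state to the next between two cycles.
    let before = counts(&[("StagedAndPerformedUpdate", 1)]);
    let after = counts(&[("RebootedIntoUpdate", 1)]);

    let mut remembering = Processor::new(true);
    remembering.export(&before);
    let stale = remembering.export(&after);
    println!("with memory: {:?}, total = {}", stale, total_hosts(&stale));

    let mut stateless = Processor::new(false);
    stateless.export(&before);
    let fresh = stateless.export(&after);
    println!("stateless:   {:?}, total = {}", fresh, total_hosts(&fresh));
}
```

In this model the remembering processor exports both states for the same host, so the summed gauges count it twice, while the stateless one exports only the state reported in the latest cycle.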
In this case, Prometheus will still report old state until the metrics go stale. To clear old state faster, we also explicitly report `0` counts for prior metrics when we can. Because the controller is supposed to be stateless and can be rescheduled, this is done on a best-effort basis.
Testing done:
Watched the controller `/metrics` endpoint with and without this change. After the change was made, the metrics endpoint always surfaced the current state of the cluster, e.g.:
(Note that `StagedAndPerformedUpdate` has a count of `0`, because the host that is now in `RebootedIntoUpdate` had its prior count intentionally wiped.)
Terms of contribution:
By submitting this pull request, I agree that this contribution is dual-licensed under the terms of both the Apache License, version 2.0, and the MIT license.