Add autoupdate controller metrics #50807

Open: hugoShaka wants to merge 1 commit into base hugo/diagnostics-service-use-local-metrics-registry from hugo/autoupdate-rollout-metrics

hugoShaka
Contributor

@hugoShaka hugoShaka commented Jan 7, 2025

Part of: RFD-184

Goal (internal): https://github.com/gravitational/cloud/issues/10289

This PR adds metrics to monitor and troubleshoot automatic agent rollouts. There are two metrics potentially increasing cardinality:

  • the metrics labeled with the start or target versions. There is a mechanism to clean up old time series and remove older labels so they don't pile up.
  • the metric containing the stage for each group. This metric is per-group, so as long as we control the number of groups we control the metric cardinality. The metric is the group name, so renaming groups will not increase cardinality and a misconfigured metrics server cannot accidentally disclose the agent groups.

Depends on:

@hugoShaka hugoShaka force-pushed the hugo/autoupdate-rollout-metrics branch from 37ff41f to fcfc8fd Compare January 9, 2025 23:06
@hugoShaka hugoShaka changed the base branch from master to hugo/teleport-use-non-global-metrics-registry January 9, 2025 23:07
@hugoShaka hugoShaka force-pushed the hugo/autoupdate-rollout-metrics branch from 1264570 to b3ba472 Compare January 9, 2025 23:27
@hugoShaka hugoShaka marked this pull request as ready for review January 9, 2025 23:27
@hugoShaka hugoShaka requested review from sclevine and vapopov January 9, 2025 23:27
@github-actions github-actions bot requested review from fheinecke and tigrato January 9, 2025 23:27
Base automatically changed from hugo/teleport-use-non-global-metrics-registry to master January 10, 2025 16:03
@hugoShaka hugoShaka force-pushed the hugo/autoupdate-rollout-metrics branch from b3ba472 to 4be59b9 Compare January 10, 2025 21:07
@public-teleport-github-review-bot public-teleport-github-review-bot bot removed the request for review from fheinecke January 10, 2025 21:15
@hugoShaka hugoShaka force-pushed the hugo/autoupdate-rollout-metrics branch from 4be59b9 to e7159a6 Compare January 14, 2025 15:15
@hugoShaka hugoShaka force-pushed the hugo/autoupdate-rollout-metrics branch from e7159a6 to e0ada45 Compare January 14, 2025 18:42
@hugoShaka hugoShaka changed the base branch from master to hugo/diagnostics-service-use-local-metrics-registry January 14, 2025 18:42
4 participants