Increase resource defaults for monitoring stack #551

Open
blancharda opened this issue Jul 8, 2024 · 1 comment
Labels
enhancement New feature or request monitoring Issues related to monitoring components / resources

@blancharda
Member

Is your feature request related to a problem? Please describe.

The monitoring stack (Prometheus, Grafana, Loki, etc.) has enough resources to start, but often struggles when scaled beyond a single node or under higher-volume workloads. We should consider updating the default values and providing clear guidance on suggested overrides for various deployment scales/sizes.

Additional context

Prometheus in particular seems to struggle even with relatively small workloads.

@blancharda blancharda added the enhancement New feature or request label Jul 8, 2024
@mjnagel mjnagel added the monitoring Issues related to monitoring components / resources label Jul 8, 2024
@mjnagel
Contributor

mjnagel commented Jul 16, 2024

This may be a good reason to evaluate scalable/HA Grafana + Prometheus. For reference, on DUBBD we previously had tickets for HPAs on those two and noted the necessary external dependencies:

Loki itself defaults to a scalable mode (but single replica) with no resource limits/requests.

I think we should definitely:

  1. Document the overrides for scaling these up (resources as a first pass, replicas/HPA as we further explore/support those).
  2. Identify any upstream guidance on sizing/scaling for each.
  3. As we gather more data from CI/staging environments, also document our own suggested sizing based on unique core needs.

@mjnagel mjnagel added this to the 0.27.0 milestone Aug 21, 2024