helm: meta-monitoring #2068
Good job! I was able to send metrics from a local Mimir to my Grafana Cloud account, but I found a few rough edges along the way. Can you take a look?
operations/helm/charts/mimir-distributed/templates/metamonitoring/_helpers.tpl
metaMonitoring:
  grafanaAgent:
    # -- Controls whether to create PodLogs, MetricsInstance, LogsInstance, and GrafanaAgent CRs to scrape the
    # ServiceMonitors of the chart and ship metrics and logs to the remote endpoints below.
It doesn't work with the defaults. And yes, the ServiceMonitors need to be enabled. I will clarify this in the docs.
ca9e934 to 801d0d7
@dimitarvdimitrov this PR was very inspiring, I just put up something similar for Loki: grafana/helm-charts#1514
I realized there is no ServiceMonitor for the overrides exporter. I will add one.
#2125 adds a new helper for ServiceMonitors. I will wait for that to be merged before adding one here. It looks to be trivial.
5e2a8a4 to d02efbc
d02efbc to 0f867d5
operations/helm/charts/mimir-distributed/templates/metamonitoring/grafana-agent.yaml
LGTM. My remaining comments are mostly nitpicks that we can deal with separately if you want.
This time I was able to send metrics and logs to Grafana Cloud, and was able to send metrics locally to Mimir, so it appears everything is working.
I found one last UX bug.
operations/helm/charts/mimir-distributed/templates/metamonitoring/_helpers.tpl
operations/helm/charts/mimir-distributed/templates/validate.yaml
Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
dbb7977 to 2a67ba3
* main: (63 commits)
  Add new section on website for links to blog posts, podcasts and talks. (grafana#2216)
  Rename codified errors to errors catalog (grafana#2256)
  Helm: add a step to contributing doc (grafana#2257)
  Signal that 2.2 release is now in progress. (grafana#2254)
  Removed migration of alertmanager local state files from old hierarchy (Cortex 1.8 and earlier) (grafana#2253)
  operations/mimir: Change multi_zone_ingester_max_unavailable to 25 (grafana#2251)
  Helm: weekly release (grafana#2252)
  Jsonnet: Configure ingester max global metadata per user and per metric (grafana#2250)
  Helm: metamonitor naming (grafana#2236)
  Mimir documentation about out-of-order (grafana#2183)
  Vendor latest mimir-prometheus/main (grafana#2243)
  Set CODEOWNERS to primary technical writer (grafana#2242)
  Use BasicLifecycler for distributors and auto-forget (grafana#2154)
  Docs: Basic documentation for deploying the ruler using jsonnet. (grafana#2127)
  Fix post merge reviews on 2187 (grafana#2230)
  Add tests for user metadata in the ingester (grafana#2184)
  Change the error message template for per-tenant limits (grafana#2234)
  helm: meta-monitoring (grafana#2068)
  Article about migrating from Consul to memberlist. Added documentation for /memberlist endpoint. (grafana#2166)
  Update runbooks to mention possibility to investigate memberlist KV store in various alerts (grafana#2158)
  ...
* Add meta-monitoring Signed-off-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>
What this PR does
Aims to make it easier to monitor Mimir/GEM when deployed via the Helm chart. The idea is to streamline the process so that an operator spends minimal time configuring scraping and relabelling.
This PR tries to achieve this by vendoring the Grafana Agent Operator Helm chart. The mimir-distributed chart creates custom resources that in turn create two Grafana Agents: one that scrapes metrics and one that collects logs. Optionally, the mimir-distributed chart can also create resources that scrape the relevant metrics from cadvisor, kubelet, and kube-state-metrics, because these are used in alerts and dashboards.
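To make the description above concrete, here is a minimal sketch of what a values override might look like. Only the `metaMonitoring` and `grafanaAgent` keys appear in the review snippet on this page; every other key name and both URLs are assumptions made for illustration and should be checked against the chart's `values.yaml` for the version in use.

```yaml
# Hypothetical values override for the mimir-distributed chart.
# Only `metaMonitoring.grafanaAgent` is confirmed by this PR's review snippet;
# all other keys and the endpoint URLs below are illustrative assumptions.
metaMonitoring:
  # Assumed toggle: the review notes that the ServiceMonitors must be enabled,
  # because the agent scrapes the chart's ServiceMonitors.
  serviceMonitor:
    enabled: true
  grafanaAgent:
    # Assumed toggle for creating the PodLogs, MetricsInstance, LogsInstance,
    # and GrafanaAgent custom resources.
    enabled: true
    metrics:
      remote:
        # Placeholder remote-write endpoint for metrics.
        url: https://metrics.example.com/api/v1/push
    logs:
      remote:
        # Placeholder push endpoint for logs.
        url: https://logs.example.com/loki/api/v1/push
```

As the review discussion notes, the feature did not work with the defaults at the time, so both the ServiceMonitors and the agent resources would need to be switched on explicitly.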
Which issue(s) this PR fixes or relates to
Fixes #2014
Checklist
CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]