L3 231/add monitoring #49

Merged
msterle merged 11 commits into main from L3-231/add-monitoring
Apr 22, 2024
Conversation

@msterle (Contributor) commented Apr 5, 2024

PR 1/2 for L3-231

This PR adds monitoring to the kind cluster. The production cluster will be addressed in a follow-up PR.

  • Added kube-prometheus-stack chart for monitoring.
  • Added ServiceMonitors for:
    • sqnc-node (for alice, bob and charlie)
    • sqnc-identity-service postgresql (for alice, bob and charlie)
    • sqnc-matchmaker-api postgresql (for alice, bob and charlie)
    • flux components
    • self-monitoring of monitoring components
  • Added Grafana dashboards for:
    • alertmanager
    • flux cluster stats (unsure if working, pulled from flux documentation)
    • flux control plane (pulled from flux documentation)
    • Flux2 (from grafana.com)
    • nginx (from grafana.com)
    • polkadot (from grafana.com)
    • postgresql (from grafana.com)

Notes:

  • The flux cluster stats dashboard is currently broken due to a bug in the kube-state-metrics chart. The bug has been resolved, but the fix has not yet been pulled into kube-prometheus-stack.
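
For illustration, one of the ServiceMonitors listed above might look roughly like the following. This is a minimal sketch and not the manifest from this PR: the resource name, namespaces, the `release` label, the Service label, and the port name are all assumptions.

```yaml
# Hypothetical ServiceMonitor for the alice sqnc-node.
# Every name, label, and port below is an assumption for illustration.
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: sqnc-node-alice                 # hypothetical name
  namespace: monitoring                 # assumed namespace of kube-prometheus-stack
  labels:
    release: kube-prometheus-stack      # assumed to match the Prometheus ServiceMonitor selector
spec:
  namespaceSelector:
    matchNames:
      - alice                           # assumed namespace of the alice release
  selector:
    matchLabels:
      app.kubernetes.io/name: sqnc-node # assumed label on the node's Service
  endpoints:
    - port: prometheus                  # assumed name of the metrics port on the Service
      path: /metrics
      interval: 30s
```

The bob and charlie variants would differ only in name and namespace under the same assumptions.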

@dblane-digicatapult (Contributor) left a comment

This looks good. We should ensure that the alice, bob, charlie and nginx/infra sync kustomizations depend on the monitoring sync kustomization completing first, as it introduces the CRDs that are subsequently used in those kustomizations/helm charts.
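
A minimal sketch of the ordering described above, using Flux's `dependsOn` field on the `Kustomization` resource. The Kustomization names, the `alice` path, the intervals, and the source name are assumptions, not taken from this PR; only the `clusters/kind-cluster/monitoring` directory appears in the changed files.

```yaml
# Monitoring is reconciled first; it installs the Prometheus Operator CRDs.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: monitoring-sync                 # hypothetical name
  namespace: flux-system
spec:
  interval: 10m                         # assumed interval
  path: ./clusters/kind-cluster/monitoring
  prune: true
  wait: true                            # report Ready only once the applied resources are healthy
  sourceRef:
    kind: GitRepository
    name: flux-system                   # assumed source name
---
# A dependent sync; the same dependsOn entry would be repeated for bob, charlie and nginx.
apiVersion: kustomize.toolkit.fluxcd.io/v1
kind: Kustomization
metadata:
  name: alice-sync                      # hypothetical name
  namespace: flux-system
spec:
  dependsOn:
    - name: monitoring-sync             # not applied until monitoring-sync is Ready
  interval: 10m
  path: ./clusters/kind-cluster/alice   # assumed path
  prune: true
  sourceRef:
    kind: GitRepository
    name: flux-system
```

With `wait: true` on the monitoring Kustomization, dependents are held back until the chart's resources report healthy, which should mean the ServiceMonitor CRD exists before any manifest that uses it is applied.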

scripts/flux-resume-all.sh (outdated, resolved)
clusters/kind-cluster/base/app-sync.yaml (outdated, resolved)
clusters/kind-cluster/base/namespaces.yaml (outdated, resolved)
clusters/kind-cluster/monitoring/release.yaml (outdated, resolved)
clusters/kind-cluster/monitoring/release.yaml (outdated, resolved)
clusters/kind-cluster/monitoring/release.yaml (resolved)
clusters/kind-cluster/monitoring/source.yaml (outdated, resolved)
clusters/kind-cluster/nginx/release.yaml (outdated, resolved)
scripts/flux-suspend-all.sh (outdated, resolved)
Co-authored-by: David Blane <32327139+dblane-digicatapult@users.noreply.github.com>
@msterle msterle merged commit 0556b4a into main Apr 22, 2024
@msterle msterle deleted the L3-231/add-monitoring branch April 22, 2024 10:53
2 participants