Monitoring resources #333

Merged
merged 65 commits into from
Jun 29, 2020
a1cc102
monitoring: 3scale operator service monitor for prometheus scrapping
eguzki Dec 16, 2019
bda927a
monitoring: new CR struct for monitoring
eguzki Mar 5, 2020
44da9af
monitoring: operator-sdk generation crds and openapi
eguzki Mar 5, 2020
9099a9a
go-bindata: embedding binary data
eguzki Mar 10, 2020
906f2bf
monitoring: grafana, servicemonitors, rules reconcilliation
eguzki Mar 5, 2020
25b55e6
monitoring: apicast monitoring basic setup
eguzki Mar 5, 2020
5e0e975
monitoring: system monitoring basic setup
eguzki Mar 10, 2020
fe540af
monitoring: common reconcile methods in base_apimanager_logic_reconciler
eguzki Mar 10, 2020
b943885
monitoring: asset generation
eguzki Mar 10, 2020
561e0b9
monitoring: zync monitoring basic setup
eguzki Mar 11, 2020
4509f60
fix unittests
eguzki Mar 11, 2020
3debe34
monitoring: backend enable internal metrics
eguzki Mar 11, 2020
1df2cfa
monitoring: backend monitoring basic setup
eguzki Mar 11, 2020
a61ae2c
monitoring: monitoring CRD permission roles
eguzki Mar 17, 2020
10f10d6
monitoring: update CSV
eguzki Mar 18, 2020
dc56271
update go.mod
eguzki Apr 1, 2020
17903a7
services suffix -metrics
eguzki Apr 1, 2020
d8a1e22
monitoring: enable backend listener metrics
eguzki Apr 1, 2020
1ce8faa
go.mod, go.sum: fix azure go-autorest causing ambiguous import
eguzki Apr 1, 2020
a98710b
monitoring: apisonator dashboard
eguzki Apr 1, 2020
4bf1aee
monitoring: kubernetes-resources-by-namespace dashboard
slopezz Apr 6, 2020
6369162
monitoring: kubernetes-resources-by-pod dashboard
slopezz Apr 6, 2020
1bb0a97
monitoring: system dashboard
slopezz Apr 6, 2020
b2ddc5f
monitoring: reconcile system dashboard
eguzki Apr 6, 2020
54ac28a
monitoring: reconcile kubernetes-resources dashboards
eguzki Apr 6, 2020
9a67989
monitoring: update assets
eguzki Apr 6, 2020
30baf10
monitoring: threescale-kube-state-metrics prometheus rules
eguzki Apr 6, 2020
3653fa6
monitoring: fix threescale-kube-state-metrics prometheusrules
eguzki Apr 7, 2020
496a937
rebase master
eguzki May 11, 2020
4a3812d
monitoring: check monitoring CRD are registered
eguzki May 13, 2020
550ff61
monitoring: unittests fixed
eguzki May 13, 2020
ee73f5f
monitoring: docs
eguzki May 21, 2020
c8d752f
template generation
eguzki Mar 11, 2020
dfa844e
monitoring: discovery client in base reconciler
eguzki May 23, 2020
f31430f
monitoring: update unittests
eguzki May 23, 2020
a1a6136
monitoring: apply common label options to monitoring resources
eguzki Jun 9, 2020
8d493b5
Add apicast main app dashboard
roivaz Jun 11, 2020
06da838
Add apicast services dashboard
roivaz Jun 11, 2020
d5f10a0
Dashboard fixes
roivaz Jun 11, 2020
1793360
apicast new dashboards embedded asset generation
eguzki Jun 15, 2020
76fd269
monitoring: add new licenses
eguzki Jun 15, 2020
9907f67
Makefile little fix
eguzki Jun 15, 2020
fcbee23
monitoring: enable apicast extended metrics
eguzki Jun 15, 2020
06ed17b
monitoring: cr example
eguzki Jun 16, 2020
cf93808
monitoring: apicasts rules
eguzki Jun 22, 2020
b474aad
Apply fixes for apicast and apicast services dashboard
roivaz Jun 22, 2020
aab506a
fix monitoring doc
eguzki Jun 23, 2020
53f3456
fix duplicated import
eguzki Jun 23, 2020
33b3dcc
remove todo comments
eguzki Jun 23, 2020
0ec722d
fix assets comment
eguzki Jun 23, 2020
99d4d5c
generate assets
eguzki Jun 23, 2020
767b8e0
apicast prometheus alert only on production gateway
eguzki Jun 23, 2020
6f3ff88
regenerate k8s resources
eguzki Jun 23, 2020
45c8a72
Makefile: assets target has install-tools as dep
eguzki Jun 23, 2020
41ce638
adapt backup and restore controller to the new base reconciler
eguzki Jun 23, 2020
f475592
Fix pod promql queries on apicast grafana dashboard
slopezz Jun 23, 2020
76658fc
Update backend grafana dashboard with latest SaaS dashboard version
slopezz Jun 23, 2020
e1f53d6
Update assets after updating dashboards
slopezz Jun 23, 2020
1a5a82f
monitoring changes for operator only
eguzki Jun 26, 2020
4c3bbb9
template generation
eguzki Jun 26, 2020
bbbd132
monitoring: upgrade path
eguzki Jun 26, 2020
ef240b5
fix unittests for new metrics options
eguzki Jun 26, 2020
84a4af8
monitoring resources enabled for operator, disabled for templates
eguzki Jun 26, 2020
775a0a6
fix unittests after rebase
eguzki Jun 26, 2020
c7ef61b
remove duplicated imports
eguzki Jun 29, 2020
14 changes: 12 additions & 2 deletions Makefile
Expand Up @@ -23,7 +23,7 @@ help: Makefile

## vendor: Populate vendor directory
vendor:
@GO111MODULE=on $(GO) mod vendor
$(GO) mod vendor

IMAGE ?= quay.io/3scale/3scale-operator
SOURCE_VERSION ?= master
Expand All @@ -39,6 +39,16 @@ download:
@echo Download go.mod dependencies
@go mod download

## install-tools: Installing tools from tools.go
install-tools: download
@echo Installing tools from tools.go
@cat tools.go | grep _ | awk -F'"' '{print $$2}' | xargs -tI % go install %

## assets: Generate embedded assets
assets: install-tools
@echo Generate Go embedded assets files by processing source
$(GO) generate github.com/3scale/3scale-operator/pkg/assets

## build: Build operator
build:
$(OPERATOR_SDK) build $(IMAGE):$(VERSION)
Expand Down Expand Up @@ -111,7 +121,7 @@ endif
cd $(PROJECT_PATH)/deploy/olm-catalog && operator-courier verify --ui_validate_io 3scale-operator-master/

## licenses.xml: Generate licenses.xml file
licenses.xml:
licenses.xml: $(DEPENDENCY_DECISION_FILE)
ifndef LICENSEFINDERBINARY
$(error "license-finder is not available please install: gem install license_finder --version 5.7.1")
endif
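The `install-tools` target above extracts import paths from `tools.go` with a `grep _ | awk` pipeline, which matches any line containing an underscore. As a rough illustration of what that pipeline is meant to select, here is a minimal Go sketch that lists only blank-identifier imports using the standard `go/parser` package. The function name `blankImports` and the sample source in `exampleTools` are hypothetical and not part of this PR.

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
	"strconv"
)

// blankImports returns the import paths that a Go source file pulls in
// with the blank identifier, e.g. `_ "example.com/tool/cmd/tool"`.
// Hypothetical helper, not part of the 3scale operator.
func blankImports(src string) ([]string, error) {
	fset := token.NewFileSet()
	// ImportsOnly stops parsing once the import declarations are read.
	file, err := parser.ParseFile(fset, "tools.go", src, parser.ImportsOnly)
	if err != nil {
		return nil, err
	}
	var paths []string
	for _, imp := range file.Imports {
		// Skip regular and aliased imports; keep only `_` imports.
		if imp.Name == nil || imp.Name.Name != "_" {
			continue
		}
		path, err := strconv.Unquote(imp.Path.Value)
		if err != nil {
			return nil, err
		}
		paths = append(paths, path)
	}
	return paths, nil
}

// exampleTools is an illustrative tools.go; the import path is made up.
const exampleTools = `// +build tools

package tools

import (
	_ "example.com/some/tool/cmd/tool"
)
`

func main() {
	paths, err := blankImports(exampleTools)
	if err != nil {
		panic(err)
	}
	for _, p := range paths {
		fmt.Println("go install", p)
	}
}
```

Parsing the file is stricter than the shell pipeline: comments or package names that happen to contain an underscore are not picked up.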
52 changes: 51 additions & 1 deletion cmd/manager/main.go
Expand Up @@ -14,10 +14,13 @@ import (

"github.com/3scale/3scale-operator/pkg/3scale/amp/product"
"github.com/3scale/3scale-operator/pkg/apis"
appsv1alpha1 "github.com/3scale/3scale-operator/pkg/apis/apps/v1alpha1"
"github.com/3scale/3scale-operator/pkg/common"
"github.com/3scale/3scale-operator/pkg/controller"
"github.com/3scale/3scale-operator/version"
"github.com/prometheus/client_golang/prometheus"

monitoringv1 "github.com/coreos/prometheus-operator/pkg/apis/monitoring/v1"
grafanav1alpha1 "github.com/integr8ly/grafana-operator/v3/pkg/apis/integreatly/v1alpha1"
appsv1 "github.com/openshift/api/apps/v1"
imagev1 "github.com/openshift/api/image/v1"
routev1 "github.com/openshift/api/route/v1"
Expand All @@ -27,10 +30,12 @@ import (
"github.com/operator-framework/operator-sdk/pkg/log/zap"
"github.com/operator-framework/operator-sdk/pkg/metrics"
sdkVersion "github.com/operator-framework/operator-sdk/version"
"github.com/prometheus/client_golang/prometheus"
"github.com/spf13/pflag"
v1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/runtime/schema"
"k8s.io/apimachinery/pkg/util/intstr"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/client/config"
logf "sigs.k8s.io/controller-runtime/pkg/log"
"sigs.k8s.io/controller-runtime/pkg/manager"
Expand Down Expand Up @@ -127,6 +132,18 @@ func main() {
os.Exit(1)
}

// Setup Scheme for all monitoring resources
if err := monitoringv1.AddToScheme(mgr.GetScheme()); err != nil {
log.Error(err, "")
os.Exit(1)
}

// Setup Scheme for all grafana resources
if err := grafanav1alpha1.AddToScheme(mgr.GetScheme()); err != nil {
log.Error(err, "")
os.Exit(1)
}

// Setup Scheme for all resources
if err := apis.AddToScheme(mgr.GetScheme()); err != nil {
log.Error(err, "")
Expand Down Expand Up @@ -191,6 +208,12 @@ func addMetrics(ctx context.Context, cfg *rest.Config, namespace string) {
log.Info("Could not create metrics Service", "error", err.Error())
}

// Adding the monitoring-key:middleware to the operator service which will get propagated to the serviceMonitor
err = addMonitoringKeyLabelToOperatorService(ctx, cfg, service)
if err != nil {
log.Error(err, "Could not add monitoring-key label to operator metrics Service")
}

// CreateServiceMonitors will automatically create the prometheus-operator ServiceMonitor resources
// necessary to configure Prometheus to scrape metrics from this operator.
services := []*v1.Service{service}
Expand Down Expand Up @@ -298,3 +321,30 @@ func filterGKVsFromAddToScheme(gvks []schema.GroupVersionKind) []schema.GroupVer

return ownGVKs
}

func addMonitoringKeyLabelToOperatorService(ctx context.Context, cfg *rest.Config, service *v1.Service) error {
	if service == nil {
		return fmt.Errorf("service doesn't exist")
	}

	kclient, err := client.New(cfg, client.Options{})
	if err != nil {
		return err
	}

	// Start from the monitoring labels, then copy the labels already set on
	// the Service; an existing value wins when a key is present in both.
	updatedLabels := map[string]string{
		"monitoring-key": common.MonitoringKey,
		"app":            appsv1alpha1.Default3scaleAppLabel,
	}
	for k, v := range service.ObjectMeta.Labels {
		updatedLabels[k] = v
	}
	service.ObjectMeta.Labels = updatedLabels

	err = kclient.Update(ctx, service)
	if err != nil {
		return err
	}

	return nil
}
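The label handling in `addMonitoringKeyLabelToOperatorService` seeds a map with the monitoring labels and then copies the Service's current labels over it, so a label already present on the Service keeps its value. A minimal standalone sketch of that precedence follows; `mergeLabels` is a hypothetical helper, and the `app` value is a placeholder rather than the real `Default3scaleAppLabel` value.

```go
package main

import "fmt"

// mergeLabels returns a new map holding defaults overlaid with existing.
// When a key is present in both maps, the value from existing is kept,
// mirroring the copy loop in addMonitoringKeyLabelToOperatorService.
// Hypothetical helper for illustration only.
func mergeLabels(defaults, existing map[string]string) map[string]string {
	merged := make(map[string]string, len(defaults)+len(existing))
	for k, v := range defaults {
		merged[k] = v
	}
	for k, v := range existing {
		merged[k] = v
	}
	return merged
}

func main() {
	defaults := map[string]string{
		"monitoring-key": "middleware",
		"app":            "placeholder-app", // assumed value, not the real constant
	}
	existing := map[string]string{
		"name": "3scale-operator",
		"app":  "already-set",
	}
	merged := mergeLabels(defaults, existing)
	fmt.Println(merged["monitoring-key"], merged["app"], merged["name"])
}
```

With this ordering the `monitoring-key` label is only added when the Service does not already carry one; swapping the two loops would make the monitoring labels authoritative instead.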
5 changes: 5 additions & 0 deletions deploy/crds/apps.3scale.net_apimanagers_crd.yaml
Expand Up @@ -4068,6 +4068,11 @@ spec:
type: object
imageStreamTagImportInsecure:
type: boolean
monitoring:
properties:
enabled:
type: boolean
type: object
podDisruptionBudget:
properties:
enabled:
@@ -0,0 +1,8 @@
apiVersion: apps.3scale.net/v1alpha1
kind: APIManager
metadata:
  name: example-apimanager-monitoring
spec:
  wildcardDomain: example.com
  monitoring:
    enabled: true
Expand Up @@ -62,6 +62,19 @@ metadata:
"wildcardDomain": "example.com"
}
},
{
"apiVersion": "apps.3scale.net/v1alpha1",
"kind": "APIManager",
"metadata": {
"name": "example-apimanager-monitoring"
},
"spec": {
"monitoring": {
"enabled": true
},
"wildcardDomain": "example.com"
}
},
{
"apiVersion": "apps.3scale.net/v1alpha1",
"kind": "APIManager",
Expand Down Expand Up @@ -520,13 +533,6 @@ spec:
- patch
- update
- watch
- apiGroups:
- monitoring.coreos.com
resources:
- servicemonitors
verbs:
- get
- create
- apiGroups:
- apps
resourceNames:
Expand Down Expand Up @@ -662,6 +668,27 @@ spec:
- update
- watch
- delete
- apiGroups:
- monitoring.coreos.com
resources:
- servicemonitors
- prometheusrules
verbs:
- list
- get
- create
- update
- watch
- apiGroups:
- integreatly.org
resources:
- grafanadashboards
verbs:
- get
- list
- create
- update
- watch
serviceAccountName: 3scale-operator
strategy: deployment
installModes:
28 changes: 21 additions & 7 deletions deploy/role.yaml
Expand Up @@ -41,13 +41,6 @@ rules:
- patch
- update
- watch
- apiGroups:
- monitoring.coreos.com
resources:
- servicemonitors
verbs:
- get
- create
- apiGroups:
- apps
resourceNames:
Expand Down Expand Up @@ -183,3 +176,24 @@ rules:
- update
- watch
- delete
- apiGroups:
- monitoring.coreos.com
resources:
- servicemonitors
- prometheusrules
verbs:
- list
- get
- create
- update
- watch
- apiGroups:
- integreatly.org
resources:
- grafanadashboards
verbs:
- get
- list
- create
- update
- watch
6 changes: 6 additions & 0 deletions doc/apimanager-reference.md
Expand Up @@ -32,6 +32,7 @@ This resource is the resource used to deploy a 3scale API Management solution.
| ZyncSpec | `zync` | \*ZyncSpec | No | See [ZyncSpec](#ZyncSpec) reference | Spec of the Zync part |
| HighAvailabilitySpec | `highAvailability` | \*HighAvailabilitySpec | No | See [HighAvailabilitySpec](#HighAvailabilitySpec) reference | Spec of the HighAvailability part |
| PodDisruptionBudgetSpec | `podDisruptionBudget` | \*PodDisruptionBudgetSpec | No | See [PodDisruptionBudgetSpec](#PodDisruptionBudgetSpec) reference | Spec of the PodDisruptionBudgetSpec part |
| MonitoringSpec | `monitoring` | \*MonitoringSpec | No | Disabled | [MonitoringSpec](#MonitoringSpec) reference |

#### ApicastSpec

Expand Down Expand Up @@ -279,6 +280,11 @@ pre-created by the user:
| --- | --- | --- | --- | --- | --- |
| Enabled | `enabled` | bool | No | `false` | Enable to automatically create [PodDisruptionBudgets](https://kubernetes.io/docs/concepts/workloads/pods/disruptions/) for components that can scale. Not including any of the databases or redis services.|

#### MonitoringSpec

| **Field** | **json/yaml field**| **Type** | **Required** | **Default value** | **Description** |
| --- | --- | --- | --- | --- | --- |
| Enabled | `enabled` | bool | No | `false` | [Enable to automatically create monitoring resources](operator-monitoring-resources.md) |

#### APIManagerStatus

15 changes: 14 additions & 1 deletion doc/dependency_decisions.yml
Expand Up @@ -179,7 +179,20 @@
- - :license
- github.com/evanphx/json-patch
- 3-clause BSD License
- :who:
- :who:
:why: 3-clause BSD License https://github.com/evanphx/json-patch/blob/master/LICENSE
:versions: []
:when: 2019-10-24 09:56:58.733603332 Z
- - :whitelist
- CC0 1.0 Universal
- :who: Jeff Kaufmann and Richard Fontana (Red Hat Legal)
:why:
:versions: []
:when: 2019-11-26 10:15:34.001520959 Z
- - :license
- github.com/go-bindata/go-bindata
- CC0 1.0 Universal
- :who:
:why: https://github.com/go-bindata/go-bindata/blob/master/LICENSE
:versions: []
:when: 2020-06-15 09:58:31.161619176 Z
51 changes: 51 additions & 0 deletions doc/operator-monitoring-resources.md
@@ -0,0 +1,51 @@
# 3scale Monitoring Resources

The 3scale monitoring resources are optionally installed when 3scale is installed on OpenShift using the 3scale operator.

## Prerequisites

* [prometheus-operator](https://github.com/coreos/prometheus-operator/tree/master/contrib/kube-prometheus#quickstart) needs to be deployed in the cluster.

The prometheus operator is an operator that creates, configures, and manages Prometheus clusters atop Kubernetes. It provides `ServiceMonitor` and `PrometheusRule` custom resources required by 3scale monitoring.

* [grafana-operator](https://github.com/integr8ly/grafana-operator) needs to be deployed in the cluster.

The grafana operator is an operator for creating and managing Grafana instances. It provides `GrafanaDashboard` custom resources required by 3scale monitoring.

## Enabling 3scale monitoring

3scale monitoring is disabled by default. It can be enabled by setting `monitoring.enabled` to `true` in the spec of the [APIManager CR](apimanager-reference.md).

```
apiVersion: apps.3scale.net/v1alpha1
kind: APIManager
metadata:
  name: apimanager1
spec:
  wildcardDomain: example.com
  monitoring:
    enabled: true
```
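Because `monitoring` is an optional pointer field (`*MonitoringSpec`) and `enabled` defaults to `false`, code that reads the CR has to treat a missing `monitoring` section the same as `enabled: false`. The following is a minimal sketch of that check; the struct definitions are reduced, assumed versions of the real API types, and `monitoringEnabled` and `parseSpec` are hypothetical helpers. JSON is used for the input only because `encoding/json` is in the standard library.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// MonitoringSpec is a reduced, assumed version of the CR type.
type MonitoringSpec struct {
	Enabled bool `json:"enabled,omitempty"`
}

// APIManagerSpec keeps only the fields needed for this sketch.
type APIManagerSpec struct {
	WildcardDomain string          `json:"wildcardDomain"`
	Monitoring     *MonitoringSpec `json:"monitoring,omitempty"`
}

// monitoringEnabled reports whether monitoring was explicitly enabled.
// A missing monitoring section counts as disabled.
func monitoringEnabled(spec APIManagerSpec) bool {
	return spec.Monitoring != nil && spec.Monitoring.Enabled
}

// parseSpec decodes a JSON-encoded spec (hypothetical helper).
func parseSpec(raw string) (APIManagerSpec, error) {
	var spec APIManagerSpec
	err := json.Unmarshal([]byte(raw), &spec)
	return spec, err
}

func main() {
	for _, raw := range []string{
		`{"wildcardDomain": "example.com"}`,
		`{"wildcardDomain": "example.com", "monitoring": {"enabled": true}}`,
	} {
		spec, err := parseSpec(raw)
		if err != nil {
			panic(err)
		}
		fmt.Println(spec.WildcardDomain, monitoringEnabled(spec))
	}
}
```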

## Monitored components

* Kubernetes resources at namespace level where 3scale is installed
* Apicast Staging
* Apicast Production
* 3scale Backend worker
* 3scale Backend listener
* System sidekiq
* Zync
* Zync-que

## Exposing monitoring resources

All monitoring resources created by the 3scale operator, i.e. `ServiceMonitor`, `PrometheusRule` and `GrafanaDashboard`, are labeled with:

```
monitoring-key: middleware
```

Make sure the Prometheus and Grafana services created by their respective operators are configured to select resources with that label.

Depending on the Prometheus and Grafana configuration, the namespace where 3scale is installed might require labels too. Check the configuration of your monitoring stack, i.e. the Prometheus and Grafana servers.
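
To make the label requirement concrete, here is a minimal Go sketch of equality-based label matching, which is the idea behind selecting resources labeled `monitoring-key: middleware`. It is a simplified illustration and not the selector implementation used by Prometheus, Grafana, or Kubernetes (real selectors also support set-based expressions); `matchesLabels` is a hypothetical helper.

```go
package main

import "fmt"

// matchesLabels reports whether every key/value pair in selector is
// present in labels. In this sketch an empty selector matches everything.
// Hypothetical helper for illustration only.
func matchesLabels(selector, labels map[string]string) bool {
	for key, want := range selector {
		if got, ok := labels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	selector := map[string]string{"monitoring-key": "middleware"}

	labeled := map[string]string{"monitoring-key": "middleware", "app": "example"}
	unlabeled := map[string]string{"app": "example"}

	fmt.Println(matchesLabels(selector, labeled))
	fmt.Println(matchesLabels(selector, unlabeled))
}
```

A resource that lacks the label, or carries it with a different value, is simply not selected, which is the usual reason dashboards or scrape targets fail to appear.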
1 change: 1 addition & 0 deletions doc/operator-user-guide.md
Expand Up @@ -12,6 +12,7 @@
* [PostgreSQL Installation](#postgresql-installation)
* [Enabling Pod Disruption Budgets](#enabling-pod-disruption-budgets)
* [Setting custom affinity and tolerations](#setting-custom-affinity-and-tolerations)
* [Enabling monitoring resources](operator-monitoring-resources.md)
* [Reconciliation](#reconciliation)
* [Upgrading 3scale](#upgrading-3scale)
* [Feature Operator (in *TechPreview*)](operator-capabilities.md)
4 changes: 4 additions & 0 deletions go.mod
Expand Up @@ -4,12 +4,16 @@ go 1.13

require (
github.com/3scale/3scale-porta-go-client v0.0.3
github.com/Azure/go-autorest v12.2.0+incompatible
github.com/RHsyseng/operator-utils v0.0.0-20200204194854-c5b0d8533458
github.com/coreos/prometheus-operator v0.34.0
github.com/ghodss/yaml v1.0.1-0.20190212211648-25d852aebe32
github.com/go-bindata/go-bindata v3.1.1+incompatible
github.com/go-logr/logr v0.1.0
github.com/go-openapi/spec v0.19.4
github.com/go-playground/validator/v10 v10.2.0
github.com/google/go-cmp v0.3.1
github.com/integr8ly/grafana-operator/v3 v3.1.0
github.com/luci/go-render v0.0.0-20160219211803-9a04cc21af0f
github.com/mitchellh/go-homedir v1.1.0
github.com/openshift/api v3.9.1-0.20190924102528-32369d4db2ad+incompatible