Prometheus-Kanister design doc - Fixing formatting issues (#2264)
* Fixed formatting issues

* Moved image to a more suitable location

* Revert "Moved image to a more suitable location"

This reverts commit a3988f5.

* Revert "Fixed formatting issues"

This reverts commit 1323c93.

* Added Pavan's patch file

* Revert "Revert "Moved image to a more suitable location""

This reverts commit d869461.

---------

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
mellon-collie and mergify[bot] committed Aug 14, 2023
1 parent c1e4639 commit 68ac433
Showing 2 changed files with 119 additions and 112 deletions.
File renamed without changes
231 changes: 119 additions & 112 deletions design/kanister-prometheus-integration.md
@@ -25,7 +25,7 @@ the default handler provides. Adding metrics to track the ActionSets and
Blueprints workflow will help improve the overall observability.

To achieve this, we need to build a framework for exporting metrics from the
Kanister controller, and to start with, export some metrics to Prometheus.
Kanister controller, and to start with, export some metrics to Prometheus.

This framework simplifies the common need for Prometheus counters to publish 0
values at startup for all permutations of labels and label values. This ensures
@@ -42,7 +42,7 @@ Phase duration, etc.
## Scope

1. Design a framework that allows us to export new Kanister metrics to
Prometheus easily.
Prometheus easily.
2. Add a few fundamental metrics related to ActionSets and Blueprints to start
with.

@@ -51,158 +51,157 @@ Phase duration, etc.

### Architecture

![Alt text](Prometheus_Metrics_Design.png?raw=true "Prometheus Integration Design")
![Alt text](images/prometheus-metrics-design.png?raw=true "Prometheus Integration Design")

#### Text description

1. The initializer of the consumer package calls newMetrics, a helper method
1. The initializer of the consumer package calls `newMetrics`, a helper method
that talks to Kanister’s metrics package. The result is a new metrics struct
that owns all the Prometheus metrics.

2. In order to initialize all the required Prometheus metrics, the new_metrics
method calls the InitCounterVec, InitGaugeVec, InitHistogramVec in the
metrics package. It passes the metric names and the specific label names
and label values as BoundedLabels to the metrics package. Once it
initializes all the Prometheus metrics successfully, it returns a struct that
wraps all the metrics and that the consumer package can then use.
2. In order to initialize all the required Prometheus metrics, the
`newMetrics` method calls `InitCounterVec`, `InitGaugeVec`, and
`InitHistogramVec` in the metrics package. It passes the metric names and
the specific label names and label values as `BoundedLabels` to the metrics
package. Once it initializes all the Prometheus metrics successfully, it
returns a struct that wraps all the metrics for the consumer package to use.

3. The metrics package internally initializes the Prometheus metrics and
registers them Prometheus. If the registration fails because the specific
metric with label header already exists, the metric will simply be returned
to the caller. If the registration fails due to other reasons, then the
metrics package will cause a panic, signaling programmer error.
In case of the CounterVec, the InitCounterVec function generates all
possible permutations of label values and initializes each counter
within the CounterVec with a value of 0.
3. The metrics package internally initializes the Prometheus metrics and
registers them with Prometheus. If the registration fails because the specific
metric with label header already exists, the metric will simply be returned
to the caller. If the registration fails due to other reasons, then the
metrics package will cause a panic, signaling programmer error. In case of
the `CounterVec`, the `InitCounterVec` function generates all possible
permutations of label values and initializes each counter within the
`CounterVec` with a value of 0.

4. Once the collector is created in the metrics package, it will be returned
to the consumer package’s newMetrics helper method.
4. Once the collector is created in the metrics package, it will be returned to
the consumer package’s `newMetrics` helper method.

5. Once the initialization of all Prometheus metrics are complete, a new
metrics struct will be returned to the consumer’s package initializer.

6. The consumer package may find it useful to implement a helper method that constructs
a prometheus.Labels mapping to access a specific counter from a CounterVec
and perform an increment operation.
5. Once the initialization of all Prometheus metrics is complete, a new
metrics struct will be returned to the consumer package's initializer.

6. The consumer package may find it useful to implement a helper method that
constructs a `prometheus.Labels` mapping to access a specific counter from a
`CounterVec` and perform an increment operation.


### APIs

#### Metrics Package

```golang
// BoundedLabel is a type that represents a label and its associated
// BoundedLabel is a type that represents a label and its associated
// valid values
type BoundedLabel struct {
LabelName string
LabelValues []string
}
```

An example of a BoundedLabel is in the scenario of ActionSet resolutions.
Suppose we want to track these resolutions across different blueprints,
we would create the bounded labels in the following way:
Consider ActionSet resolutions as an example use of `BoundedLabel`. To track
these resolutions across different blueprints, we would create the bounded
labels in the following way:

##### BoundedLabel example

```golang
BoundedLabel {
LabelName: "operation_type"
LabelValues: ["backup", "restore"]
BoundedLabel{
LabelName: "operation_type",
LabelValues: []string{
"backup",
"restore",
},
}

BoundedLabel {
LabelName: "action_set_resolution"
LabelValues: ["success", "failure"]
BoundedLabel{
LabelName: "action_set_resolution",
LabelValues: []string{
"success",
"failure",
},
}
```
```

##### Initialization methods

```golang
// InitCounterVec initializes and registers the counter metrics vector. It takes a list of
// BoundedLabel objects - if any label value or label name is nil, then this method will panic.
// Based on the combinations returned by generateCombinations, it will set each counter value to 0.
// If a nil counter is returned during registration, the method will
// panic

// InitCounterVec initializes and registers the counter metrics vector. It
// takes a list of BoundedLabel objects - if any label value or label name is
// nil, then this method will panic. Based on the combinations returned by
// generateCombinations, it will set each counter value to 0.
// If a nil counter is returned during registration, the method will panic.
func InitCounterVec(r prometheus.Registerer, opts prometheus.CounterOpts, boundedLabels []BoundedLabel) *prometheus.CounterVec

// InitGaugeVec initializes the gauge metrics vector. It takes a list of BoundedLabels, but the
// LabelValue field of each BoundedLabel will be ignored.
// If a nil counter is returned during registration, the method will
// panic
// InitGaugeVec initializes the gauge metrics vector. It takes a list of
// BoundedLabels, but the LabelValues field of each BoundedLabel will be
// ignored. If a nil gauge is returned during registration, the method will
// panic.
func InitGaugeVec(r prometheus.Registerer, opts prometheus.GaugeOpts, boundedLabels []BoundedLabel) *prometheus.GaugeVec

// InitHistogramVec initializes the histogram metrics vector. It takes a list of BoundedLabels, but the
// LabelValue field of each BoundedLabel will be ignored.
// If a nil counter is returned during registration, the method will
// panic
// InitHistogramVec initializes the histogram metrics vector. It takes a list
// of BoundedLabels, but the LabelValues field of each BoundedLabel will be
// ignored. If a nil histogram is returned during registration, the method will
// panic.
func InitHistogramVec(r prometheus.Registerer, opts prometheus.HistogramOpts, boundedLabels []BoundedLabel) *prometheus.HistogramVec

// InitCounter initializes a new counter.
// If a nil counter is returned during registration, the method will
// panic
// If a nil counter is returned during registration, the method will panic.
func InitCounter(r prometheus.Registerer, opts prometheus.CounterOpts) prometheus.Counter

// InitGauge initializes a new gauge.
// If a nil counter is returned during registration, the method will
// panic
// If a nil gauge is returned during registration, the method will panic.
func InitGauge(r prometheus.Registerer, opts prometheus.GaugeOpts) prometheus.Gauge

// InitHistogram initializes a new histogram.
// If a nil counter is returned during registration, the method will
// panic
// If a nil histogram is returned during registration, the method will panic.
func InitHistogram(r prometheus.Registerer, opts prometheus.HistogramOpts) prometheus.Histogram
```

##### Example Initialization Steps for a new CounterVec metric

1. Initialize a new CounterVec with relevant options and label names

2. Attempt to register the new CounterVec

a. If successful,
1. Initialize a new `CounterVec` with relevant options and label names.

i. Generate combinations of label names
2. Attempt to register the new `CounterVec`.
* If successful,
* Generate combinations of label values.
* Create counters for each combination and set the counter to 0.

ii. Create counters for each combination and set the counter
to 0.
* If not successful, check if the error is an `AlreadyRegisteredError`.
* If yes, return the `CounterVec` and ignore the error.
* If no, then panic, signalling programmer error.

b. If not successful, check if the error is an AlreadyRegisteredError

i. If yes, return the CounterVec and ignore the error

ii. If no, return a nil CounterVec and the received error.

3. If received a nil CounterVec from registration, interrupt with a panic,
because an interrupt would suggest a failure in the created
CounterVec, which should be fixed by the programmer.
3. If a `CounterVec` is returned from registration, the registration is
guaranteed to have succeeded.
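
The combination step in the list above can be sketched with the standard library alone. This is a minimal illustration, not the actual Kanister implementation: the name `generateCombinations` comes from the `InitCounterVec` comment, but its signature and return type here are assumptions. It builds the Cartesian product of label values, producing one label map per counter that would be initialized to 0.

```go
package main

import "fmt"

// BoundedLabel mirrors the type described in this document.
type BoundedLabel struct {
	LabelName   string
	LabelValues []string
}

// generateCombinations returns one label map per element of the Cartesian
// product of the label values. Hypothetical sketch: the signature and return
// type are assumptions made for illustration.
func generateCombinations(labels []BoundedLabel) []map[string]string {
	// Start with a single empty combination and extend it label by label.
	combos := []map[string]string{{}}
	for _, l := range labels {
		next := make([]map[string]string, 0, len(combos)*len(l.LabelValues))
		for _, c := range combos {
			for _, v := range l.LabelValues {
				// Copy the partial combination before adding the new label.
				m := make(map[string]string, len(c)+1)
				for k, val := range c {
					m[k] = val
				}
				m[l.LabelName] = v
				next = append(next, m)
			}
		}
		combos = next
	}
	return combos
}

func main() {
	labels := []BoundedLabel{
		{LabelName: "operation_type", LabelValues: []string{"backup", "restore"}},
		{LabelName: "action_set_resolution", LabelValues: []string{"success", "failure"}},
	}
	// Each printed pair corresponds to one counter that would start at 0.
	for _, c := range generateCombinations(labels) {
		fmt.Println(c["operation_type"], c["action_set_resolution"])
	}
}
```

With the two bounded labels from the earlier example, this yields four combinations, so four counters would be published with a value of 0 at startup. In this sketch a label with no values yields no combinations at all, whereas the design above has `InitCounterVec` panic on invalid labels.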

#### Consumer Package

The below example change will walk through how a consumer package
will be integrated with the metrics package:
The example below walks through how a consumer package integrates with the
metrics package:

Each consumer package in Kanister will have a main struct and a "metrics.go" file .
Each consumer package in Kanister will have a main struct and a `metrics.go`
file.

An example of this would be the controller package:
An example of this would be the controller package:

controller/controller.go

```golang
type Controller struct {
config *rest.Config
crClient versioned.Interface
clientset kubernetes.Interface
dynClient dynamic.Interface
osClient osversioned.Interface
recorder record.EventRecorder
actionSetTombMap sync.Map
metrics *metrics // add a new member to the existing struct
config *rest.Config
crClient versioned.Interface
clientset kubernetes.Interface
dynClient dynamic.Interface
osClient osversioned.Interface
recorder record.EventRecorder
actionSetTombMap sync.Map
metrics *metrics // add a new member to the existing struct
}
```

```golang
// New create controller for watching Kanister custom resources created
// New creates a controller for watching Kanister custom resources created.
func New(c *rest.Config) *Controller {
return &Controller{
config: c,
@@ -216,43 +215,51 @@ controller/metrics.go

```golang
const (
ACTION_SET_COUNTER_VEC_LABEL_RES = "resolution"
ACTION_SET_COUNTER_VEC_LABEL_OP_TYPE = "operation_type"
ACTION_SET_COUNTER_VEC_LABEL_RES = "resolution"
ACTION_SET_COUNTER_VEC_LABEL_OP_TYPE = "operation_type"
)

type metrics struct {
ActionSetCounterVec *prometheus.CounterVec
ActionSetCounterVec *prometheus.CounterVec
}

// helper method to construct the correct "LabelHeaders":"LabelValues" mapping
// to ensure type safety
func getActionSetCounterVecLabels() []kanistermetrics.BoundedLabels {
bl := make([]kanistermetrics.BoundedLabel, 2)
bl[0] = kanistermetrics.BoundedLabel{LabelName: ACTION_SET_COUNTER_VEC_LABEL_RES,
LabelValues: []string{"success", "failure"}}
bl[1] = kanistermetrics.BoundedLabel{LabelName:
ACTION_SET_COUNTER_VEC_LABEL_BLUEPRINT,
LabelValues: []string{"backup", "restore"}}
return bl
// getActionSetCounterVecLabels is a helper method to construct the correct
// "LabelHeaders":"LabelValues" mapping to ensure type safety.
func getActionSetCounterVecLabels() []kanistermetrics.BoundedLabel {
bl := make([]kanistermetrics.BoundedLabel, 2)
bl[0] = kanistermetrics.BoundedLabel{
LabelName: ACTION_SET_COUNTER_VEC_LABEL_RES,
LabelValues: []string{"success", "failure"},
}
bl[1] = kanistermetrics.BoundedLabel{
LabelName: ACTION_SET_COUNTER_VEC_LABEL_OP_TYPE,
LabelValues: []string{"backup", "restore"},
}
return bl
}


// constructActionSetCounterVecLabels is a helper method to construct the
// labels correctly.
func constructActionSetCounterVecLabels(operation_type string, resolution string) prometheus.Labels {
return prometheus.Labels{ACTION_SET_COUNTER_VEC_LABEL_OP_TYPE: operation_type,
ACTION_SET_COUNTER_VEC_LABEL_RES: resolution}
return prometheus.Labels{
ACTION_SET_COUNTER_VEC_LABEL_OP_TYPE: operation_type,
ACTION_SET_COUNTER_VEC_LABEL_RES: resolution,
}
}

// newMetrics is a helper method to create the metrics struct. It takes a
// prometheus.Registerer because InitCounterVec registers the new metrics.
func newMetrics(registerer prometheus.Registerer) *metrics {
actionSetCounterOpts := prometheus.CounterOpts{
Name: "action_set_resolutions_total",
Help: "Total number of action set resolutions",
}
actionSetCounterVec := kanistermetrics.InitCounterVec(registerer,
actionSetCounterOpts, getActionSetCounterVecLabels())
return &metrics{ActionSetCounterVec: actionSetCounterVec}
actionSetCounterOpts := prometheus.CounterOpts{
Name: "action_set_resolutions_total",
Help: "Total number of action set resolutions",
}
actionSetCounterVec := kanistermetrics.InitCounterVec(
registerer,
actionSetCounterOpts,
getActionSetCounterVecLabels(),
)
return &metrics{ActionSetCounterVec: actionSetCounterVec}
}
```

@@ -261,7 +268,7 @@ be incremented in a method:

```golang
func (c *Controller) handleActionSet(ctx context.Context) {
c.metrics.ActionSetCounterVec.With(constructActionSetCounterVecLabels("backup", "success")).Inc()
c.metrics.ActionSetCounterVec.With(constructActionSetCounterVecLabels("backup", "success")).Inc()
}
```

@@ -270,7 +277,7 @@ arguments:

```golang
func (c *Controller) handleActionSet(ctx context.Context) {
c.metrics.ActionSetCounterVec.WithLabelValues("backup", "success").Inc()
c.metrics.ActionSetCounterVec.WithLabelValues("backup", "success").Inc()
}
```

@@ -289,4 +296,4 @@ func (c *Controller) handleActionSet(ctx context.Context) {
package in Prometheus.

3. Integration tests will be added for code that exports new metrics, to ensure
that the behavior of exporting metrics is correct.
that the behavior of exporting metrics is correct.
