NETOBSERV-1688: relax ebpf drop alert (#674)
* NETOBSERV-1688: relax ebpf drop alert

Reformulate some of the text for the eBPF drop alert, and
relax the threshold a little.

* Use const for alert threshold; remove unused test

On removing the test: agent-metrics-test.go never ran; it should have
been named "agent_metrics_test.go" to run (_ instead of -). When
renamed, it actually fails. Given how little value these tests
add, I'm removing them rather than spending time fixing them. (They
only check that the names of the created objects match the expected
constants.)

The controller integration tests cover more and are much more relevant
(and they run)
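
The file-naming point above can be illustrated with a short sketch. `go test` only treats files whose names end in `_test.go` as test files, so a hyphenated name such as `agent-metrics-test.go` is never run as a test. The helper `isGoTestFile` is hypothetical and only mirrors that suffix rule; it is not part of the go tool or of this repository, and it ignores the go tool's other exclusions (for example, files whose names begin with `_` or `.`).

```go
package main

import (
	"fmt"
	"strings"
)

// isGoTestFile is a hypothetical helper that reports whether a file name
// ends in "_test.go", the suffix "go test" requires for test files.
func isGoTestFile(name string) bool {
	return strings.HasSuffix(name, "_test.go")
}

func main() {
	// The first name is the deleted file; the second is the name it
	// would have needed in order to run.
	for _, name := range []string{"agent-metrics-test.go", "agent_metrics_test.go"} {
		fmt.Printf("%s: %v\n", name, isGoTestFile(name))
	}
}
```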
jotak committed Jun 11, 2024
1 parent eb3ed71 commit e1c9f3c
Showing 9 changed files with 27 additions and 82 deletions.
4 changes: 2 additions & 2 deletions apis/flowcollector/v1beta1/flowcollector_types.go
@@ -162,7 +162,7 @@ const (

 // Name of an eBPF agent alert.
 // Possible values are:<br>
-// `NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter being triggered.<br>
+// `NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter being triggered.<br>
 // +kubebuilder:validation:Enum:="NetObservDroppedFlows"
 type EBPFAgentAlert string

@@ -182,7 +182,7 @@ type EBPFMetrics struct {

 // `disableAlerts` is a list of alerts that should be disabled.
 // Possible values are:<br>
-// `NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter being triggered.<br>
+// `NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter being triggered.<br>
 // +optional
 DisableAlerts []EBPFAgentAlert `json:"disableAlerts"`
 }
4 changes: 2 additions & 2 deletions apis/flowcollector/v1beta2/flowcollector_types.go
@@ -167,7 +167,7 @@ const (

 // Name of an eBPF agent alert.
 // Possible values are:<br>
-// `NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter is being triggered.<br>
+// `NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter is being triggered.<br>
 // +kubebuilder:validation:Enum:="NetObservDroppedFlows"
 type EBPFAgentAlert string

@@ -187,7 +187,7 @@ type EBPFMetrics struct {

 // `disableAlerts` is a list of alerts that should be disabled.
 // Possible values are:<br>
-// `NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter is being triggered.<br>
+// `NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter is being triggered.<br>
 // +optional
 DisableAlerts []EBPFAgentAlert `json:"disableAlerts"`
 }
8 changes: 4 additions & 4 deletions bundle/manifests/flows.netobserv.io_flowcollectors.yaml
@@ -268,12 +268,12 @@ spec:
 description: |-
 `disableAlerts` is a list of alerts that should be disabled.
 Possible values are:<br>
-`NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter being triggered.<br>
+`NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter being triggered.<br>
 items:
 description: |-
 Name of an eBPF agent alert.
 Possible values are:<br>
-`NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter being triggered.<br>
+`NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter being triggered.<br>
 enum:
 - NetObservDroppedFlows
 type: string
@@ -3785,12 +3785,12 @@ spec:
 description: |-
 `disableAlerts` is a list of alerts that should be disabled.
 Possible values are:<br>
-`NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter is being triggered.<br>
+`NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter is being triggered.<br>
 items:
 description: |-
 Name of an eBPF agent alert.
 Possible values are:<br>
-`NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter is being triggered.<br>
+`NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter is being triggered.<br>
 enum:
 - NetObservDroppedFlows
 type: string
8 changes: 4 additions & 4 deletions config/crd/bases/flows.netobserv.io_flowcollectors.yaml
@@ -241,12 +241,12 @@ spec:
 description: |-
 `disableAlerts` is a list of alerts that should be disabled.
 Possible values are:<br>
-`NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter being triggered.<br>
+`NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter being triggered.<br>
 items:
 description: |-
 Name of an eBPF agent alert.
 Possible values are:<br>
-`NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter being triggered.<br>
+`NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter being triggered.<br>
 enum:
 - NetObservDroppedFlows
 type: string
@@ -3479,12 +3479,12 @@ spec:
 description: |-
 `disableAlerts` is a list of alerts that should be disabled.
 Possible values are:<br>
-`NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter is being triggered.<br>
+`NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter is being triggered.<br>
 items:
 description: |-
 Name of an eBPF agent alert.
 Possible values are:<br>
-`NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter is being triggered.<br>
+`NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter is being triggered.<br>
 enum:
 - NetObservDroppedFlows
 type: string
56 changes: 0 additions & 56 deletions controllers/ebpf/agent-metrics-test.go

This file was deleted.

17 changes: 9 additions & 8 deletions controllers/ebpf/agent_controller.go
@@ -79,14 +79,15 @@ const (
 )

 const (
-	exportKafka        = "kafka"
-	exportGRPC         = "grpc"
-	kafkaCerts         = "kafka-certs"
-	averageMessageSize = 100
-	bpfTraceMountName  = "bpf-kernel-debug"
-	bpfTraceMountPath  = "/sys/kernel/debug"
-	bpfNetNSMountName  = "var-run-netns"
-	bpfNetNSMountPath  = "/var/run/netns"
+	exportKafka                = "kafka"
+	exportGRPC                 = "grpc"
+	kafkaCerts                 = "kafka-certs"
+	averageMessageSize         = 100
+	bpfTraceMountName          = "bpf-kernel-debug"
+	bpfTraceMountPath          = "/sys/kernel/debug"
+	bpfNetNSMountName          = "var-run-netns"
+	bpfNetNSMountPath          = "/var/run/netns"
+	droppedFlowsAlertThreshold = 100
 )

const (
@@ -124,10 +124,10 @@ func (c *AgentController) agentPrometheusRule(target *flowslatest.FlowCollectorE
 rules = append(rules, monitoringv1.Rule{
 Alert: string(flowslatest.AlertDroppedFlows),
 Annotations: map[string]string{
-"description": "NetObserv eBPF agent is not able to process new flows. Possible reasons are the BPF hashmap being full, or the capacity limiter being triggered. Both causes can be worked around by increasing cacheMaxFlows value in Flowcollector resource.",
-"summary": "NetObserv eBPF is not able to process any new flows",
+"description": "NetObserv eBPF agent is missing packets or flows. The metric netobserv_agent_dropped_flows_total provides more information on the cause. Possible reasons are the BPF hashmap being busy or full, or the capacity limiter being triggered. This may be worked around by increasing cacheMaxFlows value in Flowcollector resource.",
+"summary": "NetObserv eBPF agent is missing packets or flows",
 },
-Expr: intstr.FromString("sum(rate(netobserv_agent_dropped_flows_total[1m])) > 0"),
+Expr: intstr.FromString(fmt.Sprintf("sum(rate(netobserv_agent_dropped_flows_total[1m])) > %d", droppedFlowsAlertThreshold)),
 For: &d,
 Labels: map[string]string{
 "severity": "warning",
Expand Down
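
As a rough illustration of the change above, the alert expression is now built from a constant instead of a hard-coded `0`. The sketch below reproduces only the string formatting; `droppedFlowsExpr` is a hypothetical helper introduced for this example and does not exist in the repository, while the metric name and the threshold value of 100 are taken from the diff.

```go
package main

import "fmt"

// droppedFlowsAlertThreshold mirrors the constant added in agent_controller.go.
const droppedFlowsAlertThreshold = 100

// droppedFlowsExpr is a hypothetical helper that formats the PromQL
// expression used by the NetObservDroppedFlows alert for a given threshold.
func droppedFlowsExpr(threshold int) string {
	return fmt.Sprintf("sum(rate(netobserv_agent_dropped_flows_total[1m])) > %d", threshold)
}

func main() {
	// Prints: sum(rate(netobserv_agent_dropped_flows_total[1m])) > 100
	fmt.Println(droppedFlowsExpr(droppedFlowsAlertThreshold))
}
```

With this expression, the alert condition holds only when the summed per-second rate of dropped flows, averaged over one minute, exceeds 100, rather than on any non-zero rate as before.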
4 changes: 2 additions & 2 deletions docs/FlowCollector.md
@@ -546,7 +546,7 @@ To filter a range of ports, use a "start-end" range, string format. For example
 <td>
 `disableAlerts` is a list of alerts that should be disabled.
 Possible values are:<br>
-`NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter being triggered.<br><br/>
+`NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter being triggered.<br><br/>
 </td>
 <td>false</td>
 </tr><tr>
@@ -7939,7 +7939,7 @@ To filter a range of ports, use a "start-end" range in string format. For exampl
 <td>
 `disableAlerts` is a list of alerts that should be disabled.
 Possible values are:<br>
-`NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter is being triggered.<br><br/>
+`NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter is being triggered.<br><br/>
 </td>
 <td>false</td>
 </tr><tr>
2 changes: 1 addition & 1 deletion docs/flowcollector-flows-netobserv-io-v1beta2.adoc
@@ -441,7 +441,7 @@ Type::
 | `disableAlerts` is a list of alerts that should be disabled.
 Possible values are: +

-`NetObservDroppedFlows`, which is triggered when the eBPF agent is dropping flows, such as when the BPF hashmap is full or the capacity limiter is being triggered. +
+`NetObservDroppedFlows`, which is triggered when the eBPF agent is missing packets or flows, such as when the BPF hashmap is busy or full, or the capacity limiter is being triggered. +


 | `enable`