-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
NETOBSERV-579 Use netobserv prefix, add 4 new metrics, remove one #173
NETOBSERV-579 Use netobserv prefix, add 4 new metrics, remove one #173
Conversation
Skipping CI for Draft Pull Request. |
dcfd221
to
2340c2b
Compare
/ok-to-test |
New image: ["quay.io/netobserv/network-observability-operator:8a696be"]. It will expire after two weeks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested this with PR image, looks like below metrics doesn't have "netobserv" prefix in them which are for protobuf/protobuf1.0, are these still relevant? if so, should they have "netobserv" prefix in them.
flow_decoder_count{name="protobuf",worker=""} 442
flow_process_nf_delay_summary_seconds{router="",version="protobuf1.0",quantile="0.5"} 1.741961812
flow_process_nf_delay_summary_seconds{router="",version="protobuf1.0",quantile="0.9"} 4.353887055
flow_process_nf_delay_summary_seconds{router="",version="protobuf1.0",quantile="0.99"} 4.967831966
flow_process_nf_delay_summary_seconds_sum{router="",version="protobuf1.0"} 379294.89681290183
flow_process_nf_delay_summary_seconds_count{router="",version="protobuf1.0"} 185834
flow_traffic_summary_size_bytes{local_ip="0.0.0.0",local_port="2055",remote_ip="10.0.0.6",type="protobuf",quantile="0.5"} 37739
flow_traffic_summary_size_bytes{local_ip="0.0.0.0",local_port="2055",remote_ip="10.0.0.6",type="protobuf",quantile="0.9"} 42017
flow_traffic_summary_size_bytes{local_ip="0.0.0.0",local_port="2055",remote_ip="10.0.0.6",type="protobuf",quantile="0.99"} 44748
flow_traffic_summary_size_bytes_sum{local_ip="0.0.0.0",local_port="2055",remote_ip="10.0.0.6",type="protobuf"} 1.6958135e+07
flow_traffic_summary_size_bytes_count{local_ip="0.0.0.0",local_port="2055",remote_ip="10.0.0.6",type="protobuf"} 442
@memodi good point, these are metrics that we reuse from goflow2 (they are used natively by goflow2 in the context of IPFIX agent, and we mimic them in the context of the eBPF agent). |
actually, they reside in FLP, I need to open a new PR for FLP, but there's nothing more to do on the operator (so this PR can still be merged) |
New metrics: source/dest bytes traffic per node / per owner
2340c2b
to
2e66f98
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM Thanks!
/qe-approved |
/approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jotak The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
NETOBSERV-229 create pre-merge image on PR via action
New metrics: source/dest bytes traffic per node / per owner
Removed
bandwidth_per_network_service_per_namespace
(per namespace traffic can be obtained from the new metrics)That also removed the per-service discrimination, but I'm not sure we actually need it. In case we do, we should introduce a new metric.
cc @memodi @Amoghrd
Detailed changes:
The 4 following metrics show traffic per source/dest workload/namespace, or node for those with "node".
They allow to more accurately follow traffic occuring in a given namespace, e.g.
sum(rate(netobserv_received_bytes_total{SrcK8S_Namespace="netobserv"}[1m]))
and
sum(rate(netobserv_received_bytes_total{DstK8S_Namespace="netobserv",SrcK8S_Namespace!="netobserv"}[1m]))
will cover all traffic occuring inside netobserv and from/to other namespaces/nodes
The "send" and "received" words here account for traffic going in or out from the observation point of view, it's basically the same thing as the "Reporter" field in the UI (or FlowDirection field) ; so they don't add up (because that would result in duplicated traffic).
Other changes are: