Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NETOBSERV-579 Use netobserv prefix, add 4 new metrics, remove one #173

Merged

Conversation

jotak
Copy link
Member

@jotak jotak commented Sep 27, 2022

New metrics: source/dest bytes traffic per node / per owner
Removed bandwidth_per_network_service_per_namespace (per namespace traffic can be obtained from the new metrics)
That also removed the per-service discrimination, but I'm not sure we actually need it. In case we do, we should introduce a new metric.

cc @memodi @Amoghrd

Detailed changes:

The 4 following metrics show traffic per source/dest workload/namespace, or node for those with "node".

  • Added netobserv_node_received_bytes_total
  • Added netobserv_node_sent_bytes_total
  • Added netobserv_received_bytes_total
  • Added netobserv_sent_bytes_total

They allow to more accurately follow traffic occuring in a given namespace, e.g.

sum(rate(netobserv_received_bytes_total{SrcK8S_Namespace="netobserv"}[1m]))
and
sum(rate(netobserv_received_bytes_total{DstK8S_Namespace="netobserv",SrcK8S_Namespace!="netobserv"}[1m]))

will cover all traffic occuring inside netobserv and from/to other namespaces/nodes

The "send" and "received" words here account for traffic going in or out from the observation point of view, it's basically the same thing as the "Reporter" field in the UI (or FlowDirection field) ; so they don't add up (because that would result in duplicated traffic).

Other changes are:

  • Removed netobserv_bandwidth_per_network_service_per_namespace
  • Added netobserv_ prefix to all FLP internal metrics
  • Changed some FLP internal metrics, cf details here
  • Added netobserv_ prefix to loki client metrics (instead of "promtail" previously), so that they are more easily identifiable

@openshift-ci
Copy link

openshift-ci bot commented Sep 27, 2022

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@memodi
Copy link
Contributor

memodi commented Sep 27, 2022

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Sep 27, 2022
@github-actions
Copy link

New image: ["quay.io/netobserv/network-observability-operator:8a696be"]. It will expire after two weeks.

Copy link
Contributor

@memodi memodi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested this with PR image, looks like below metrics doesn't have "netobserv" prefix in them which are for protobuf/protobuf1.0, are these still relevant? if so, should they have "netobserv" prefix in them.

flow_decoder_count{name="protobuf",worker=""} 442
flow_process_nf_delay_summary_seconds{router="",version="protobuf1.0",quantile="0.5"} 1.741961812
flow_process_nf_delay_summary_seconds{router="",version="protobuf1.0",quantile="0.9"} 4.353887055
flow_process_nf_delay_summary_seconds{router="",version="protobuf1.0",quantile="0.99"} 4.967831966
flow_process_nf_delay_summary_seconds_sum{router="",version="protobuf1.0"} 379294.89681290183
flow_process_nf_delay_summary_seconds_count{router="",version="protobuf1.0"} 185834
flow_traffic_summary_size_bytes{local_ip="0.0.0.0",local_port="2055",remote_ip="10.0.0.6",type="protobuf",quantile="0.5"} 37739
flow_traffic_summary_size_bytes{local_ip="0.0.0.0",local_port="2055",remote_ip="10.0.0.6",type="protobuf",quantile="0.9"} 42017
flow_traffic_summary_size_bytes{local_ip="0.0.0.0",local_port="2055",remote_ip="10.0.0.6",type="protobuf",quantile="0.99"} 44748
flow_traffic_summary_size_bytes_sum{local_ip="0.0.0.0",local_port="2055",remote_ip="10.0.0.6",type="protobuf"} 1.6958135e+07
flow_traffic_summary_size_bytes_count{local_ip="0.0.0.0",local_port="2055",remote_ip="10.0.0.6",type="protobuf"} 442

@jotak
Copy link
Member Author

jotak commented Sep 28, 2022

@memodi good point, these are metrics that we reuse from goflow2 (they are used natively by goflow2 in the context of IPFIX agent, and we mimic them in the context of the eBPF agent).
Since now we are focusing just on the ebpf agent, I think we can forget about keeping them identical to the goflow's ones, right? So we can be more consistent with the rest of our metrics. I'll update the PR

@jotak
Copy link
Member Author

jotak commented Sep 28, 2022

actually, they reside in FLP, I need to open a new PR for FLP, but there's nothing more to do on the operator (so this PR can still be merged)

New metrics: source/dest bytes traffic per node / per owner
@github-actions github-actions bot removed the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Sep 28, 2022
Copy link
Contributor

@OlivierCazade OlivierCazade left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Thanks!

@memodi
Copy link
Contributor

memodi commented Sep 28, 2022

/qe-approved

@jotak
Copy link
Member Author

jotak commented Sep 28, 2022

/approve

@openshift-ci
Copy link

openshift-ci bot commented Sep 28, 2022

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jotak

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 9a36d3c into netobserv:main Sep 28, 2022
KalmanMeth pushed a commit to KalmanMeth/network-observability-operator that referenced this pull request Feb 13, 2023
NETOBSERV-229 create pre-merge image on PR via action
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants