NETOBSERV-1110: Enable support for Flow RTT #394

Merged
merged 9 commits into from
Sep 5, 2023

Conversation

dushyantbehl
Contributor

@dushyantbehl dushyantbehl commented Jul 12, 2023

@dushyantbehl dushyantbehl added the enhancement and breaking-change labels Jul 12, 2023
@openshift-ci-robot
Collaborator

openshift-ci-robot commented Jul 12, 2023

@dushyantbehl: This pull request references NETOBSERV-1110 which is a valid jira issue.


@dushyantbehl
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test label Jul 12, 2023
@github-actions

New images:

  • quay.io/netobserv/network-observability-operator:92a3fee
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-92a3fee
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-92a3fee

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:92a3fee make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-92a3fee

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-92a3fee
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

@codecov

codecov bot commented Jul 12, 2023

Codecov Report

Patch coverage: 17.39% and project coverage change: +0.32% 🎉

Comparison is base (3e5454d) 55.49% compared to head (925fc34) 55.82%.
Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #394      +/-   ##
==========================================
+ Coverage   55.49%   55.82%   +0.32%     
==========================================
  Files          45       45              
  Lines        5874     5888      +14     
==========================================
+ Hits         3260     3287      +27     
+ Misses       2393     2380      -13     
  Partials      221      221              
Flag Coverage Δ
unittests 55.82% <17.39%> (+0.32%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
api/v1alpha1/flowcollector_webhook.go 0.00% <0.00%> (ø)
api/v1alpha1/zz_generated.conversion.go 0.25% <ø> (ø)
api/v1beta1/flowcollector_types.go 100.00% <ø> (ø)
api/v1beta1/zz_generated.deepcopy.go 53.79% <0.00%> (-1.57%) ⬇️
controllers/consoleplugin/consoleplugin_objects.go 94.61% <0.00%> (-0.82%) ⬇️
controllers/ebpf/agent_controller.go 69.59% <0.00%> (-1.11%) ⬇️
...ntrollers/ebpf/internal/permissions/permissions.go 45.50% <0.00%> (+1.32%) ⬆️
controllers/flowlogspipeline/flp_common_objects.go 80.88% <0.00%> (-0.80%) ⬇️
pkg/helper/flowcollector.go 65.97% <57.14%> (-0.69%) ⬇️

... and 5 files with indirect coverage changes


@github-actions github-actions bot removed the ok-to-test label Jul 13, 2023
@jpinsonneau jpinsonneau added the ok-to-test label Jul 13, 2023
@github-actions

New images:

  • quay.io/netobserv/network-observability-operator:e44e1b3
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-e44e1b3
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-e44e1b3

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:e44e1b3 make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-e44e1b3

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-e44e1b3
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

@github-actions github-actions bot removed the ok-to-test label Jul 13, 2023
@jpinsonneau
Contributor

As discussed, RTT is not present in every flow, and the enrichment ratio depends on sampling.

We can retrieve the ratio of flows containing TimeFlowRttNs using the following query:

http://localhost:3100/loki/api/v1/query_range?query=count by(app) (count_over_time({app="netobserv-flowcollector",_RecordType="flowLog"}|~`"TimeFlowRttNs`[5m]))/count by(app) (count_over_time({app="netobserv-flowcollector",_RecordType="flowLog"}[5m]))*100&step=5m

Sampling    Flows with RTT
1           45 %
25          10 %
50          5 %
100         < 1 %

We should document that 📜

@jpinsonneau
Contributor

@dushyantbehl can you please rebase your PR?
After that, we'll need to list flowRTT in the features section of the console plugin ConfigMap to be able to detect when it's enabled.

https://github.com/netobserv/network-observability-operator/blob/main/controllers/consoleplugin/consoleplugin_objects.go#L353-L359

@dushyantbehl
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test label Aug 2, 2023
@github-actions

github-actions bot commented Aug 2, 2023

New images:

  • quay.io/netobserv/network-observability-operator:251705a
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-251705a
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-251705a

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:251705a make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-251705a

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-251705a
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

@github-actions github-actions bot removed the ok-to-test label Aug 24, 2023
scc.AllowHostDirVolumePlugin = true
}
if (desired.EnablePktDrop != nil && !*desired.EnablePktDrop) &&
Contributor

This was added to handle the special case where you start with both features on and then disable just one. Since we don't compare current vs. desired, we could end up clearing AllowHostDirVolumePlugin. There was an issue about this from @memodi.

Member

@msherif1234 scc is created a few lines above with AllowHostDirVolumePlugin not initialized, and since it's a bool, it's already false. So I don't see the point of setting it to false here?
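
To make the zero-value argument concrete, here is a minimal sketch. The securityContext type and buildSCC function are hypothetical stand-ins for the real SCC object and the code under review; only the AllowHostDirVolumePlugin field name comes from the diff.

```go
package main

import "fmt"

// securityContext is a hypothetical stand-in for the SCC object discussed
// above; only the field name is taken from the diff.
type securityContext struct {
	AllowHostDirVolumePlugin bool
}

// buildSCC creates a fresh object each time, so the bool field starts at its
// zero value (false) and only needs to be set when a feature requires it.
func buildSCC(needsHostDir bool) securityContext {
	scc := securityContext{} // AllowHostDirVolumePlugin == false here
	if needsHostDir {
		scc.AllowHostDirVolumePlugin = true
	}
	return scc
}

func main() {
	fmt.Println(buildSCC(false).AllowHostDirVolumePlugin) // false
	fmt.Println(buildSCC(true).AllowHostDirVolumePlugin)  // true
}
```

Because the desired object is rebuilt from scratch, an explicit assignment of false would only repeat the zero value.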

Member

@msherif1234 if you're talking about the nil-check, it's not relevant anymore now, as the new Features is a slice, unlike the previous EnablePktDrop / EnableDNSTracking that were bool pointers
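
The difference between the two styles can be sketched as follows. This is an illustration only: isEnabledPtr and hasFeature are hypothetical helper names, and the constant values are the ones shown in the diff at this point of the review.

```go
package main

import "fmt"

type AgentFeature string

const (
	PktDrop     AgentFeature = "PKT_DROP"
	DNSTracking AgentFeature = "DNS_TRACKING"
	FlowRTT     AgentFeature = "FLOW_RTT"
)

// isEnabledPtr mirrors the older bool-pointer style, where a nil check is
// mandatory before dereferencing.
func isEnabledPtr(flag *bool) bool {
	return flag != nil && *flag
}

// hasFeature mirrors the slice style: ranging over a nil slice is safe in Go,
// so no nil check is needed.
func hasFeature(features []AgentFeature, f AgentFeature) bool {
	for _, x := range features {
		if x == f {
			return true
		}
	}
	return false
}

func main() {
	var unset *bool
	fmt.Println(isEnabledPtr(unset))                              // false
	fmt.Println(hasFeature(nil, FlowRTT))                         // false
	fmt.Println(hasFeature([]AgentFeature{PktDrop, FlowRTT}, FlowRTT)) // true
}
```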

Contributor

Can you try the steps mentioned in the bug shared above, to be sure? I added the explicit false to make the desired scc differ from the actual one. Anyway, I've kind of forgotten the details, it's been a while; just make sure the steps shown in the above issue work fine.

Member

Just tried to reproduce that old ticket, and it works fine: I get no error on the daemonset, pods are updated as expected, flows are flowing with dns/drops info

// Agent feature, can be one of:<br>
// - `PKT_DROP`, to track packet drops.<br>
// - `DNS_TRACKING`, to track specific information on DNS traffic.<br>
// - `FLOW_RTT`, to track L4 latency. <i>Unsupported (*)</i><br>
Contributor

@msherif1234 msherif1234 Aug 24, 2023

pls remove <br> and other weird chars at the end of the above lines and everywhere else in this file

Contributor

@msherif1234 I guess you are talking about the HTML blocks right ? That's not new to this PR: https://issues.redhat.com/browse/NETOBSERV-1104 ; it's related to asciidocs
Let's keep this as is and solve it in the related ticket

// If the `spec.agent.eBPF.privileged` parameter is not set, an error is reported.<br>
// - `FLOW_RTT` <i>Unsupported (*)</i>: allows enabling flow latency (RTT) calculations in the eBPF agent during TCP handshakes.
// This feature needs both INGRESS and EGRESS direction flow capture and will be disabled if they are not both enabled.<br>
// +optional
Contributor

@msherif1234 msherif1234 Aug 24, 2023

Should we add the following, or do we need to allow a Features: block with no features?

// +kubebuilder:validation:Required
// +kubebuilder:validation:MinItems:=1

Contributor

I don't think we need that check. We could simply initialize it empty []

Member

yeah I don't even think an empty array initialization is needed, we don't do that in other places (e.g. exporters list)
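
One way to see why neither a MinItems check nor an empty-array initialization is strictly needed: a minimal sketch, assuming the field carries an omitempty JSON tag (an assumption, not confirmed in this thread). The ebpfSpec type is a hypothetical, trimmed-down stand-in for the real spec type.

```go
package main

import (
	"encoding/json"
	"fmt"
)

type AgentFeature string

// ebpfSpec is a hypothetical, trimmed-down stand-in for the agent spec.
// The omitempty tag is an assumption made for this illustration.
type ebpfSpec struct {
	// +optional
	Features []AgentFeature `json:"features,omitempty"`
}

// encode marshals the spec to JSON and returns it as a string.
func encode(s ebpfSpec) string {
	b, err := json.Marshal(s)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(encode(ebpfSpec{}))                                     // {}
	fmt.Println(encode(ebpfSpec{Features: []AgentFeature{}}))           // {}
	fmt.Println(encode(ebpfSpec{Features: []AgentFeature{"FLOW_RTT"}})) // {"features":["FLOW_RTT"]}
}
```

With omitempty, a nil slice and an empty slice serialize identically, so code that only ranges over the list behaves the same either way.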

@@ -64,12 +64,12 @@ func (r *FlowCollector) ConvertTo(dstRaw conversion.Hub) error {

dst.Spec.Loki.Enable = restored.Spec.Loki.Enable

dst.Spec.Agent.EBPF.PktDrop = restored.Spec.Agent.EBPF.PktDrop
dst.Spec.Agent.EBPF.DNSTracking = restored.Spec.Agent.EBPF.DNSTracking
if restored.Spec.Agent.EBPF.Features != nil {
Contributor

This is a slice, so you need to allocate the space and then copy the content:

dst.Spec.Agent.EBPF.Features = make([]AgentFeature, len(restored.Spec.Agent.EBPF.Features))
copy(dst.Spec.Agent.EBPF.Features, restored.Spec.Agent.EBPF.Features)
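
The reason for make plus copy rather than a plain assignment is that assigning a slice shares its backing array. A small self-contained sketch of the difference follows; cloneFeatures is a hypothetical helper name, not something from the PR.

```go
package main

import "fmt"

type AgentFeature string

// cloneFeatures (hypothetical helper) allocates a new backing array and
// copies the elements, as suggested in the review comment above.
func cloneFeatures(src []AgentFeature) []AgentFeature {
	if src == nil {
		return nil
	}
	dst := make([]AgentFeature, len(src))
	copy(dst, src)
	return dst
}

func main() {
	restored := []AgentFeature{"PKT_DROP", "FLOW_RTT"}

	aliased := restored               // shares the backing array with restored
	cloned := cloneFeatures(restored) // independent copy

	restored[0] = "DNS_TRACKING"

	fmt.Println(aliased[0]) // DNS_TRACKING: the change leaks through the alias
	fmt.Println(cloned[0])  // PKT_DROP: the copy is unaffected
}
```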

Member

done

ConfigDisabled FeatureConfigType = "DISABLED"
PktDrop AgentFeature = "PKT_DROP"
DNSTracking AgentFeature = "DNS_TRACKING"
FlowRTT AgentFeature = "FLOW_RTT"
Contributor

@msherif1234 msherif1234 Aug 24, 2023

If I'm not mistaken, strings need to use PascalCase, not snake case?

	PktDrop     AgentFeature = "PacketsDrop"
	DNSTracking AgentFeature = "DnsTracking"
	FlowRTT     AgentFeature = "FlowRtt"

Member

You're right, idk why I had in mind the best practice was upper case
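
Renaming enum values is one reason a change like this is flagged as breaking: a value accepted before the rename is rejected after it. The sketch below illustrates this with a hypothetical IsValid method; the PascalCase spellings are adapted from the suggestion above and may not match what was eventually merged.

```go
package main

import "fmt"

type AgentFeature string

// Spellings adapted from the review suggestion; treat them as placeholders.
const (
	PktDrop     AgentFeature = "PacketsDrop"
	DNSTracking AgentFeature = "DnsTracking"
	FlowRTT     AgentFeature = "FlowRtt"
)

// IsValid (hypothetical) reports whether f is one of the known features.
// The comparison is case-sensitive, like a CRD enum validation would be.
func (f AgentFeature) IsValid() bool {
	switch f {
	case PktDrop, DNSTracking, FlowRTT:
		return true
	}
	return false
}

func main() {
	fmt.Println(FlowRTT.IsValid())                  // true
	fmt.Println(AgentFeature("FLOW_RTT").IsValid()) // false: old spelling no longer matches
}
```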

Member

done

Co-authored-by: Julien Pinsonneau <91894519+jpinsonneau@users.noreply.github.com>
@openshift-ci openshift-ci bot removed the lgtm label Aug 29, 2023
@github-actions github-actions bot removed the ok-to-test label Aug 29, 2023
@jotak jotak added the ok-to-test label Aug 29, 2023
@github-actions

New images:

  • quay.io/netobserv/network-observability-operator:e7c6868
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-e7c6868
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-e7c6868

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:e7c6868 make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-e7c6868

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-e7c6868
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

- Remove mention of INGRESS/EGRESS having to be enabled since it's not
  configurable
- Remove italic text (cf
  netobserv#407 )
- Update bundle
@github-actions github-actions bot removed the ok-to-test label Aug 29, 2023
- Use PascalCase for enums
- Copy array in webhook
- Update bundle
@jotak
Member

jotak commented Aug 29, 2023

@dushyantbehl @msherif1234 @jpinsonneau : I updated the PR to address feedback

@jotak jotak added the ok-to-test label Aug 29, 2023
@github-actions

New images:

  • quay.io/netobserv/network-observability-operator:28a41df
  • quay.io/netobserv/network-observability-operator-bundle:v0.0.0-28a41df
  • quay.io/netobserv/network-observability-operator-catalog:v0.0.0-28a41df

They will expire after two weeks.

To deploy this build:

# Direct deployment, from operator repo
IMAGE=quay.io/netobserv/network-observability-operator:28a41df make deploy

# Or using operator-sdk
operator-sdk run bundle quay.io/netobserv/network-observability-operator-bundle:v0.0.0-28a41df

Or as a Catalog Source:

apiVersion: operators.coreos.com/v1alpha1
kind: CatalogSource
metadata:
  name: netobserv-dev
  namespace: openshift-marketplace
spec:
  sourceType: grpc
  image: quay.io/netobserv/network-observability-operator-catalog:v0.0.0-28a41df
  displayName: NetObserv development catalog
  publisher: Me
  updateStrategy:
    registryPoll:
      interval: 1m

@jpinsonneau jpinsonneau mentioned this pull request Aug 29, 2023
10 tasks
@msherif1234
Contributor

/lgtm

@jotak
Member

jotak commented Sep 5, 2023

/approve

@openshift-ci

openshift-ci bot commented Sep 5, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jotak

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved label Sep 5, 2023
@openshift-merge-robot openshift-merge-robot merged commit 51294b8 into netobserv:main Sep 5, 2023
9 checks passed
Labels
approved breaking-change This pull request has breaking changes. They should be described in PR description. enhancement New feature or request jira/valid-reference lgtm no-doc This PR doesn't require documentation change on the NetObserv operator no-qe This PR doesn't necessitate QE approval ok-to-test To set manually when a PR is safe to test. Triggers image build on PR.

7 participants