NETOBSERV-578: deduplicate flows at agent level #56
Conversation
New image: ["quay.io/netobserv/netobserv-ebpf-agent:87c9d1c"]. It will expire after two weeks.
Codecov Report
@@            Coverage Diff             @@
##             main      #56      +/-   ##
==========================================
+ Coverage   32.04%   34.76%   +2.72%
==========================================
  Files          18       19       +1
  Lines        1239     1300      +61
==========================================
+ Hits          397      452      +55
- Misses        823      829       +6
  Partials       19       19
LGTM, just a few comments, but they are not blocking; feel free to ignore.
pkg/flow/deduper.go
// value: listElement pointing to a struct entry
ifaces map[RecordKey]*list.Element
// element: entry structs of the ifaces map ordered by expiry time
ll *list.List
Nitpicking: maybe ll is too generic? Maybe something like RecRefs?
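For context, here is a minimal sketch of how such a map-plus-list cache could work. The entry struct, the RecordKey fields, and the method names below are assumptions for illustration, not the PR's actual code:

```go
package flow

import (
	"container/list"
	"time"
)

// RecordKey is a stand-in for the agent's flow key; only the field used
// below is shown (illustrative).
type RecordKey struct {
	IfIndex int
	// ... 5-tuple fields omitted ...
}

// entry is the value stored in each list element (field names assumed).
type entry struct {
	key      RecordKey // key with IfIndex zeroed, used to delete from the map on expiry
	ifIndex  int       // first interface the flow was seen on
	lastSeen time.Time
}

type deduperCache struct {
	expire time.Duration
	// key: RecordKey with the interface zeroed; value: list element holding an *entry
	ifaces map[RecordKey]*list.Element
	// entries ordered from most to least recently seen, for cheap expiry
	ll *list.List
}

// isDupe returns true when the flow was already seen on another interface.
// The first interface wins; later interfaces are reported as duplicates.
func (c *deduperCache) isDupe(key *RecordKey) bool {
	rk := *key
	rk.IfIndex = 0 // compare flows ignoring the capture interface
	if elem, ok := c.ifaces[rk]; ok {
		e := elem.Value.(*entry)
		e.lastSeen = time.Now()
		c.ll.MoveToFront(elem)
		return e.ifIndex != key.IfIndex
	}
	c.ifaces[rk] = c.ll.PushFront(&entry{key: rk, ifIndex: key.IfIndex, lastSeen: time.Now()})
	return false
}

// expireOld forgets entries not seen within the expiry window, walking from
// the oldest end of the list and stopping at the first fresh entry.
func (c *deduperCache) expireOld() {
	deadline := time.Now().Add(-c.expire)
	for elem := c.ll.Back(); elem != nil; elem = c.ll.Back() {
		e := elem.Value.(*entry)
		if e.lastSeen.After(deadline) {
			return
		}
		c.ll.Remove(elem)
		delete(c.ifaces, e.key)
	}
}
```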
for _, record := range records {
	if !cache.isDupe(&record.RecordKey) {
		fwd = append(fwd, record)
	}
Possible improvement: adding a metric to count the number of duplicates.
I added a debug log below. If you mean Prometheus metrics, maybe we can create a new task to integrate a metrics server here.
Yes, I was talking about a Prometheus metric, and I am fine with creating a separate task for it.
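For reference, a hypothetical sketch of such a counter with client_golang; the metric name, the variable, and the wiring into the loop shown above are assumptions, and the agent would still need to expose an HTTP metrics endpoint (e.g. via promhttp):

```go
package flow

import "github.com/prometheus/client_golang/prometheus"

// duplicateFlows counts records dropped by the deduper (illustrative name).
var duplicateFlows = prometheus.NewCounter(prometheus.CounterOpts{
	Name: "netobserv_agent_duplicate_flows_total",
	Help: "Flow records dropped by the deduper as duplicates from another interface.",
})

func init() {
	prometheus.MustRegister(duplicateFlows)
}

// In the filtering loop above, count instead of (or in addition to) the debug log:
//
//	for _, record := range records {
//		if cache.isDupe(&record.RecordKey) {
//			duplicateFlows.Inc()
//			continue
//		}
//		fwd = append(fwd, record)
//	}
```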
pkg/agent/config.go
// a flow hasn't been received for that expiry time, the deduplicator forgets it. That means
// that a flow from a connection that has been inactive during that period could be forwarded
// again from a different interface.
DeduperFCExpiry time.Duration `env:"DEDUPER_FC_EXPIRY" envDefault:"30s"`
Isn't the 30s default a bit too high, causing unnecessary memory consumption?
For the same flow being captured while going through two different interfaces on the same node, I would have thought the delay was below a second. So maybe a default value like 5s would be enough?
Hmm, since I want to be consistent with the flows' flush periods, this should be > 1 cache flush period. So we can probably set it, by default, to CacheActiveTimeout * 2 (10s).
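A sketch of what that defaulting could look like, assuming CacheActiveTimeout is the agent's existing flush-period setting; this is illustrative wiring only, not the PR's actual code:

```go
// If no explicit deduper expiry was configured, tie it to the cache flush
// period so it always spans at least two flush cycles.
if cfg.DeduperFCExpiry == 0 {
	cfg.DeduperFCExpiry = 2 * cfg.CacheActiveTimeout // with the 5s default, this yields 10s
}
```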
Detects when the same flow is captured on different interfaces, and forwards only the occurrence coming from the first interface it was received on.
In hosts with low traffic it decreased the number of flows by 20%, and in hosts with higher traffic, by 40%.
However, it seems that in OpenShift we could still have some duplicates, as the same flow can be detected by two agents in different hosts. In that case, a second deduplication should be performed at the flowlogs-pipeline level.