Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TCP flag-based sampling ("Smart sampling") #91

Open
ronensc opened this issue Jan 30, 2023 · 6 comments
Open

TCP flag-based sampling ("Smart sampling") #91

ronensc opened this issue Jan 30, 2023 · 6 comments
Assignees

Comments

@ronensc
Copy link

ronensc commented Jan 30, 2023

When sampling is enabled, we might miss important events such as establishment and termination of TCP connections.
It may be useful to add a setting that makes the agent always sending flows that contain specific TCP flags (e.g. SYN, FIN).
This will make sure that no connection will be missed.

Cons:

  1. This makes it trickier to normalize the bytes/packets counters (multiplying counters by sampling rate)
  2. If the cluster is flooded with short connections, then most flows will contain the SYN or FIN flags. This makes the sampling ineffective.

cc: @eranra @jotak @praveingk @shach33

@jotak
Copy link
Member

jotak commented Jan 30, 2023

wdyt about making it configurable, e.g.

sampling:
  syn: 0
  fin: 0
  default: 50
  (<any flag>: <value>)

That would allow to mitigate Cons 2 giving users more control on it.
The default values could be this above example, with no sampling on syn/fin, and keep the current default of 50 for the rest

@ronensc
Copy link
Author

ronensc commented Jan 30, 2023

@jotak Just to make sure that I understand your configuration format, considering the following values

sampling:
  syn: 0
  fin: 10
  default: 50

It should be interpreted as:

  • send every flow that contains the SYN flag
  • send every 10th flow that contains the FIN flag
  • send every 50th flow that doesn't contain SYN nor FIN flags
    ?

@eranra
Copy link
Collaborator

eranra commented Jan 30, 2023

@jotak totally agree with the above --- assuming::

(1) @shach33 @praveingk and @msherif1234 will agree that this doesn't add a lot ( or any)of overhead to the eBPF code :-)
(2) The user experience (operator) is friendly --- defaulting precisely to what you have above will be perfect

@jotak
Copy link
Member

jotak commented Jan 30, 2023

@ronensc yes; with the subtlety that it's not exactly "every Nth flow", as the sampling is probabilistic, not deterministic (as you can see here:

if (sampling != 0 && (bpf_get_prandom_u32() % sampling) != 0) {
return TC_ACT_OK;
}
) But I'm nitpicking :)

@jotak
Copy link
Member

jotak commented Jan 30, 2023

@eranra My understanding is that it adds a little overhead, as we need read TCP headers for every packet, including ones that could otherwise be ignored due to sampling.
I guess the perf impact overall will still be largely positive

@eranra
Copy link
Collaborator

eranra commented Jan 30, 2023

@jotak maybe the user space ebpf agent will translate the friendly conf to some more straightforward and efficient rules to be used in the kernel. Yes, we need to "look" on every packet but this is what eBPF TC hook is for anyway.

@dushyantbehl dushyantbehl self-assigned this May 23, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants