
NETOBSERV-1112: Add TCP RTT calculation to ebpf-agent and userspace. #117

Merged
13 commits merged into netobserv:main on Jul 3, 2023

Conversation

dushyantbehl
Contributor

@dushyantbehl dushyantbehl commented Apr 28, 2023

Fixes #102

The feature is disabled by default and can be enabled with an environment variable flag.
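For illustration only, gating the feature on an environment variable could look like the sketch below; the variable name ENABLE_RTT is an assumption made for this sketch, not necessarily the exact flag added by this PR.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// rttEnabled reports whether TCP RTT tracking should be switched on.
// NOTE: "ENABLE_RTT" is an assumed name used only in this sketch; check the
// agent's configuration for the actual environment variable.
func rttEnabled() bool {
	v, err := strconv.ParseBool(os.Getenv("ENABLE_RTT"))
	return err == nil && v
}

func main() {
	fmt.Println("RTT tracking enabled:", rttEnabled())
}
```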

Here are the CPU and memory numbers before and after the PR:

[Screenshots: CPU and memory numbers before and after the PR]

RTT is calculated from the TCP handshake SYN and ACK packets.

This introduces a new eBPF map for tracking incoming and outgoing packets for RTT. The approach can be extended to other forms of TCP sequence tracking, such as the TCP timestamp option (TSval/TSecr), to continuous TCP latency tracking, or to other protocols such as ICMP.
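To make the mechanism concrete, here is a minimal userspace sketch (in Go) of the handshake-matching idea: remember when a SYN with a given sequence number left, and compute the RTT when the (SYN-)ACK that acknowledges seq+1 arrives on the reversed tuple. This is only an illustration under those assumptions, not the actual bpf/flows.c code; all names and types are made up for the sketch.

```go
package main

import (
	"fmt"
	"time"
)

// rttKey identifies a pending handshake: the flow tuple plus the SYN sequence
// number. It mirrors the idea of the new eBPF map, but in plain Go.
type rttKey struct {
	srcIP, dstIP     string
	srcPort, dstPort uint16
	seq              uint32
}

type rttTracker struct {
	pending map[rttKey]time.Time // time the SYN was seen, waiting for its ACK
}

func newRTTTracker() *rttTracker {
	return &rttTracker{pending: make(map[rttKey]time.Time)}
}

// onSYN records when an outgoing SYN was sent.
func (t *rttTracker) onSYN(k rttKey, ts time.Time) {
	t.pending[k] = ts
}

// onACK handles the handshake ACK: it travels in the reverse direction and
// acknowledges seq+1, so the lookup uses the reversed tuple and ackNum-1.
func (t *rttTracker) onACK(srcIP, dstIP string, srcPort, dstPort uint16, ackNum uint32, ts time.Time) (time.Duration, bool) {
	k := rttKey{srcIP: dstIP, dstIP: srcIP, srcPort: dstPort, dstPort: srcPort, seq: ackNum - 1}
	start, ok := t.pending[k]
	if !ok {
		return 0, false // SYN not seen (e.g. missed by sampling): no RTT for this flow
	}
	delete(t.pending, k)
	return ts.Sub(start), true
}

func main() {
	t := newRTTTracker()
	start := time.Now()
	t.onSYN(rttKey{"172.26.89.33", "90.130.74.149", 35262, 80, 1000}, start)
	if rtt, ok := t.onACK("90.130.74.149", "172.26.89.33", 80, 35262, 1001, start.Add(92*time.Millisecond)); ok {
		fmt.Printf("handshake rtt: %v\n", rtt) // ~92ms, in line with the dump below
	}
}
```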

Small changes to bpf rtt calculations.

Signed-off-by: Dushyant Behl <dushyantbehl@in.ibm.com>
@dushyantbehl
Contributor Author

Tested on my end; it works end to end for calculating the RTT of the TCP handshake.

@eranra
Collaborator

eranra commented Apr 30, 2023

@dushyantbehl, thanks for working on this PR (and also thanks, @praveingk). @dushyantbehl, can you add some text that explains the logic of the calculation? I suggest adding an .md file under docs. This will help both with review and, in the future, with explaining what is happening.

@eranra
Collaborator

eranra commented Apr 30, 2023

@jotak @jpinsonneau
the agent reports RTT just at the beginning of flows --- so all future/other flows in the conversation (with the same L4 hash) do not get the info. I wonder what we want to show in the flows table? maybe we need to 'clone' the RTT info in all flows from the same conversation? (but this is not really measured information)

@eranra
Collaborator

eranra commented Apr 30, 2023

@dushyantbehl I ran some high-level tests, and the results are OK, but not exactly what I expected:

(1) In shell #1 I ran: while true; do wget http://hgd-speedtest-1.tele2.net/1MB.zip; sleep 10; done
This downloads a file from a known test server.

(2) In shell #2 I ran: sudo FLOWS_TARGET_HOST=127.0.0.1 FLOWS_TARGET_PORT=9999 ./bin/netobserv-ebpf-agent
This starts the agent.

(3) In shell #3 I ran: ./bin/flowlogs-dump-collector -listen_port=9999 2>&1 >/dev/null | grep 90.130.74.149
This dumps the collected flows to the screen.

The output I get is:

$ ./bin/flowlogs-dump-collector -listen_port=9999 2>&1 >/dev/null | grep 90.130.74.149
ipv4: 11:15:34.760990 eth0 IP 172.26.89.33:35262 > 90.130.74.149:80: protocol:tcp type: 0 code: 0 dir:1 bytes:20945 packets:315 flags:530 ends: 11:15:35.550359 rtt(us): 0
ipv4: 11:15:34.853181 eth0 IP 90.130.74.149:80 > 172.26.89.33:35262: protocol:tcp type: 0 code: 0 dir:0 bytes:1069618 packets:315 flags:272 ends: 11:15:35.549528 rtt(us): 92190
ipv4: 11:15:35.639099 eth0 IP 90.130.74.149:80 > 172.26.89.33:35262: protocol:tcp type: 0 code: 0 dir:0 bytes:66 packets:1 flags:512 ends: 11:15:35.639099 rtt(us): 0
ipv4: 11:15:35.639127 eth0 IP 172.26.89.33:35262 > 90.130.74.149:80: protocol:tcp type: 0 code: 0 dir:1 bytes:66 packets:1 flags:16 ends: 11:15:35.639127 rtt(us): 0
ipv4: 11:15:45.560058 eth0 IP 172.26.89.33:38364 > 90.130.74.149:80: protocol:tcp type: 0 code: 0 dir:1 bytes:74 packets:1 flags:2 ends: 11:15:45.560058 rtt(us): 0
ipv4: 11:15:45.654908 eth0 IP 90.130.74.149:80 > 172.26.89.33:38364: protocol:tcp type: 0 code: 0 dir:0 bytes:1069354 packets:311 flags:784 ends: 11:15:46.489739 rtt(us): 94850
ipv4: 11:15:45.654979 eth0 IP 172.26.89.33:38364 > 90.130.74.149:80: protocol:tcp type: 0 code: 0 dir:1 bytes:20739 packets:312 flags:528 ends: 11:15:46.489763 rtt(us): 0
ipv4: 11:15:56.404436 eth0 IP 172.26.89.33:47890 > 90.130.74.149:80: protocol:tcp type: 0 code: 0 dir:1 bytes:21077 packets:317 flags:530 ends: 11:15:57.340466 rtt(us): 0
ipv4: 11:15:56.499668 eth0 IP 90.130.74.149:80 > 172.26.89.33:47890: protocol:tcp type: 0 code: 0 dir:0 bytes:1069684 packets:316 flags:784 ends: 11:15:57.340431 rtt(us): 95232
ipv4: 11:16:07.255185 eth0 IP 172.26.89.33:55564 > 90.130.74.149:80: protocol:tcp type: 0 code: 0 dir:1 bytes:20417 packets:307 flags:530 ends: 11:16:08.124477 rtt(us): 0
ipv4: 11:16:07.344960 eth0 IP 90.130.74.149:80 > 172.26.89.33:55564: protocol:tcp type: 0 code: 0 dir:0 bytes:1068958 packets:305 flags:784 ends: 11:16:08.124456 rtt(us): 89774
ipv4: 11:16:18.044159 eth0 IP 172.26.89.33:47140 > 90.130.74.149:80: protocol:tcp type: 0 code: 0 dir:1 bytes:18797 packets:271 flags:530 ends: 11:16:18.967787 rtt(us): 0
ipv4: 11:16:18.133470 eth0 IP 90.130.74.149:80 > 172.26.89.33:47140: protocol:tcp type: 0 code: 0 dir:0 bytes:1067836 packets:288 flags:784 ends: 11:16:18.967766 rtt(us): 89310
ipv4: 11:16:28.982189 eth0 IP 90.130.74.149:80 > 172.26.89.33:53638: protocol:tcp type: 0 code: 0 dir:0 bytes:1069156 packets:308 flags:784 ends: 11:16:29.758358 rtt(us): 87864
ipv4: 11:16:28.894324 eth0 IP 172.26.89.33:53638 > 90.130.74.149:80: protocol:tcp type: 0 code: 0 dir:1 bytes:20615 packets:310 flags:530 ends: 11:16:29.758379 rtt(us): 0

When I run ping hgd-speedtest-1.tele2.net, I get

PING hgd-speedtest-1.tele2.net (90.130.74.149) 56(84) bytes of data.
64 bytes from hgd-speedtest-1.tele2.net (90.130.74.149): icmp_seq=1 ttl=49 time=88.0 ms
64 bytes from hgd-speedtest-1.tele2.net (90.130.74.149): icmp_seq=2 ttl=49 time=89.6 ms
64 bytes from hgd-speedtest-1.tele2.net (90.130.74.149): icmp_seq=3 ttl=49 time=89.0 ms
64 bytes from hgd-speedtest-1.tele2.net (90.130.74.149): icmp_seq=4 ttl=49 time=88.7 ms
64 bytes from hgd-speedtest-1.tele2.net (90.130.74.149): icmp_seq=5 ttl=49 time=89.7 ms
64 bytes from hgd-speedtest-1.tele2.net (90.130.74.149): icmp_seq=6 ttl=49 time=88.8 ms

Now, the RTT of 88ms (from ping) is similar to the RTT reported by the eBPF agent :-) This looks good. The thing that is a bit challenging is the direction of the reports and the zeros. I see that we get RTT just on the return path (which makes sense), but it is a bit strange. We need to fix that in the UI or maybe in the user-space agent somehow. Can you also add the flags to the dump so we will know in which flows we get "syn-acks" and in which reports we do not?

eranra previously approved these changes Apr 30, 2023

@eranra eranra left a comment

It works :-)

@jpinsonneau
Contributor

@jotak @jpinsonneau the agent reports RTT just at the beginning of flows --- so all future/other flows in the conversation (with the same L4 hash) do not get the info. I wonder what we want to show in the flows table? maybe we need to 'clone' the RTT info in all flows from the same conversation? (but this is not really measured information)

From the conversation ID we can link back to the first flow and show any missing info. I would be more in favor of that than duplicating data, even if it would require an extra query in some cases.

FYI there is a doc for conversation tracking followup. Feel free to add any thoughts 😸

@codecov

codecov bot commented May 3, 2023

Codecov Report

Merging #117 (bbee31c) into main (27baf29) will decrease coverage by 0.47%.
The diff coverage is 9.09%.

@@            Coverage Diff             @@
##             main     #117      +/-   ##
==========================================
- Coverage   40.11%   39.64%   -0.47%     
==========================================
  Files          31       31              
  Lines        2124     2154      +30     
==========================================
+ Hits          852      854       +2     
- Misses       1227     1254      +27     
- Partials       45       46       +1     
Flag Coverage Δ
unittests 39.64% <9.09%> (-0.47%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
pkg/agent/agent.go 37.84% <0.00%> (-1.37%) ⬇️
pkg/ebpf/tracer.go 0.00% <0.00%> (ø)
pkg/flow/record.go 64.10% <28.57%> (-7.33%) ⬇️
pkg/exporter/proto.go 82.97% <100.00%> (+0.24%) ⬆️

@jotak
Member

jotak commented May 3, 2023

Thanks @dushyantbehl !

@jotak @jpinsonneau the agent reports RTT just at the beginning of flows --- so all future/other flows in the conversation (with the same L4 hash) do not get the info. I wonder what we want to show in the flows table? maybe we need to 'clone' the RTT info in all flows from the same conversation? (but this is not really measured information)

Help me understand: the RTT is the time elapsed between the SYN emission and the ACK reception. So when you say future flows in the conversation do not get the info, do you mean there could be more SYN/ACKs for the same n-tuples? Or do you see other cases of potential loss of information?

Also, I have a concern: is it safe with sampling? (@dushyantbehl @eranra @praveingk)
E.g. if the ingress/ACK of the handshake wasn't seen due to sampling, but subsequent ACKs are received, is it going to result in screwed-up measurements?

@dushyantbehl
Contributor Author

@jotak @jpinsonneau the agent reports RTT just at the beginning of flows --- so all future/other flows in the conversation (with the same L4 hash) do not get the info. I wonder what we want to show in the flows table? maybe we need to 'clone' the RTT info in all flows from the same conversation? (but this is not really measured information)

From the conversation ID we can link back to the first flow and show any missing info. I would be more in favor of that than duplicating data; even if it would require an extra query in some cases.

FYI there is a doc for conversation tracking followup. Feel free to add any thoughts 😸

@jpinsonneau I am not able to access the doc; could you please share access with my Red Hat account? The username is dbehl.

@dushyantbehl
Contributor Author

Thanks @dushyantbehl !

@jotak @jpinsonneau the agent reports RTT just at the beginning of flows --- so all future/other flows in the conversation (with the same L4 hash) do not get the info. I wonder what we want to show in the flows table? maybe we need to 'clone' the RTT info in all flows from the same conversation? (but this is not really measured information)

Help me understand.. the RTT is the time elapsed between the SYN emission and the ACK reception - so when when you say future flows in the conversation do not get the info, do you mean there could be more SYN/ACK for the same n-tuples? Or do you see other cases of potential loss of information?

Correct, the RTT here is for the initial handshake right now: the difference between the time the SYN packet is sent and the time the corresponding ACK packet is received.
I think I should fix my comment, because what I meant is that right now this patch calculates RTT only for the handshake packets, not for all packets in the flow, so we won't see an RTT measurement on the many flow records evicted from the ebpf-agent that correspond to packets other than the initial handshake.

Ideally we should do the calculation for all packets, because network latency can vary at any time; that is something we can add on top of this patch.

Also .. I have a concern, is it safe with sampling? ( @dushyantbehl @eranra @praveingk ) E.g. if the ingress/ACK of the handshake wasn't seen due to sampling, but subsequent ACK are received, is it going to result in screwed up measurements?

It's safe to the extent that this won't cause any wrong RTT values to be shown, but if a SYN or its corresponding ACK packet is missed due to sampling, we won't see an RTT measurement for that flow's handshake, because any ACK that comes later will carry a different sequence number, not that of the first one.

The only problem I foresee is that if the first ACK is retransmitted for some reason, the RTT calculation might not reflect the real-world round-trip latency. In that case I'm not sure how best to mitigate the issue, other than performing the RTT measurement for all packets in the flow and then taking a running average or something similar on top.
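As a rough sketch of the running-average idea mentioned above (not code from this PR), per-packet RTT samples could be smoothed with an exponentially weighted moving average, similar in spirit to how TCP maintains its SRTT (RFC 6298 uses alpha = 1/8). All names below are illustrative.

```go
package main

import (
	"fmt"
	"time"
)

// ewmaRTT keeps a smoothed RTT per flow.
type ewmaRTT struct {
	srtt  time.Duration
	alpha float64
	init  bool
}

// add folds a new RTT sample into the smoothed value.
func (e *ewmaRTT) add(sample time.Duration) {
	if !e.init {
		e.srtt, e.init = sample, true
		return
	}
	e.srtt = time.Duration((1-e.alpha)*float64(e.srtt) + e.alpha*float64(sample))
}

func main() {
	e := &ewmaRTT{alpha: 0.125}
	samples := []time.Duration{
		92 * time.Millisecond,
		95 * time.Millisecond,
		300 * time.Millisecond, // retransmission outlier
		90 * time.Millisecond,
	}
	for _, s := range samples {
		e.add(s)
	}
	fmt.Println("smoothed rtt:", e.srtt) // the outlier is damped instead of reported as-is
}
```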

@dushyantbehl
Contributor Author

Now, the rtt 88ms ( from ping) is similar to the reported RTT we get from the eBPF agent :-) This looks good. The thing that is a bit challenging is the direction of the reports and the zeros. I see that we get RTT just in the return path (which makes sense), but it is a bit strange. We need to fix that in the UI or maybe in the user-space agent somehow. Can you add to the dump also the flags so we will know in which flows we get "syn-acks" and ion which reports we do not?

Sure thing. I'm assuming that by the dump you mean flowlogs-dump-collector.go; it already has the flags, they're just not printed as SYN/ACK/FIN etc. but dumped in raw binary form. I can change that if that's what you meant.
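For reference, printing the standard TCP flag bits by name could look like the sketch below. Note that the agent's flags field may also carry custom combination bits beyond the standard eight; this helper simply ignores anything it does not recognize.

```go
package main

import (
	"fmt"
	"strings"
)

// tcpFlagNames decodes the eight standard TCP header flag bits
// (FIN=0x01 ... CWR=0x80) into a human-readable string.
func tcpFlagNames(flags uint16) string {
	names := []string{"FIN", "SYN", "RST", "PSH", "ACK", "URG", "ECE", "CWR"}
	var out []string
	for i, n := range names {
		if flags&(1<<uint(i)) != 0 {
			out = append(out, n)
		}
	}
	if len(out) == 0 {
		return "none"
	}
	return strings.Join(out, "|")
}

func main() {
	fmt.Println(tcpFlagNames(2))  // SYN
	fmt.Println(tcpFlagNames(16)) // ACK
	fmt.Println(tcpFlagNames(18)) // SYN|ACK
}
```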

@openshift-ci openshift-ci bot added the ok-to-test label Jun 28, 2023
@github-actions

New image: quay.io/netobserv/netobserv-ebpf-agent:6ae2ab2. It will expire after two weeks.

@github-actions github-actions bot removed the ok-to-test label Jun 28, 2023
@dushyantbehl
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test label Jun 28, 2023
@github-actions

New image: quay.io/netobserv/netobserv-ebpf-agent:b4e571b. It will expire after two weeks.

@openshift-ci-robot
Collaborator

openshift-ci-robot commented Jun 28, 2023

@dushyantbehl: This pull request references NETOBSERV-1112 which is a valid jira issue.

In response to this:

Fixes #102

Feature is disabled by default and can be enable with an environment variable flag.

Here are the CPU and Memory Numbers before and after the PR

Related PR - netobserv/flowlogs-pipeline#446
[Screenshots: memory and CPU numbers before and after the PR]


@github-actions github-actions bot removed the ok-to-test label Jun 29, 2023
@msherif1234
Contributor

Thanks @dushyantbehl !!
/lgtm

@dushyantbehl
Contributor Author

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test label Jun 30, 2023
@github-actions

New image: quay.io/netobserv/netobserv-ebpf-agent:c1d9ac0. It will expire after two weeks.

@openshift-ci-robot
Collaborator

openshift-ci-robot commented Jun 30, 2023

@dushyantbehl: This pull request references NETOBSERV-1112 which is a valid jira issue.

In response to this:

Fixes #102

Feature is disabled by default and can be enable with an environment variable flag.

Here are the CPU and Memory Numbers before and after the PR

[Screenshots: CPU and memory numbers before and after the PR]


dushyantbehl added a commit to dushyantbehl/flowlogs-pipeline that referenced this pull request Jun 30, 2023
Related PR - netobserv/netobserv-ebpf-agent#117

Signed-off-by: Dushyant Behl <dushyantbehl@users.noreply.github.com>
dushyantbehl added a commit to dushyantbehl/flowlogs-pipeline that referenced this pull request Jun 30, 2023
Related PR - netobserv/netobserv-ebpf-agent#117

Signed-off-by: Dushyant Behl <dushyantbehl@users.noreply.github.com>
@jotak
Member

jotak commented Jul 3, 2023

/lgtm
/approve

@dushyantbehl thanks a lot, and sorry for the long wait!

@openshift-ci

openshift-ci bot commented Jul 3, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jotak


@openshift-ci openshift-ci bot added the approved label Jul 3, 2023
@openshift-merge-robot openshift-merge-robot merged commit 1b85351 into netobserv:main Jul 3, 2023
9 checks passed
openshift-merge-robot pushed a commit to netobserv/flowlogs-pipeline that referenced this pull request Jul 6, 2023
* Add support for RoundTripTime in FLP.
Related PR - netobserv/netobserv-ebpf-agent#117

Signed-off-by: Dushyant Behl <dushyantbehl@users.noreply.github.com>

* Update the ebpf-agent go dependency.

Signed-off-by: Dushyant Behl <dushyantbehl@users.noreply.github.com>

---------

Signed-off-by: Dushyant Behl <dushyantbehl@users.noreply.github.com>
Labels: approved, enhancement, jira/valid-reference, lgtm, ok-to-test
Linked issue: NETOBSERV-1112: Support adding of connection setup timestamp
7 participants