Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework hw pytest #598

Merged
merged 3 commits into from
Aug 23, 2024
Merged

Rework hw pytest #598

merged 3 commits into from
Aug 23, 2024

Conversation

PlagueCZ
Copy link
Contributor

@PlagueCZ PlagueCZ commented Aug 9, 2024

There was a switch to run pytest suite using actual hardware (Mellanox NIC). It was highly specific to a setup with corsslinked ports and could not thus cover everything.

I created a better implementation where the Mellanox NIC is connected to another machine (or another NIC on the same machine with network namespace isolation). Then a special scapy script is run on the remote machine to reflect underlay packets back to the test suite (it needs to mangle some of them to prevent isolation rule from firing).

Currently this needs a directly connected machine with no switch in between, as the MAC addresses used would cause the traffic to be dropped I believe. But this is something I want to address later so this test can be run in other environments.

This is also capable of testing the multiport setup and pf1-proxy functionality (as the pytest suite needs to listen on the proxy port)


The change to dpservice C code is actually just a cleanup of a hack to make the previous HW test work.

As for the test suite itself, almost all calls to send() and sniff() were already encapsulated, so adding the de-mangling of packets (for isolation rule prevention) is centralized. There were a few instances of a naked call to send() and sniff() which I encapsulated and it even cleaned the code up.

There is a bit of a messy change to telemetry tests (due to the fact that we change the graph for pf1-proxy, etc.), but this will be fixed once we either remove or rework the way pf1-proxy is implemented.

Apart from the above, the tests themselves are oblivious to this change.

@github-actions github-actions bot added size/L documentation Improvements or additions to documentation enhancement New feature or request labels Aug 9, 2024
@PlagueCZ PlagueCZ marked this pull request as ready for review August 9, 2024 16:11
@PlagueCZ PlagueCZ requested a review from a team as a code owner August 9, 2024 16:11
@byteocean
Copy link
Contributor

I do not have the chance to actually run these tests due to lacking of directly connected machines. I get the idea by reading the code, and see the value during the development. What I can comment here is that, preparing an environment, e.g., providing a qcow2 file, setting up VMs and bridges, are currently manual, thus difficult to integrate into CI/CD. It is of course the same issue as the benchmark test.

ICMPv6EchoRequest())
delayed_sendp(icmp_pkt, PF0.tap)

def external_ping6(dst_ipv6, ul_ipv6):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it pure refactoring if I get it right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Functionally this is pure refactoring, but the reason to do this was because I need to encapsulate all sniff() calls to intercept the mangled packets with EtherType=1337 and fix them.
These calls were the last three instances where a "naked" scapy call was being made.

@guvenc
Copy link
Collaborator

guvenc commented Aug 19, 2024

@PlagueCZ
I am trying to visualize how this test is actually wired including the VFs/VMs, PF0/PF1 (PF1 Tap case ?) and monitoring port (Including the reflector PC)
and I can not quite get the complete picture as a mental image. Would it be possible to prepare a really simple draw.io drawing to show how the things are wired/connected and which script is running where. Nothing fancy. Just some boxes and lines with some description text boxes would be fine.
We can keep it in the PR and it doesnt have to be part of the codebase.

@PlagueCZ
Copy link
Contributor Author

Hopefully this image covers the idea properly. You need to know that only "underlay packets" ever reach dpservice's PF due to isolation rules (this is not technically correct, the rules are more complex, but the idea holds). This is what reflector.py uses to direct the packet back to dpservice (leave the packet as-is) or to pytest (change EtherType to 0x1337 so the rules do not match).

Virtual machines are then set-up so they are a transparent bridge (explained in the markdown docs) to simply pass packets between dpservice and pytest.

The schema only shows one PF and one VF for clarity, there is no dependence among any of them, but of course the reflector does perform the same on two interfaces (PF0 and PF1).
pytest drawio

@PlagueCZ
Copy link
Contributor Author

The PF1 tap case is wired the same except that pytest needs to listen on pf1-tap device instead. But that is actually true for everything running on that machine (frr, curl, ping, ...) as PF1 is inaccessible due to the multiport mode.

In that case dpservice is actually doing more, but you can imagine a blackbox bridge between PF1 and the TAP device and it would work the same, it's just that the blackbox is dpservice itself.
pytest drawio

Copy link
Collaborator

@guvenc guvenc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed drawings. This looks good to me.

@guvenc guvenc merged commit 7ce61c0 into main Aug 23, 2024
6 checks passed
@guvenc guvenc deleted the feature/rework_hw_pytest branch August 23, 2024 07:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation enhancement New feature or request size/L
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants