network-agent netns mode #122

Open
chdxD1 opened this issue May 23, 2024 · 2 comments
Labels: enhancement (New feature or request)

Comments

@chdxD1
Member

chdxD1 commented May 23, 2024

Based on #121 (and the reason why we want to have the split).

This should be an alternative to the current mode, which I would call "vrf-ibgp" today. There will either be a separate network-agent-netns and network-agent-vrf-ibgp, or a single network-agent with configuration flags; this is left to the implementor to decide.

The current network-operator architecture is based on several workarounds and is tightly integrated with the host, which makes it complicated for other people to use the network-operator as well. It relies on VRFs that are connected to each other through veth interfaces.

Our goal is to move away from the veth interfaces between VRFs and to a traditional route-leaking setup.

The network-agent (in netns mode) would run in a network namespace / container (HBR container), completely separate from the Kubernetes side.
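
As a purely illustrative sketch (not part of this issue, just an assumption about how it could be wired up): the agent could keep running as a host process and talk netlink inside the HBR container's namespace via a namespace-bound handle, using the vishvananda netns/netlink Go packages. The namespace name "hbr" below is a placeholder.

```go
package main

import (
	"log"

	"github.com/vishvananda/netlink"
	"github.com/vishvananda/netns"
)

func main() {
	// Open the HBR container's network namespace; "hbr" is a placeholder,
	// the real lookup (by name, path or PID) is up to the implementation.
	ns, err := netns.GetFromName("hbr")
	if err != nil {
		log.Fatalf("failed to open netns: %v", err)
	}
	defer ns.Close()

	// A netlink handle bound to that namespace lets the agent manage links
	// inside the container without moving its own thread there.
	handle, err := netlink.NewHandleAt(ns)
	if err != nil {
		log.Fatalf("failed to create netlink handle: %v", err)
	}
	defer handle.Close()

	links, err := handle.LinkList()
	if err != nil {
		log.Fatalf("failed to list links: %v", err)
	}
	for _, l := range links {
		log.Printf("link in HBR netns: %s", l.Attrs().Name)
	}
}
```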

Currently the setup looks like this:

VRFs: [image: diagram of the current VRF setup]

Layer2s: [image: diagram of the current Layer2 setup]

With this in mind, it would end up like this:

[image: diagram of the proposed netns-mode setup with the HBR container]

This will require changes to the FRR templates (which I can provide; they do not need to be implemented here) and to the netlink interface.

Looking at the image above, we focus on the two upper links: the veth between the node and the HBR container, and the veth trunk.

We assume (for now) that inside the container there is an interface called hbn (attached to the pre-created VRF hbr) and an interface called tr (not to be created by network-agent).

For VRFs network-agent must perform the following steps:

  • Create a VRF, a vxlan interface and a bridge interface (similar to today); see the sketch after this list
  • Skip creating a veth pair (different from today)
  • Configure FRR with a different set of templates (to be provided as implementation progresses)
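
A minimal Go sketch of these L3 steps, under the assumption that the handle is bound to the HBR namespace as shown earlier; all interface names, the routing table ID and the VNI are placeholders, not the actual naming scheme:

```go
package agent

import "github.com/vishvananda/netlink"

// createL3 sketches the per-VRF setup in netns mode: a VRF, a bridge enslaved
// to it and a vxlan device enslaved to the bridge, with no veth pair.
func createL3(handle *netlink.Handle) error {
	// VRF with a placeholder name and routing table ID.
	vrf := &netlink.Vrf{
		LinkAttrs: netlink.LinkAttrs{Name: "vr.example"},
		Table:     100,
	}
	if err := handle.LinkAdd(vrf); err != nil {
		return err
	}
	vrfLink, err := handle.LinkByName("vr.example")
	if err != nil {
		return err
	}

	// Bridge enslaved to the VRF.
	bridge := &netlink.Bridge{
		LinkAttrs: netlink.LinkAttrs{Name: "br.example", MasterIndex: vrfLink.Attrs().Index},
	}
	if err := handle.LinkAdd(bridge); err != nil {
		return err
	}
	brLink, err := handle.LinkByName("br.example")
	if err != nil {
		return err
	}

	// VXLAN device enslaved to the bridge; the VNI and port are placeholders.
	vxlan := &netlink.Vxlan{
		LinkAttrs: netlink.LinkAttrs{Name: "vx.example", MasterIndex: brLink.Attrs().Index},
		VxlanId:   4711,
		Port:      4789,
	}
	return handle.LinkAdd(vxlan)
}
```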

For Layer2s network-agent must perform the following steps:

  • Create a vxlan interface and a bridge interface (similar to today) and attach them to VRF hbr (the default / main VRF) or to one of the VRFs created above.
  • Create an interface of type vlan that references the interface tr (named something like tr.<number>), and set the master of this interface to the bridge created before; see the sketch after this list.
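
A corresponding sketch for the Layer2 steps, again with placeholder names and assuming tr and the VRF device hbr already exist inside the container:

```go
package agent

import (
	"fmt"

	"github.com/vishvananda/netlink"
)

// createL2 sketches the Layer2 setup in netns mode: a bridge and vxlan device
// attached to the hbr VRF (or another VRF), plus a vlan subinterface of tr
// enslaved to that bridge. Names, VLAN ID and VNI are placeholders.
func createL2(handle *netlink.Handle, vlanID, vni int) error {
	hbrVrf, err := handle.LinkByName("hbr")
	if err != nil {
		return err
	}

	// Bridge attached to the hbr VRF (or one of the VRFs created for L3).
	bridge := &netlink.Bridge{
		LinkAttrs: netlink.LinkAttrs{Name: "br.l2", MasterIndex: hbrVrf.Attrs().Index},
	}
	if err := handle.LinkAdd(bridge); err != nil {
		return err
	}
	brLink, err := handle.LinkByName("br.l2")
	if err != nil {
		return err
	}

	// VXLAN device enslaved to the bridge.
	vxlan := &netlink.Vxlan{
		LinkAttrs: netlink.LinkAttrs{Name: "vx.l2", MasterIndex: brLink.Attrs().Index},
		VxlanId:   vni,
	}
	if err := handle.LinkAdd(vxlan); err != nil {
		return err
	}

	// vlan subinterface of the pre-existing trunk interface tr (e.g. tr.100),
	// enslaved to the bridge created above.
	tr, err := handle.LinkByName("tr")
	if err != nil {
		return err
	}
	vlan := &netlink.Vlan{
		LinkAttrs: netlink.LinkAttrs{
			Name:        fmt.Sprintf("tr.%d", vlanID),
			ParentIndex: tr.Attrs().Index,
			MasterIndex: brLink.Attrs().Index,
		},
		VlanId: vlanID,
	}
	return handle.LinkAdd(vlan)
}
```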

For Layer2 there is an additional step, and I am unsure where it should end up: there might be a need for a very small network-agent-host that creates the tr.<number> interfaces on the host network namespace side, see the picture above.

chdxD1 added the enhancement label on May 23, 2024
@p-strusiewiczsurmacki-mobica
Contributor

> attach them to VRF hbr (default / main VRF) or a different VRF from above

How should it be decided which VRF to select?

@p-strusiewiczsurmacki-mobica
Contributor

p-strusiewiczsurmacki-mobica commented Jun 4, 2024

Hi @chdxD1, I've created a branch with some changes here: https://github.com/p-strusiewiczsurmacki-mobica/das-schiff-network-operator/blob/agent-gradual-netns (this contains changes relevant to #110, #112 and #121 as well).

I've separated the code that I think will be common for netns and vrf-ibgp modes in pkg/adapters/netlink.
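
Roughly, the idea is a common interface that both modes implement. As an illustrative sketch only (the type and method names below are made up, not the actual ones in the branch):

```go
// Package adapters is a hypothetical stand-in for pkg/adapters; the real
// package layout and names are in the linked branch.
package adapters

// Mode covers what both the netns and vrf-ibgp implementations would need to
// provide, so the reconcilers can stay agnostic of how interfaces are wired up.
type Mode interface {
	// ReconcileLayer3 creates/updates the VRF, bridge and vxlan devices for L3.
	ReconcileLayer3() error
	// ReconcileLayer2 creates/updates the bridge, vxlan and tr.<number> vlan devices for L2.
	ReconcileLayer2() error
}
```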

For netns mode - in pkg/adapters/netns:

For VRFs (Layer 3):

For Layer2:

Currently I'm trying to figure out how to test this.

I also have some questions:

  • How are the tr.<number> interface and the bridge master determined? Should they be specified in Layer2NetworkConfiguration, or should they be created dynamically?

  • Should all the code that is used to reconcile existing Layer 2 configurations (https://github.com/telekom/das-schiff-network-operator/blob/main/pkg/nl/layer2.go#L250) also be used here? I can't see much connection between the creation of the interfaces described above and the reconciliation of existing configurations (or the cleanup when a configuration is deleted), so I might be missing something here.
