Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update OVS pipeline document #5412

Merged
merged 5 commits into from
Apr 26, 2024

Conversation

hongliangl
Copy link
Contributor

@hongliangl hongliangl commented Aug 21, 2023

Resolves #5200

  • PipelineRootClassifierTable
  • ARPSpoofGuardTable
  • ARPResponderTable
  • ClassifierTable
  • SpoofGuardTable
  • UnSNATTable
  • ConntrackTable
  • ConntrackStateTable
  • PreRoutingClassifierTable
  • SessionAffinityTable
  • ServiceLBTable
  • EndpointDNATTable
  • AntreaPolicyEgressRuleTable
  • EgressRuleTable
  • EgressDefaultTable
  • EgressMetricTable
  • L3ForwardingTable
  • EgressMarkTable
  • L3DecTTLTable
  • SNATMarkTable
  • SNATTable
  • L2ForwardingCalcTable
  • TrafficControlTable
  • IngressSecurityClassifierTable
  • AntreaPolicyIngressRuleTable
  • IngressRuleTable
  • IngressDefaultTable
  • IngressMetricTable
  • ConntrackCommitTable
  • OutputTable

@hongliangl hongliangl added the kind/documentation Categorizes issue or PR as related to a documentation. label Aug 21, 2023
@luolanzone luolanzone added this to the Antrea v1.15 release milestone Oct 30, 2023
@hongliangl hongliangl force-pushed the 20230902-ovs-pipeline-doc-1 branch 2 times, most recently from 47f1121 to f660c7e Compare January 12, 2024 02:09
@hongliangl hongliangl changed the title [WIP] Update OVS pipeline document Update OVS pipeline document Jan 17, 2024
@hongliangl hongliangl marked this pull request as ready for review January 17, 2024 06:57
@hongliangl
Copy link
Contributor Author

Not finished all tables yet.

docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
Copy link
Contributor

@luolanzone luolanzone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not finished.

docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Took a quick look, but I feel like it will be easier to keep reviewing once inital comments from other reviewers have been addressed

docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
@hongliangl hongliangl force-pushed the 20230902-ovs-pipeline-doc-1 branch 4 times, most recently from 076fd8d to 0226d1f Compare January 22, 2024 12:34
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
@hongliangl hongliangl force-pushed the 20230902-ovs-pipeline-doc-1 branch 2 times, most recently from b99a639 to 319b724 Compare January 23, 2024 02:14
@hongliangl hongliangl force-pushed the 20230902-ovs-pipeline-doc-1 branch 3 times, most recently from 46d475f to 343a7e3 Compare January 24, 2024 08:02
Copy link
Contributor

@wenyingd wenyingd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel the explanations on flows are too detailed...

docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
The first flow outputs all unicast packets to the correct port (the port was
resolved by the "dmac" table, [L2ForwardingCalcTable]). IP packets for which
[L2ForwardingCalcTable] did not set bit 16 of NXM_NX_REG0 will be dropped.
- Flow 1 is to match new non-Service connections and commit them to the connection tracking module.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why we don't use "-dnat" state in the flows in this table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have a ct mark to distinguish Service packets. The ct mark is sufficient.

docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
Copy link
Contributor

@luolanzone luolanzone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, a few nits

docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
port by rewriting the destination MAC (from the Global Virtual MAC to the
local gateway's MAC).
Antrea-native NetworkPolicy relies on the OVS built-in `conjunction` action to implement policies efficiently. This
enables us to do a conjunctive match across multiple dimensions (source IP, destination IP, port) efficiently without
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, it's better to add etc.

Suggested change
enables us to do a conjunctive match across multiple dimensions (source IP, destination IP, port) efficiently without
enables us to do a conjunctive match across multiple dimensions (source IP, destination IP, port, etc.) efficiently without

docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
Resolves antrea-io#5200

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Almost done with reviewing, lots of comments but all of them minor so far


This ACNP is applied to all Pods with the label `app: web` in all Namespaces. It allows only HTTP ingress traffic on
port 8080 from Pods with the label `app: client`, limited to the `GET` method and `/api/v2/*` path. Any other HTTP
ingress traffic on port 8080 from Pods the label `app: client` will be dropped.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
ingress traffic on port 8080 from Pods the label `app: client` will be dropped.
ingress traffic on port 8080 from Pods with the label `app: client` will be dropped.

Consider the following Egresses as examples for the remainder of this document.

This is an Egress applied to Pods with the label `app: web`. For these Pods, all egress traffic will be SNAT'd on the
Node `k8s-node-control-plane` from which we dumped flows in the document with the Egress IP `192.168.77.112`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am suggesting replacing the second sentence here with this for clarity:

For these Pods, all egress traffic (traffic leaving the cluster) will be SNAT'd on the Node k8s-node-control-plane using Egress IP 192.168.77.112. In this context, k8s-node-control-plane is known as the "Egress Node" for this Egress resource. Note that the flows presented in the rest of this document were dumped on Node k8s-node-control-plane. Egress flows are different on the "source Node" (Node running a workload Pod to which the Egress resource is applied) and on the "Egress Node" (Node enforcing the SNAT policy).


Flow 1 is designed for case 1, matching ARP request packets for the MAC address of a remote Antrea gateway with IP address
`10.10.1.1`. It programs an ARP reply packet and sends it back to the port where the request packet was received. Note
that both the source hardware address and the source MAC address in the ARP reply packet are set with the *Global Virtual
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/set with/set to

other side of the tunnel.
A similar route is installed on the local Antrea gateway (antrea-gw0) interface every time the Antrea *Node Route Controller*
is notified that a new Node has joined the cluster. The route must be marked as "onlink" since the kernel does not have
a route to the peer gateway `10.10.1.1`. we trick the kernel into believing that `10.10.1.1` is directly connected to
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
a route to the peer gateway `10.10.1.1`. we trick the kernel into believing that `10.10.1.1` is directly connected to
a route to the peer gateway `10.10.1.1`. We "trick" the kernel into believing that `10.10.1.1` is directly connected to


### ServiceHairpinTable (23)
This table is designed to determine the "category" of packets by matching the ingress port of the packets. It
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This table is designed to determine the "category" of packets by matching the ingress port of the packets. It
This table is designed to determine the "category" of IP packets by matching on their ingress port. It

Service Endpoint selection has been done.

Flow 8 and flow 9 are for case 6. Flow 8 has the higher priority than that of flow 9, prioritizing matching the first
packet of connection sourced from a local Pod or the local Node with `FromLocalRegMark` loaded in table [Classifier]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/of connection sourced/of connections sourced

packet of connection sourced from a local Pod or the local Node with `FromLocalRegMark` loaded in table [Classifier]
and destined for the sample [Service with ExternalTrafficPolicy Local]. The target of flow 8 is an OVS group that has
all the Endpoints across the cluster, ensuring accessibility for Service connections originating from local Pods or
Nodes, regardless that `externalTrafficPolicy` of the Service is `Local`. Due to the existence of flow 8, consequently,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Nodes, regardless that `externalTrafficPolicy` of the Service is `Local`. Due to the existence of flow 8, consequently,
Nodes, even though `externalTrafficPolicy` is set to `Local` for the Service. Due to the existence of flow 8, consequently,


### EgressRuleTable (50)
The table implements DNAT for Service connection after Endpoint selection is performed in table [ServiceLB].
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/Service connection/Service connections

1. table=EndpointDNAT, priority=200,reg0=0x4000/0x4000 actions=controller(reason=no_match,id=62373,userdata=04)
2. table=EndpointDNAT, priority=200,tcp,reg3=0xa0a0018,reg4=0x20050/0x7ffff actions=ct(commit,table=AntreaPolicyEgressRule,zone=65520,nat(dst=10.10.0.24:80),exec(set_field:0x10/0x10->ct_mark,move:NXM_NX_REG0[0..3]->NXM_NX_CT_MARK[0..3]))
3. table=EndpointDNAT, priority=200,tcp,reg3=0xa0a0106,reg4=0x20050/0x7ffff actions=ct(commit,table=AntreaPolicyEgressRule,zone=65520,nat(dst=10.10.1.6:80),exec(set_field:0x10/0x10->ct_mark,move:NXM_NX_REG0[0..3]->NXM_NX_CT_MARK[0..3]))
4. table=EndpointDNAT, priority=190,reg4=0x20000/0x70000 actions=set_field:0x10000/0x70000->reg4,resubmit(,ServiceLB)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed that you didn't describe this flow below. Can you add a quick description?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add. I forgot the description of flows 4 and 5 in this table.

Comment on lines +1059 to +1075
2. table=EndpointDNAT, priority=200,tcp,reg3=0xa0a0018,reg4=0x20050/0x7ffff actions=ct(commit,table=AntreaPolicyEgressRule,zone=65520,nat(dst=10.10.0.24:80),exec(set_field:0x10/0x10->ct_mark,move:NXM_NX_REG0[0..3]->NXM_NX_CT_MARK[0..3]))
3. table=EndpointDNAT, priority=200,tcp,reg3=0xa0a0106,reg4=0x20050/0x7ffff actions=ct(commit,table=AntreaPolicyEgressRule,zone=65520,nat(dst=10.10.1.6:80),exec(set_field:0x10/0x10->ct_mark,move:NXM_NX_REG0[0..3]->NXM_NX_CT_MARK[0..3]))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

quick question out of curiosity: OVS does not let us invoke the nat action with a dst value stored in a register? I didn't remember that we had individual flows for each endpoint her.

cc @wenyingd @tnqn

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc says:

nat(type=addrs[:ports][,flag]...)

The src and dst options take the following arguments:

addrs
The IP address addr or range addr1-addr2 from which the translated address should be selected. If only one address is given, then that address will always be selected, otherwise the address selection can be informed by the optional persistent flag as described below. Either IPv4 or IPv6 addresses can be provided, but both addresses must be of the same type, and the datapath behavior is undefined in case of providing IPv4 address range for an IPv6 packet, or IPv6 address range for an IPv4 packet. IPv6 addresses must be bracketed with [ and ] if a port range is also given.

So the answer should be no.

Copy link
Contributor

@antoninbas antoninbas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done with the review

please address comments using a new / separate commit, so it's easy for me to do a final round of review later

docs/design/ovs-pipeline.md Show resolved Hide resolved

For this table, you will need to keep in mind the Antrea-native NetworkPolicy
[specification](#antrea-native-networkpolicy-implementation). Since the sample egress policy resides in the Application
Tier. If you dump the flows of this table, you may see the following:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Tier. If you dump the flows of this table, you may see the following:
Tier, if you dump the flows of this table, you may see the following:

docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
```
- Flow 7 is used to match packets with the source IP address in set {10.10.0.24}, which is from the Pods selected
by the label `app: web`, constituting the first dimension for `conjunction` with `conj_id` 5.
- Flow 8 is used to match any packets, constituting the second dimension for `conjunction` with `conj_id` 5.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you know why we even have this flow? Is it because we need at least 2 dimensions for a conjunctive match? If yes, maybe we should indicate it to avoid confusion from the reader. cc @wenyingd

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it because we need at least 2 dimensions for a conjunctive match?

That was what I thought, and I didn't think more about it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's just add one sentence about it at the end of the bullet point:

This flow, which matches all packets, exists because we need at least 2 dimensions for a conjunctive match.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will add

[specification](#network-policy-implementation) that we are using. We have 2
Pods running on the same Node, with IP addresses 10.10.1.2 to 10.10.1.3. They
are allowed to talk to each other using TCP on port 80, but nothing else.
### AntreaPolicyIngressRule
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please make sure that the comments I left for Egress security tables are also addressed for these tables, when applicable

Flow 1 is for case 1. It matches packets with `L7NPRedirectCTMark` and `OutputToOFPortRegMark`, and then outputs them to
the port `antrea-l7-tap0` specifically created for connecting to an application-aware engine. Notably, these packets are pushed
with an 802.1Q header and loaded with the VLAN ID value persisted in `L7NPRuleVlanIDCTLabel` before being output, due to
the implementation of `L7NetworkPolicy`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though it is a bit orthogonal to the description of this table, I would add the following sentence at the end of the paragraph:

The application-aware engine enforcing L7 policies (e.g., Suricata) can leverage the VLAN ID to determine which set of rules to apply to the packet.

Flow 5 is for case 5, matching packets sent back from an application-aware engine through a specific port and forwarding
them to table [L3Forwarding] to decide the egress port. Like flow 4, the purpose of forwarding the packets to table
[L3Forwarding] is to load tunnel destination IP for packets destined for remote Nodes. `FromTCReturnRegMark` that will
be used in table [TrafficControl] is also loaded to mark the packet source.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe mention that the VLAN tag which was previously added for use by the L7 engine is also removed.

docs/design/ovs-pipeline.md Show resolved Hide resolved
the value stored in `TargetOFPortField`.

Flows 6-7 are for case 6. They match packets by matching `OutputToControllerRegMark` and the value stored in
`PacketInOperationField`, then output them to the OpenFlow controller (Antrea Agent) with corresponding user data.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add the following at the end of the paragraph:

In practice, you will see additional flows similar to these ones to accommodate different scenarios (different PacketInOperationField values). Note that packets sent to controller are metered to avoid overrunning the Antrea Agent and using too many resources.

@hongliangl
Copy link
Contributor Author

@antoninbas Not addressed all review comments. Will continue that several hours later.

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
@@ -1078,15 +1093,19 @@ Some bits of ct mark are persisted:
- The value of `PktSourceField` is persisted to `ConnSourceCTMarkField`, storing the source of the connection for the
current packet and subsequent packets of the connection.

Flow 4 is to resubmit the packets which are not matched by flows 1-3 back to table [ServiceLB] to select Endpoint again.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may not have to add more information now as we are close to release time, but is this in case of an inconsistency between the OVS groups used for Endpoint section and this table?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but is this in case of an inconsistency between the OVS groups used for Endpoint section and this table?

I guess this flow should be rarely used, unless the case you mentioned. Furthermore, it seems that this flows almost exists with AntreaProxy. I am not sure if there are other reasons for this flow.

docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
will first go through the gateway, get load-balanced by the kube-proxy data path (DNAT) then re-enter through the
gateway. Then the packets are received on the gateway port with a source IP belonging to a local Pod.
will first go through the gateway port, get load-balanced by the kube-proxy data path (undergo DNAT with a local Endpoint
selected by the kube-proxy) then re-enter through the gateway port. Then the packets are received on the gateway port
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I may be wrong but I think that is also the case when kube-proxy selects a non-local Endpoint: I don't think kube-proxy will masquerade the traffic and the packet will go back through the gateway to be sent to the remote Pod over the appropriate tunnel. The only exception is the hairpin case.

So maybe replace "undergo DNAT with a local Endpoint selected by the kube-proxy" with "undergoes DNAT", unless I am missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You reminded me. When kube-proxy selects a non-local Endpoint, after DNAT, the packets will be forwarded to antrea-gw0 by the onlink routing added by Antrea NodeController.

Copy link
Contributor Author

@hongliangl hongliangl Apr 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added something to remind the reader like this in bold:

When Antrea is deployed with kube-proxy, and AntreaProxy is not enabled, packets from local Pods destined for Services
will first go through the gateway port, get load-balanced by the kube-proxy data path (undergoes DNAT) then re-enter
the OVS pipeline through the gateway port (through a "onlink" route, installed by Antrea, directing the DNAT'd packets to the gateway port

Services will get load-balanced by the kube-proxy data path (DNAT) and then routed to OVS through the gateway without SNAT.
network destined for Services will be routed to OVS through the gateway port without changing the source IP.
- When Antrea is deployed with kube-proxy, packets from the external network destined for Services whose
`externalTrafficPolicy` is set to `Local` will get load-balanced by the kube-proxy data path (undergo DNAT with a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/undergo/undergoes

@@ -1350,30 +1374,31 @@ If you dump the flows of this table, you may see the following:
```

Flows 1-2 match packets originating from local Pods and destined for the transport IP of remote Nodes, and then forward
them to table [L2ForwardingCalc] to skip Egress SNAT. `ToGatewayRegMark` is loaded, indicating that the output port of
the packets is the local Antrea gateway.
them to table [L2ForwardingCalc] to bypass the Pod-to-Node traffic from Egress SNAT. `ToGatewayRegMark` is loaded,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"to bypass the Pod-to-Node traffic from Egress SNAT" should be replaced with "to bypass Egress SNAT for Pod-to-Node traffic"

docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
docs/design/ovs-pipeline.md Outdated Show resolved Hide resolved
Comment on lines 1390 to 1392
The packets, matched by flows 1-3, are forwared to this table by flow 8 in table [L3Forwarding], as they are classified
as part of traffic destined for the external network. However, these packets are not intended to undergo Egress SNAT.
Consequently, flows 1-3 are used to bypass Egress SNAT for these packets.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great clarification, thanks

@hongliangl hongliangl force-pushed the 20230902-ovs-pipeline-doc-1 branch 2 times, most recently from 7ab3cf9 to 5b4ec04 Compare April 26, 2024 06:51
Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
luolanzone
luolanzone previously approved these changes Apr 26, 2024
Copy link
Contributor

@luolanzone luolanzone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@luolanzone luolanzone mentioned this pull request Apr 26, 2024
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just one comment

bits [16..18] in NXM_NX_REG4 is set in [SessionAffinityTable] by flow `table=40, priority=0 actions=load:0x1->NXM_NX_REG4[16..18]`.
For reply packets from SNAT'd connections, whose destination IP is the translated SNAT IP, after invoking action `ct`,
the destination IP of the packets will be restored to the original IP before SNAT, stored in the connection tracking
field `ct_nw_dst` before SNAT.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
field `ct_nw_dst` before SNAT.
field `ct_nw_dst`.

Comment on lines +1059 to +1075
2. table=EndpointDNAT, priority=200,tcp,reg3=0xa0a0018,reg4=0x20050/0x7ffff actions=ct(commit,table=AntreaPolicyEgressRule,zone=65520,nat(dst=10.10.0.24:80),exec(set_field:0x10/0x10->ct_mark,move:NXM_NX_REG0[0..3]->NXM_NX_CT_MARK[0..3]))
3. table=EndpointDNAT, priority=200,tcp,reg3=0xa0a0106,reg4=0x20050/0x7ffff actions=ct(commit,table=AntreaPolicyEgressRule,zone=65520,nat(dst=10.10.1.6:80),exec(set_field:0x10/0x10->ct_mark,move:NXM_NX_REG0[0..3]->NXM_NX_CT_MARK[0..3]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The doc says:

nat(type=addrs[:ports][,flag]...)

The src and dst options take the following arguments:

addrs
The IP address addr or range addr1-addr2 from which the translated address should be selected. If only one address is given, then that address will always be selected, otherwise the address selection can be informed by the optional persistent flag as described below. Either IPv4 or IPv6 addresses can be provided, but both addresses must be of the same type, and the datapath behavior is undefined in case of providing IPv4 address range for an IPv6 packet, or IPv6 address range for an IPv4 packet. IPv6 addresses must be bracketed with [ and ] if a port range is also given.

So the answer should be no.

Signed-off-by: Quan Tian <quan.tian@broadcom.com>
@tnqn tnqn merged commit e23d074 into antrea-io:main Apr 26, 2024
31 of 37 checks passed
@hongliangl hongliangl deleted the 20230902-ovs-pipeline-doc-1 branch April 26, 2024 13:43
@hongliangl
Copy link
Contributor Author

Thanks you guys for reviewing this big PR!

@antoninbas
Copy link
Contributor

Thanks you guys for reviewing this big PR!

Thanks @hongliangl for working on this massive documentation PR!

@luolanzone
Copy link
Contributor

echo @antoninbas
Thanks @hongliangl for working on this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/documentation Categorizes issue or PR as related to a documentation.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Document of ovs-pipeline needs to be updated.
6 participants