Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix that local traffic cannot be identified in networkPolicyOnly mode #6251

Conversation

hongliangl
Copy link
Contributor

Before this commit, in AntreaProxy, to respect short-circuiting, when installing flows for an external Services, an extra flow with higher priority to match traffic sourced from local (local Pods or local Node) and destined for the external Service will be installed. This is achieved by matching the local Pod CIDR obtained from the local Node object. However, when Antrea is deployed in networkPolicyOnly mode, the Pod CIDR in the local Node object is nil, resulting in the failure of install the extra flow mentioned above. To fix the issue, a new reg mark FromLocalRegMark identifying traffic from local Pods or the local Node is introduced to mark the traffic from local. This reg mark can be used in all traffic mode.

Fix #6244

@hongliangl hongliangl added action/backport Indicates a PR that requires backports. action/release-note Indicates a PR that should be included in release notes. area/ovs/openflow Issues or PRs related to Open vSwitch Open Flow. labels Apr 22, 2024
@hongliangl hongliangl force-pushed the 20240419-use-regmark-to-identify-svc-traffic-from-pod branch 2 times, most recently from 2dd2aad to 445be39 Compare April 23, 2024 05:36
pkg/agent/openflow/fields.go Outdated Show resolved Hide resolved
pkg/agent/openflow/pipeline.go Outdated Show resolved Hide resolved
Before this commit, in AntreaProxy, to respect short-circuiting, when
installing flows for an external Services, an extra flow with higher priority
to match traffic sourced from local (local Pods or local Node) and destined
for the external Service will be installed. This is achieved by matching
the local Pod CIDR obtained from the local Node object. However, when
Antrea is deployed in networkPolicyOnly mode, the Pod CIDR in the local Node
object is nil, resulting in the failure of install the extra flow mentioned
above. To fix the issue, a new reg mark `FromLocalRegMark` identifying traffic
from local Pods or the local Node is introduced to mark the traffic from
local. This reg mark can be used in all traffic mode.

Fix antrea-io#6244

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
@hongliangl hongliangl force-pushed the 20240419-use-regmark-to-identify-svc-traffic-from-pod branch from 445be39 to 5ccf210 Compare April 23, 2024 08:04
@hongliangl hongliangl requested a review from tnqn April 23, 2024 08:05
@hongliangl hongliangl added this to the Antrea v2.0 release milestone Apr 23, 2024
@hongliangl
Copy link
Contributor Author

/test-all

Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, cc @wenyingd to take another look

@@ -533,7 +533,7 @@ func (f *featurePodConnectivity) gatewayClassifierFlows() []binding.Flow {
MatchInPort(f.gatewayPort).
MatchProtocol(ipProtocol).
MatchSrcIP(gatewayIP).
Action().LoadRegMark(FromGatewayRegMark).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit confused, do you mean that Pod traffic in NetworkPolicyOnly scenario will use antrea gatewayIP as the source? If so, this is an invalid assumption.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used to accommodate with proxyAll enabled. When proxyAll is enabled, the source IP of Service traffic from the local Node is the gateway IP, and such traffic should be also recognized as local. For NetworkPolicyOnly scenario, this flow will never be used as the Service traffic originated from the local Node will never be handled by Antrea.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the Service traffic originated from the local Node will never be handled by Antrea

It's a bit of an orthogonal discussion but I have a follow up question on this: does that mean that proxyAll can never be used in policyOnly mode? What would prevent that?

Copy link
Contributor Author

@hongliangl hongliangl Apr 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would prevent that?

So far, nothing prevents that, and antrea-agent gets crashed when proxyAll is enabled in networkPolicyMode.

does that mean that proxyAll can never be used in policyOnly mode?

What I said about this flow will never be used as the Service traffic originated from the local Node will never be handled by Antrea was not rigorous. In fact. the proxyAll can be enabled in networkPolicyMode, but Antrea gets something wrong when setting static neighbor, which is prepared for installing static routes for forwarding Service traffic from the external network or the local Node to antrea-gw0. The root cause of this issue should be that the index of antrea-gw0 in nodeConfig is not initialized in networkPolicyOnly mode.

I have two simple ideas:

  1. Prevent proxyAll being enabled in networkPolicyMode mode.
  2. Fix the crash issue mentioned above, and proxyAll can be enabled in networkPolicyMode mode. However, I'm not sure if the route installed by Antrea for forwarding Service traffic has some "conflicts" with the primary CNI.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hongliangl could you another issue for proxyAll with networkPolicyOnly mode?
We can merge the PR first, at least it doesn't worse proxyAll with networkPolicyOnly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, will do it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes at the very least we should fail early when we validate options, and longer term we can investigate supporting proxyAll for policyOnly mode.

@wenyingd
Copy link
Contributor

/test-windows-all

@hongliangl
Copy link
Contributor Author

/test-conformance

@tnqn tnqn merged commit 27424c3 into antrea-io:main Apr 24, 2024
56 of 59 checks passed
@hongliangl hongliangl deleted the 20240419-use-regmark-to-identify-svc-traffic-from-pod branch April 24, 2024 06:34
hongliangl added a commit to hongliangl/antrea that referenced this pull request Apr 28, 2024
…antrea-io#6251)

Before this commit, in AntreaProxy, to respect short-circuiting, when
installing flows for an external Services, an extra flow with higher priority
to match traffic sourced from local (local Pods or local Node) and destined
for the external Service will be installed. This is achieved by matching
the local Pod CIDR obtained from the local Node object. However, when
Antrea is deployed in networkPolicyOnly mode, the Pod CIDR in the local Node
object is nil, resulting in the failure of install the extra flow mentioned
above. To fix the issue, a new reg mark `FromLocalRegMark` identifying traffic
from local Pods or the local Node is introduced to mark the traffic from
local. This reg mark can be used in all traffic mode.

Fix antrea-io#6244

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
hongliangl added a commit to hongliangl/antrea that referenced this pull request Apr 28, 2024
…antrea-io#6251)

Before this commit, in AntreaProxy, to respect short-circuiting, when
installing flows for an external Services, an extra flow with higher priority
to match traffic sourced from local (local Pods or local Node) and destined
for the external Service will be installed. This is achieved by matching
the local Pod CIDR obtained from the local Node object. However, when
Antrea is deployed in networkPolicyOnly mode, the Pod CIDR in the local Node
object is nil, resulting in the failure of install the extra flow mentioned
above. To fix the issue, a new reg mark `FromLocalRegMark` identifying traffic
from local Pods or the local Node is introduced to mark the traffic from
local. This reg mark can be used in all traffic mode.

Fix antrea-io#6244

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
hongliangl added a commit to hongliangl/antrea that referenced this pull request Apr 28, 2024
…antrea-io#6251)

Before this commit, in AntreaProxy, to respect short-circuiting, when
installing flows for an external Services, an extra flow with higher priority
to match traffic sourced from local (local Pods or local Node) and destined
for the external Service will be installed. This is achieved by matching
the local Pod CIDR obtained from the local Node object. However, when
Antrea is deployed in networkPolicyOnly mode, the Pod CIDR in the local Node
object is nil, resulting in the failure of install the extra flow mentioned
above. To fix the issue, a new reg mark `FromLocalRegMark` identifying traffic
from local Pods or the local Node is introduced to mark the traffic from
local. This reg mark can be used in all traffic mode.

Fix antrea-io#6244

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
antoninbas pushed a commit that referenced this pull request Apr 30, 2024
…#6251) (#6268)

Before this commit, in AntreaProxy, to respect short-circuiting, when
installing flows for an external Services, an extra flow with higher priority
to match traffic sourced from local (local Pods or local Node) and destined
for the external Service will be installed. This is achieved by matching
the local Pod CIDR obtained from the local Node object. However, when
Antrea is deployed in networkPolicyOnly mode, the Pod CIDR in the local Node
object is nil, resulting in the failure of install the extra flow mentioned
above. To fix the issue, a new reg mark `FromLocalRegMark` identifying traffic
from local Pods or the local Node is introduced to mark the traffic from
local. This reg mark can be used in all traffic mode.

Fix #6244

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
antoninbas pushed a commit that referenced this pull request May 1, 2024
…#6251) (#6269)

Before this commit, in AntreaProxy, to respect short-circuiting, when
installing flows for an external Services, an extra flow with higher priority
to match traffic sourced from local (local Pods or local Node) and destined
for the external Service will be installed. This is achieved by matching
the local Pod CIDR obtained from the local Node object. However, when
Antrea is deployed in networkPolicyOnly mode, the Pod CIDR in the local Node
object is nil, resulting in the failure of install the extra flow mentioned
above. To fix the issue, a new reg mark `FromLocalRegMark` identifying traffic
from local Pods or the local Node is introduced to mark the traffic from
local. This reg mark can be used in all traffic mode.

Fix #6244

Signed-off-by: Hongliang Liu <lhongliang@vmware.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/backport Indicates a PR that requires backports. action/release-note Indicates a PR that should be included in release notes. area/ovs/openflow Issues or PRs related to Open vSwitch Open Flow.
Projects
None yet
4 participants