Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support --random-fully for iptables SNAT / MASQUERADE rules #6602

Merged

Conversation

antoninbas
Copy link
Contributor

@antoninbas antoninbas commented Aug 8, 2024

We expose 2 new configuration parameters for the Agent: snatFullyRandomPorts and egress.snatFullyRandomPorts. When the first one is set to true, the MASQUERADE iptables rules used to implement "local" Node SNAT will use randomized source port mappings. When the second one is set to true, the SNAT iptables rules used to implement Egress SNAT will use randomized source port mappings. This is achieved with the --random-fully iptables flag.

These new configuration parameters are only supported on Linux Nodes. They require iptables >= 1.6.2, which is not a problem with the standard antrea-agent container images. They also require Linux kernel >= 3.13 and >= 3.14 respectively, but these are very old releases so we can ignore this requirment, which is consistent with the K8s implementation.

snatFullyRandomPorts is set to false by default (this may change in the future). egress.snatFullyRandomPorts is set to null (empty) by default, which means that unless the parameter is explicitly set, we will use the top-level snatFullyRandomPorts value.

Fixes #6544

@antoninbas antoninbas requested a review from tnqn August 8, 2024 23:06
pkg/config/agent/config.go Show resolved Hide resolved
@@ -439,7 +441,7 @@ COMMIT
-A ANTREA-OUTPUT -m comment --comment "Antrea: DNAT local to NodePort packets" -m set --match-set ANTREA-NODEPORT-IP dst,dst -j DNAT --to-destination 169.254.0.252
:ANTREA-POSTROUTING - [0:0]
-A ANTREA-POSTROUTING -m comment --comment "Antrea: SNAT Pod to external packets" ! -o antrea-gw0 -m mark --mark 0x00000001/0x000000ff -j SNAT --to 1.1.1.1
-A ANTREA-POSTROUTING -m comment --comment "Antrea: masquerade Pod to external packets" -s 172.16.10.0/24 -m set ! --match-set ANTREA-POD-IP dst ! -o antrea-gw0 -j MASQUERADE
-A ANTREA-POSTROUTING -m comment --comment "Antrea: masquerade Pod to external packets" -s 172.16.10.0/24 -m set ! --match-set ANTREA-POD-IP dst ! -o antrea-gw0 -j MASQUERADE --random-fully
-A ANTREA-POSTROUTING -m comment --comment "Antrea: masquerade LOCAL traffic" -o antrea-gw0 -m addrtype ! --src-type LOCAL --limit-iface-out -m addrtype --src-type LOCAL -j MASQUERADE --random-fully
-A ANTREA-POSTROUTING -m comment --comment "Antrea: masquerade OVS virtual source IP" -s 169.254.0.253 -j MASQUERADE
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I want to highlight this masquerade rule (used by proxyAll) for which we have to way to configure randomized source port mappings. I don't think it is an issue. It also wasn't part of the user feature request. IIRC, this traffic goes through 2 rounds of SNAT: once in OVS and once in the host network. So at this point the port has already been remapped by OVS (hash or random port mapping, which is different from the default iptables SNAT behavior?). One could argue that --random-fully would actually be a good default here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So at this point the port has already been remapped by OVS (hash or random port mapping, which is different from the default iptables SNAT behavior?)

Invalid statement, see https://github.com/torvalds/linux/blob/3d5f968a177d468cd13568ef901c5be84d83d32b/net/openvswitch/conntrack.c#L1092

@antoninbas antoninbas added the action/release-note Indicates a PR that should be included in release notes. label Aug 8, 2024
Comment on lines 65 to 66
// from local Pods to the external network. This has no impact on the SNAT rules for the
// Egress feature. This has no impact on the SNAT rules for the Egress feature. Default is
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are two "This has no impact on the SNAT rules for the Egress feature."

pkg/config/agent/config.go Show resolved Hide resolved
@@ -126,6 +126,10 @@ trafficEncapMode: {{ .Values.trafficEncapMode | quote }}
# performs SNAT and this option will be ignored; for other modes it must be set to false.
noSNAT: {{ .Values.noSNAT }}

# Fully randomize source port mapping in SNAT rules enforced by Nodes for egress traffic from local
# Pods to the external network. This has no impact on the SNAT rules for the Egress feature.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While considering the use cases, I couldn't think of a scenario where users prefer random source port for Node SNAT but prefer original source port for Egress SNAT, or vice verse. And requiring users to set two options of the same name could be error-prone. We received a few false bug reports in the past, which turned out to be because user set a feature gate in controller.conf but not agent.conf. I feel this could be similar, though the consequence of any values of the two options is not really critical.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tend to agree with you. That was my preference, but I based the implementation on the user's comment: #6544 (comment)

I think that overall, there is very little reason not to use --random-fully if it is available.
There was one other approach I was considering: we can make egress.snatFullyRandomPorts a pointer and when omitted it will use whichever value is configured for snatFullyRandomPorts. This way we still have the toggle, but it's out of the way, and users only need to set one configuration. I can push a commit so you can take a look at what I mean.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tnqn I pushed an update, PTAL
If you still think that it is not needed and that we should not introduce the complexity in Antrea code, let me know. I don't feel strongly about it and given the lack of tangible use case, I am ok with removing the extra configuration parameter altogether.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still prefer a single configuration parameter a bit for simplicity. Perhaps the new code can be added when there is such a use case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We got good feedback from @Scoobed (who requested the feature) for the given configuration.
"Hiding" the second parameter from the manifest as is the case in the current version may be a good compromise and help avoid confusion? That being said, I am still on the fence myself.

@Scoobed @robbo10 do you really see a need for egress.snatFullyRandomPorts and the following installation commands:

helm install -n kube-system antrea antrea/antrea --set snatFullyRandomPorts=true --set egress.snatFullyRandomPorts=false
helm install -n kube-system antrea antrea/antrea --set egress.snatFullyRandomPorts=true

These commands would enable fine-grained control over which specific SNAT rules use --random-fully.

@tnqn and I believe that most users would end up just doing the following, which would set --random-fully for all SNAT rules (Node-local and Egress):

helm install -n kube-system antrea antrea/antrea --set snatFullyRandomPorts=true

tnqn
tnqn previously approved these changes Aug 12, 2024
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Comment on lines 209 to 219
{{- if ne .snatFullyRandomPorts nil }}
# Fully randomize source port mapping in Egress SNAT rules. This has no impact on the default SNAT
# rules enforced by each Node for local Pod traffic. By default, we use the same value as for the
# top-level snatFullyRandomPorts configuration, but this field can be used as an override.
snatFullyRandomPorts: {{ .snatFullyRandomPorts }}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to hide the parameter when it's not specified?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was my plan, even though I see that we don't do it for any other configuration parameter.
My rationale was that: 1) most users should not need it and setting the top-level .snatFullyRandomPorts parameter should be enough, 2) it will avoid confusion / issues if users search the parameter by name in the manifest and up setting .egress.snatFullyRandomPorts when they meant to set .snatFullyRandomPorts. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel it would become a bit contradictory: if we assume two options are useful and the feature requester needs to set both, they will only find one option in the published yaml: https://github.com/antrea-io/antrea/blob/ce833392aa4bc77464969345b7e3ef85da63728d/build/yamls/antrea.yml. Only helm users would know there are two options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pushed a commit to stop hiding the field when it is nil / empty

@Scoobed
Copy link

Scoobed commented Aug 12, 2024

Thanks for working on this issue. The configuration is what I was expecting from those conversations.

@robbo10
Copy link

robbo10 commented Aug 19, 2024

Hi @antoninbas - would you have an ETA on when this could be made available? We have a few products hitting bottlenecks which would greatly appreciate this feature 🙂

@antoninbas antoninbas added this to the Antrea v2.2 release milestone Aug 19, 2024
@antoninbas
Copy link
Contributor Author

Hi @antoninbas - would you have an ETA on when this could be made available? We have a few products hitting bottlenecks which would greatly appreciate this feature 🙂

@robbo10 I was out all of last week because of covid, I am back today and I will try to get this merged this week. It will definitely be included in the next release (v2.2).

@robbo10
Copy link

robbo10 commented Aug 19, 2024

Thank you @antoninbas 😊 I hope your feeling better

@antoninbas antoninbas force-pushed the support-random-fully-for-snat-rules branch 2 times, most recently from 028aa9a to ce83339 Compare August 19, 2024 18:16
tnqn
tnqn previously approved these changes Aug 22, 2024
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@antoninbas
Copy link
Contributor Author

/test-all

We expose 2 new configuration parameters for the Agent:
snatFullyRandomPorts and egress.snatFullyRandomPorts. When the first one
is set to true, the MASQUERADE iptables rules used to implement "local"
Node SNAT will use randomized source port mappings. When the second one
is set to true, the SNAT iptables rules used to implement Egress SNAT
will use randomized source port mappings. This is achieved with the
`--random-fully` iptables flag.

These new configuration parameters are only supported on Linux
Nodes. They require iptables >= 1.6.2, which is not a problem with the
standard antrea-agent container images. They also require Linux kernel
>= 3.13 and >= 3.14 respectively, but these are very old releases so we
can ignore this requirment, which is consistent with the K8s
implementation.

snatFullyRandomPorts is set to false by default (this may change in the
future). egress.snatFullyRandomPorts is set to null (empty) by default,
which means that unless the parameter is explicitly set, we will use the
top-level snatFullyRandomPorts value.

Fixes antrea-io#6544

Signed-off-by: Antonin Bas <antonin.bas@broadcom.com>
@antoninbas
Copy link
Contributor Author

/test-all

@antoninbas antoninbas requested a review from tnqn August 22, 2024 20:29
@antoninbas
Copy link
Contributor Author

@tnqn I had to resolve a conflict, and I also took the opportunity to squash all commits

Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@antoninbas antoninbas merged commit 707d3ea into antrea-io:main Aug 23, 2024
53 of 58 checks passed
@antoninbas antoninbas deleted the support-random-fully-for-snat-rules branch August 23, 2024 16:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/release-note Indicates a PR that should be included in release notes.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add --random-fully to SNAT IP Table Rules
4 participants