Why traffic noEncap/hybrid only support with Antrea-Proxy enabled #2600

Closed
Jexf opened this issue Aug 16, 2021 · 13 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale.

Comments

@Jexf
Member

Jexf commented Aug 16, 2021

Why are the noEncap/hybrid traffic modes only supported with AntreaProxy enabled?

[root@tos-00 ~]# kubectl logs -n kube-system antrea-agent-g4fck -f
I0816 05:55:23.970930       1 log_file.go:99] Set log file max size to 104857600
F0816 05:55:23.972597       1 main.go:55] Failed to validate: TrafficEncapMode hybrid requires AntreaProxy to be enabled
goroutine 1 [running]:
k8s.io/klog/v2.stacks(0xc000346501, 0xc00033a0b0, 0x79, 0xa2)
	/root/go/pkg/mod/k8s.io/klog/v2@v2.8.0/klog.go:1021 +0xb9
k8s.io/klog/v2.(*loggingT).output(0x381dd80, 0xc000000003, 0x0, 0x0, 0xc0004e11f0, 0x372e2db, 0x7, 0x37, 0x0)
	/root/go/pkg/mod/k8s.io/klog/v2@v2.8.0/klog.go:970 +0x19b
k8s.io/klog/v2.(*loggingT).printf(0x381dd80, 0xc000000003, 0x0, 0x0, 0x0, 0x0, 0x252bf44, 0x16, 0xc000159450, 0x1, ...)
	/root/go/pkg/mod/k8s.io/klog/v2@v2.8.0/klog.go:751 +0x191
k8s.io/klog/v2.Fatalf(...)
	/root/go/pkg/mod/k8s.io/klog/v2@v2.8.0/klog.go:1509
main.newAgentCommand.func1(0xc00035c2c0, 0xc00012b880, 0x0, 0x8)
	/root/showmaker/antrea/cmd/antrea-agent/main.go:55 +0x255
github.com/spf13/cobra.(*Command).execute(0xc00035c2c0, 0xc0001b0010, 0x8, 0x8, 0xc00035c2c0, 0xc0001b0010)
	/root/go/pkg/mod/github.com/spf13/cobra@v1.1.1/command.go:854 +0x2c2
github.com/spf13/cobra.(*Command).ExecuteC(0xc00035c2c0, 0x0, 0x0, 0x0)
	/root/go/pkg/mod/github.com/spf13/cobra@v1.1.1/command.go:958 +0x375
github.com/spf13/cobra.(*Command).Execute(...)
	/root/go/pkg/mod/github.com/spf13/cobra@v1.1.1/command.go:895
main.main()
	/root/showmaker/antrea/cmd/antrea-agent/main.go:37 +0x52

By the way, maybe it would be better to log a warning instead of crashing directly.

@Jexf Jexf added the kind/feature Categorizes issue or PR as related to a new feature. label Aug 16, 2021
@tnqn
Member

tnqn commented Aug 16, 2021

When AntreaProxy is not enabled, Pod-to-Service traffic is handled by iptables/IPVS in the root netns. If the endpoint is not local, the DNATed traffic is output to the physical network directly, without going back to OVS for Egress NetworkPolicy enforcement, which breaks basic security functionality. If it were just a warning, many users wouldn't notice it and their Pods wouldn't be secured as expected. As this is about security, the validation is mandatory.
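
As a rough illustration of the mandatory check described above, here is a minimal Go sketch. The `agentConfig` type and the `validate` function are hypothetical stand-ins, not Antrea's actual code; only the error text is taken from the log earlier in this thread, and applying the check to every non-encap mode is an assumption based on the modes named in this discussion.

```go
package main

import "fmt"

// agentConfig is a hypothetical, stripped-down stand-in for the Agent configuration.
type agentConfig struct {
	TrafficEncapMode string // e.g. "encap", "noEncap", "hybrid", "networkPolicyOnly"
	AntreaProxy      bool
}

// validate rejects any non-encap mode when AntreaProxy is disabled, because
// Service traffic could then leave the Node without passing through OVS.
func validate(c agentConfig) error {
	if c.TrafficEncapMode != "encap" && !c.AntreaProxy {
		return fmt.Errorf("TrafficEncapMode %s requires AntreaProxy to be enabled", c.TrafficEncapMode)
	}
	return nil
}

func main() {
	// Reproduces the situation from the log: hybrid mode with AntreaProxy disabled.
	if err := validate(agentConfig{TrafficEncapMode: "hybrid", AntreaProxy: false}); err != nil {
		fmt.Println("Failed to validate:", err)
	}
}
```

Returning an error rather than logging a warning is what makes the check impossible to miss at startup.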

@Jexf
Member Author

Jexf commented Aug 16, 2021

Thanks for the reply, I got it @tnqn

@Jexf Jexf closed this as completed Aug 16, 2021
@lionstack

lionstack commented Aug 27, 2021

How about adding a parameter named "force_hybrid_without_proxy", with a default value of "false"? When it is set to "true", hybrid mode can be used without AntreaProxy, which will improve network performance in some environments and even allow communication with other CNIs (such as flannel DR mode) in the same cluster. The log should also show a warning: "The Egress NetworkPolicy won't work correctly when force_hybrid_without_proxy is true, please use Ingress NetworkPolicy to manage the security access"
@tnqn

@antoninbas
Contributor

we can use hybrid mode without antrea-proxy, which will improve network performance in some environments

@lionstack out of curiosity, in which situation do you see better performance with kube-proxy than with AntreaProxy? If anything, we are looking to invest more into AntreaProxy in the future, with the possible addition of features that only work with AntreaProxy.

@lionstack

lionstack commented Aug 31, 2021

@antoninbas Hi, thanks for your reply. I mean that if two Nodes are in the same subnet, then in hybrid mode the Pods on those Nodes can communicate with each other through a host route such as: 10.224.1.0/24 via 192.168.10.10 dev eth0, which does not use a network tunnel. Plain routing performs better than tunneling.
But for now, hybrid mode is not compatible with kube-proxy, so I would like to add a parameter that ensures hybrid mode can always be used, even when we don't use AntreaProxy (because of our kernel version and openvswitch version, we cannot use AntreaProxy to replace kube-proxy).

@antoninbas
Contributor

This seems like a legitimate request, but it would be hard for us to assist users if they run into issues with this mode, so they would basically be on their own. Features beyond NetworkPolicies (e.g. Traceflow) may break, and additional features may break unexpectedly in the future. I suppose one way for us to enable what you want, and allow AntreaProxy to be disabled even in hybrid mode, would be to introduce an environment variable that disables the check in the Agent. You could then manually edit your Antrea YAML manifest to set this environment variable. It's a bit more hidden than a config parameter in that case. @jianjuns @tnqn what do you think? But moving forward, I want to emphasize that this is not a configuration we would be validating in CI or providing guarantees for.

@jianjuns
Contributor

jianjuns commented Sep 2, 2021

If we are sure it works, I am fine with adding the flag. I do not remember whether we made any code changes when disallowing noEncap/hybrid/policyOnly with kube-proxy. I can check.

@jianjuns
Contributor

jianjuns commented Sep 4, 2021

I did some tests and saw that at least noEncap and hybrid modes do work with kube-proxy (Pod -> Service traffic can still go through). So, I am fine with adding a ConfigMap parameter or environment variable to allow noEncap/hybrid/networkPolicyOnly modes with kube-proxy (and stating that it is not a guaranteed configuration). @antoninbas @tnqn

@tnqn
Member

tnqn commented Sep 9, 2021

An environment variable sounds good to me.

@jianjuns
Contributor

jianjuns commented Sep 9, 2021

@Jexf: I wonder if you would like to make the change as discussed above :) If not, I can take it.

@Jexf
Member Author

Jexf commented Sep 10, 2021

@jianjuns Thank you for reminding me, I would like to take it.

@Jexf Jexf reopened this Sep 10, 2021
WenzelZ pushed a commit to WenzelZ/antrea that referenced this issue Sep 16, 2021
For performance, NoEncap mode can make the traffic output to physical network directly without going back to OVS for Egress NetworkPolicy enforcement. Although this destroys the basic security function, we can force support NoEncap with TrafficDirectRouting environment for performance.

Signed-off-by: Wenze Gao <wenze.gao@transwarp.io>
WenzelZ pushed a commit to WenzelZ/antrea that referenced this issue Sep 17, 2021
NoEncap mode can make the traffic output to physical network directly. When antrea proxy is disable, traffic won't go back to OVS for Egress NetworkPolicy enforcement, it breaks the basic security function, we can force support NoEncap with TrafficDirectRouting environment parameter for performance.

Signed-off-by: Wenze Gao <wenze.gao@transwarp.io>
WenzelZ pushed a commit to WenzelZ/antrea that referenced this issue Sep 17, 2021
NoEncap mode can make the traffic output to physical network directly. When antrea proxy is disable, traffic won't go back to OVS for Egress NetworkPolicy enforcement, it breaks the basic security function, we can force support NoEncap with TrafficDirectRouting environment parameter for performance.

Signed-off-by: Wenze Gao <wenze.gao@transwarp.io>
WenzelZ pushed a commit to WenzelZ/antrea that referenced this issue Sep 23, 2021
NoEncap mode can make the traffic output to physical network directly. When antrea proxy is disable, traffic won't go back to OVS for Egress NetworkPolicy enforcement, it breaks the basic security function, we can force support NoEncap with TrafficDirectRouting environment parameter for performance.

Signed-off-by: Wenze Gao <wenze.gao@transwarp.io>
Jexf pushed a commit to WenzelZ/antrea that referenced this issue Dec 10, 2021
NoEncap mode can make the traffic output to physical network directly.
When antrea proxy is disable, traffic won't go back to OVS for Egress
NetworkPolicy enforcement, it breaks the basic security function, we
can force support NoEncap without antrea proxy by using
ALLOW_NO_ENCAP_WITHOUT_ANTREA_PROXY environment parameter for performance.

Signed-off-by: Wenze Gao <wenze.gao@transwarp.io>
Signed-off-by: Wu zhengdong <zhengdong.wu@transwarp.io>
@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 17, 2021
@antoninbas antoninbas removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 17, 2021
@github-actions
Contributor

This issue is stale because it has been open 90 days with no activity. Remove stale label or comment, or this will be closed in 90 days

@github-actions github-actions bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 18, 2022
5 participants