Default COPP DHCP Rate Removal for DHCP DoS Mitigation Feature #19065

Closed
wants to merge 1 commit into from

Conversation

@asraza07 asraza07 commented May 24, 2024

Why I did it

To deprecate the default COPP DHCPv4 rate limit in support of the new DHCP DoS Mitigation feature.

How I did it

Edited the COPP j2 file to remove the existing DHCPv4 trap.

How to verify it

With this change, SONiC has only one default DHCP rate limit, enforced via Traffic Control (TC). It is kept at 300 packets/sec to preserve backward compatibility after the default COPP DHCP rate is deprecated.

@asraza07 asraza07 requested a review from lguohan as a code owner May 24, 2024 07:25
@asraza07 asraza07 changed the title from "Defualt COPP DHCP Rate Removal" to "Default COPP DHCP Rate Removal for DHCP DoS Mitigation Feature" May 24, 2024
@prabhataravind
Contributor

@asraza07 I don't quite get why we need to remove the dhcp trap to support this feature. And why not dhcpv6? This is a potentially risky change. @dgsudharsan @prsunny

@@ -97,7 +97,7 @@
         "trap_group": "queue4_group3"
     },
     "dhcp_relay": {
-        "trap_ids": "dhcp,dhcpv6",
+        "trap_ids": "dhcpv6",
Contributor

DHCP trap is required to redirect and police the DHCP traffic in ASIC before hitting the kernel/control plane. Kernel level DHCP rate-limiting in sonic-net/sonic-swss#3130 may be required for software platforms, but this trap is required for all hardware switches.

@asraza07
Author

Hi @prabhataravind, the reason for removing the COPP trap was to remove the default COPP DHCP rate limit. COPP rate-limits incoming DHCP packets for the entire switch in one single queue, so it cannot effectively stop a DoS attack. When the attacking packets exceed the set rate of 300 packets/sec, the shared limit drops packets without regard to the port they arrived on, so legitimate DHCP clients are still negatively impacted (they remain unable to be serviced by the DHCP server).
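To make the shared-queue problem concrete, here is a minimal sketch, not the actual COPP implementation. It assumes a simplified model in which a saturated aggregate policer admits each port's traffic in proportion to its offered load; the port names and offered rates are hypothetical.

```python
# Simplified model (assumption): when the aggregate limit is exceeded, the
# admitted budget is shared in proportion to each port's offered rate.

def shared_policer(offered_pps, limit_pps=300):
    """Return admitted packets/sec per port under one switch-wide limit."""
    total = sum(offered_pps.values())
    if total <= limit_pps:
        # Below the limit nothing is dropped.
        return dict(offered_pps)
    return {port: limit_pps * pps / total for port, pps in offered_pps.items()}


# Hypothetical load: one attacking port and two ports with legitimate clients.
offered = {"Ethernet0": 10_000, "Ethernet4": 5, "Ethernet8": 5}
admitted = shared_policer(offered)

for port, pps in admitted.items():
    print(f"{port}: offered {offered[port]} pps, admitted {pps:.2f} pps")
```

Under this model the attacking port takes almost all of the 300 packets/sec budget, and each legitimate port is left with well under one packet per second.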

Our HLD proposes a more effective way to protect against DoS attacks by rate-limiting DHCP packets at an interface level (each interface has its own rate-limiting queue) using Traffic Control in the kernel. Using this approach, we can effectively isolate all non-attacking ports from the effects of a DoS attack, allowing legitimate DHCP clients on other ports to be serviced by the DHCP server.
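The per-interface idea can be sketched the same way. This is an illustrative model only, not the TC configuration from the HLD: each port gets its own independent limit, and the port names and rates are hypothetical.

```python
# Simplified model (assumption): every interface has its own policer, so one
# port exceeding its limit does not consume another port's budget.

def per_port_policer(offered_pps, limit_pps=300):
    """Return admitted packets/sec per port with an independent limit each."""
    return {port: min(pps, limit_pps) for port, pps in offered_pps.items()}


# Hypothetical load: one attacking port and two ports with legitimate clients.
offered = {"Ethernet0": 10_000, "Ethernet4": 5, "Ethernet8": 5}
admitted = per_port_policer(offered)

for port, pps in admitted.items():
    print(f"{port}: offered {offered[port]} pps, admitted {pps} pps")
```

Here the attacking port is clamped to its own 300 packets/sec while the legitimate ports keep their full offered rate.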

In order for this method to work in all DoS attack scenarios, it is important to remove the existing default DHCP COPP rate so that it does not interfere with packets before they reach the kernel, where the attack will be effectively mitigated. The reason for not removing the DHCPv6 trap is that our alternative method for rate-limiting in the kernel currently supports only DHCPv4 in this HLD. We plan on adding support for DHCPv6 in the future.
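The interference described above can be illustrated by chaining the two stages. This is a rough sketch under the same simplifying assumptions (proportional sharing at the aggregate stage, hypothetical port names and rates); it models only DHCP admission and deliberately ignores the load that unpoliced traffic places on the CPU.

```python
# Simplified two-stage model (assumption): an aggregate hardware limit is
# applied first, and whatever it admits then passes per-port kernel limits.

def aggregate_policer(offered_pps, limit_pps):
    """One switch-wide limit, shared in proportion to offered load."""
    total = sum(offered_pps.values())
    if total <= limit_pps:
        return dict(offered_pps)
    return {port: limit_pps * pps / total for port, pps in offered_pps.items()}


def per_port_policer(offered_pps, limit_pps):
    """An independent limit on every interface."""
    return {port: min(pps, limit_pps) for port, pps in offered_pps.items()}


# Hypothetical load: one attacking port and two ports with legitimate clients.
offered = {"Ethernet0": 10_000, "Ethernet4": 5, "Ethernet8": 5}

kernel_only = per_port_policer(offered, 300)
copp_then_kernel = per_port_policer(aggregate_policer(offered, 300), 300)

for port in offered:
    print(
        f"{port}: kernel only {kernel_only[port]:.2f} pps, "
        f"COPP then kernel {copp_then_kernel[port]:.2f} pps"
    )
```

In this model, packets dropped by the aggregate stage never reach the per-port stage, so the legitimate ports see the same starvation as with the aggregate limit alone.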

To ensure backward compatibility, our rate-limiting feature boots with a default value of 300 packets/second on every interface, as this was the limit previously enforced by COPP.

@dgsudharsan
Collaborator

I don't recommend removing the copp trap. If it's removed, even though copp traffic will be rate limited by this feature in software, there will be no rate-limiting in hardware. This has the potential to disrupt other control traffic and is quite risky.

@kperumalbfn
Contributor

I agree with you. Also, once we remove this DHCP copp rule, there will be no trap entry for DHCP programmed in the ASIC to punt the packets to the CPU.

@asraza07
Author

asraza07 commented May 29, 2024

@dgsudharsan @kperumalbfn @prabhataravind, thanks for your feedback. We agree that removing the COPP trap is potentially risky. We can modify our approach and keep the COPP limit as it is, so that it works alongside our proposed kernel-based mechanism. The COPP limit will serve to protect the overall flow of control-plane packets in the ASIC, while our kernel-based rate-limiting mechanism will isolate DHCP DoS attacks from all other non-attacking ports of the switch.

We would appreciate your comments on this. If agreed, we can close this PR and proceed with the rest of the feature and HLD.

@asraza07 asraza07 closed this Jun 13, 2024