Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Upstream some internal code #17605

Merged
merged 9 commits into from
Dec 11, 2024
Merged

Conversation

donaldsharp
Copy link
Member

Typical found some old code that should be upstreamed. Hence doing so.

It's a grab bag of fixes

zebra/rt_netlink.c Outdated Show resolved Hide resolved
@ton31337
Copy link
Member

ton31337 commented Dec 7, 2024

Possible to make frrbot happy?

vivek-cumulus and others added 9 commits December 9, 2024 12:29
When a MAC gets deleted but associated neighbors remain, the MAC is
kept in the zebra MAC database as an internal ("auto") entry. When
this happens, reset the MAC's remote sequence number. This ensures that
when the host with the MAC later comes up behind a remote VTEP, the
local switch accepts the MAC and installs it into the bridge FDB and
we don't end up in a situation where remote MACs are not installed
into the bridge FDB.

This fix is a corollary of CM-22753 and is this time done for local
MACs upon delete.

Note: Commit is marked Cumulus-only because I need to evalute more
comprehensive changes before upstreaming it.

Ticket: CM-29581
Reviewed By: As above
Testing Done:
1. Multiple rounds of manual testing
2. Two rounds of evpn-smoke, 1 round of precommit

Signed-off-by: Vivek Venkatraman <vivek@cumulusnetworks.com>
Acked-by:      Chirag Shah <chirag@cumulusnetworks.com>
Acked-by:      Anuradha Karuppiah <anuradhak@cumulusnetworks.com>
Check that the L3VNI is "up" before taking action to announce or
withdraw the EVPN type-5 default based on configuration. Otherwise,
there can be timing conditions where a EVPN type-5 default route
gets announced without a VNI and with invalid route targets.

Signed-off-by: Vivek Venkatraman <vivek@nvidia.com>

Ticket: #2684144
Reviewed By: Chirag Shah
Testing Done:
1. Rerun failed test multiple times successfully
2. Some manual testing
3. precommit and partial evpn-smoke
For those packets that we are not sending 16k of data, but something
far less than 256 bytes.  Reduce those stream sizes we allocate
to something much more reasonable.

Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Looks like the cap setting was added for testing mlag via zebra test cli
to config the mlag role. However it is interfering with the valid state
updates rxed from the MLAG daemon based on timing (in some cases the
MLAG state changes are rxed before the capabilities).

Reference logs -
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
root@TORC11:mgmt:/home/cumulus# grep -ri "my_role\|MlagRole" /var/log/frr/bgpd.log
2021/06/18 13:26:40.380130 PIM: pim_mlag_process_mlagd_state_change: msg dump: my_role: SECONDARY, peer_state: DOWN
2021/06/18 13:26:40.380766 PIM: pim_mlag_process_mlagd_state_change: msg dump: my_role: SECONDARY, peer_state: DOWN
2021/06/18 13:26:41.382258 PIM: pim_mlag_process_mlagd_state_change: msg dump: my_role: SECONDARY, peer_state: RUNNING
2021/06/18 13:26:41.382379 PIM: pim_mlag_process_mlagd_state_change: msg dump: my_role: PRIMARY, peer_state: RUNNING
2021/06/18 13:26:52.386071 ZEBRA: Sending capabilities to client pim: MPLS enabled numMultipath 128 GR disabled MaintMode off MlagRole 0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Ticket: #2691629

Signed-off-by: Anuradha Karuppiah <anuradhak@nvidia.com>
Ticket: 2740911
Signed-off-by: Wesley Coakley <wcoakley@nvidia.com>
This commit:
"tools: run `vtysh -b` once for all-startup"

changed things so that `vtysh -b` is run after all daemons have started
up instead of doing it for each daemon as they are started up. This
results in one long `vtysh -b`, which for large configs and many daemons
(in the case I saw, 4 daemons and 30,000 line config) can exceed the 20
second timer watchfrr uses to kill "hung" background tasks.

Shouldn't be any harm to increasing this to 90 seconds to give us some
leeway while still making sure we kill anything truly misbehaving.

Signed-off-by: Quentin Young <qlyoung@nvidia.com>
Effectively When bgp would send a route update down
to zebra and immediately after that a asic update
from the kernel was read.  Zebra would choose the
asic update and drop the bgp update leaving us in
a state where bgp was not used as the true source.

Modify the code so that in rib_multipath_nhe
we notice that we have an unprocessed route update
from bgp.  And if so just drop this kernel update
about an older version of the route since it is
no longer needed.

Ticket: 2722533
Signed-off-by: Donald Sharp <sharpd@nvidia.com>
Prior to this we were only filtering EVPN routes from the import logic
if they were not route-type 1/2/3/5, which allowed things like RT-5s to
be imported into an L2VNI/MAC-VRF. This adds additional logic to ensure
routes are only imported into EVIs where they make sense.
No more nonsensical route importing!

Ticket: 2848204
Signed-off-by: Trey Aspelund <taspelund@nvidia.com>
When a unicast route from source vrf is imported into
evpn as type5 route, prepend the asn of the source vrf into
type5 asn path list.

The condition of evpn type5 prefix path info is present but
source vrf route has been cleared via clear command. In this
case existing path info needs to rewrite the source vrf asn.

prepends asn of the source vrf, but the further condition
for existing path attribute for the same route needs to prepend
source vrf asn.

Ticket: #2943080
Testing:
Before fix:
r4# clear ip bgp vrf overlay prefix 0.0.0.0/0
Route Distinguisher: 128.117.243.209:4
*> [5]:[0]:[0]:[0.0.0.0]
         203.0.113.1          0          0 194 ? <--- 64512 is missing
         ET:8 RT:64532:104001 Rmac:06:ec:bf:59:e8:93

After fix:
r4# clear ip bgp vrf overlay prefix 0.0.0.0/0

Route's source vrf bgp output containing ASN 64512:
r4# show bgp vrf overlay
BGP table version is 2, local router ID is 128.117.243.209, vrf id 10
Default local pref 100, local AS 64512
...

Notice after clear command source vrf asn 64512 is retained.
Route Distinguisher: 128.117.243.209:4
*> [5]:[0]:[0]:[0.0.0.0]
         203.0.113.1          0          0 64512 194 ?
         ET:8 RT:64532:104001 Rmac:06:ec:bf:59:e8:93

Signed-off-by: Chirag Shah <chirag@nvidia.com>
@donaldsharp
Copy link
Member Author

frrbot should be happy now

@ton31337
Copy link
Member

ton31337 commented Dec 9, 2024

One more frrbot still?

@donaldsharp
Copy link
Member Author

as per yesterdays discussion in the tech meeting this code is appropriate and frrbot can be safely ignored as taht it is nothing more than 80->100 columns complaints

@ton31337 ton31337 merged commit 024c944 into FRRouting:master Dec 11, 2024
11 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants