Commit
Unregister_netdev for VRF observed and kernel hangs during VRF delete (#109)
This was observed with about 100 VLANs configured, using the following sequence:

1. Physical ports are members of a PortChannel, which is a member of one or more VLANs; the VLAN is in a VRF and has an IP assigned.
2. Move the PortChannel out of the VLAN(s), bind it to the VRF, and assign an IP (note: the config is not saved here).
3. Perform a config reload.
4. Repeat step 2.
5. Perform a VRF delete.

This causes unregister_netdev for the VRF. The cause is that the netdev adjacency graph maintained in the kernel is not cleaned up properly.

Netdev adjacency is maintained between VRF-Vlan, VRF-PortChannel, and VRF-Ethernet*, that is, for all possible combinations, which together form a graph. During cleanup, however, unlinking the Vlan netdev adjacency (and all adjacencies below it in the hierarchy) does not clean up every adjacency with the VRF, due to inherent Vlan netdev behavior. Building this complex adjacency graph is not necessary; maintaining only direct adjacencies is sufficient. The Linux kernel community identified this problem and fixed it in kernel versions >= 4.10.

For instance, netdev adjacency tracking fails to create proper dependencies in the following configuration:

          /-- Vlan10 - PortChannel - Ethernet0
    vrf
          \-- Vlan20 - PortChannel - Ethernet0

When the vrf is deleted, the adjacencies PortChannel - vrf, vrf - PortChannel, Ethernet0 - vrf, and vrf - Ethernet0 were missing. This happens when the PortChannel is moved out of the above hierarchy and then added back. As a result, PortChannel and Ethernet0 still hold a refcount on the vrf l3mdev, which prevents it from being deleted and hangs the vrf delete.

This was addressed with 11 commits in kernel version 4.10, as described in https://lists.openwall.net/netdev/2016/10/18/8

The patch below merges those 11 commits into a single patch file:

David Ahern (11):
    net: Remove refnr arg when inserting link adjacencies
    net: Introduce new api for walking upper and lower devices
    net: bonding: Flip to the new dev walk API
    IB/core: Flip to the new dev walk API
    IB/ipoib: Flip to new dev walk API
    ixgbe: Flip to the new dev walk API
    mlxsw: Flip to the new dev walk API
    rocker: Flip to the new dev walk API
    net: Remove all_adj_list and its references
    net: Add warning if any lower device is still in adjacency list
    net: dev: Improve debug statements for adjacency tracking
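
The idea of keeping only direct adjacencies and computing the full set of lower devices on demand can be illustrated with a toy model. The sketch below is not kernel code: the Topology class and its link, unlink, walk_all_lower, and can_delete methods are hypothetical names chosen for illustration, and the device names are taken from the example hierarchy above.

```python
from collections import defaultdict


class Topology:
    """Toy model (not kernel code): stores only direct upper/lower links."""

    def __init__(self):
        self.lower = defaultdict(set)  # device -> devices directly below it
        self.upper = defaultdict(set)  # device -> devices directly above it

    def link(self, upper, lower):
        """Record a direct adjacency between an upper and a lower device."""
        self.lower[upper].add(lower)
        self.upper[lower].add(upper)

    def unlink(self, upper, lower):
        """Remove a direct adjacency; no transitive entries exist to clean up."""
        self.lower[upper].discard(lower)
        self.upper[lower].discard(upper)

    def walk_all_lower(self, dev):
        """Return every device reachable below dev, computed on demand."""
        seen = set()
        stack = [dev]
        while stack:
            cur = stack.pop()
            for low in self.lower[cur]:
                if low not in seen:
                    seen.add(low)
                    stack.append(low)
        return seen

    def can_delete(self, dev):
        """In this model a device is deletable once it has no direct links."""
        return not self.lower[dev] and not self.upper[dev]


def build_example():
    """Build the vrf / Vlan10 / Vlan20 / PortChannel / Ethernet0 hierarchy."""
    topo = Topology()
    topo.link("vrf", "Vlan10")
    topo.link("vrf", "Vlan20")
    topo.link("Vlan10", "PortChannel")
    topo.link("Vlan20", "PortChannel")
    topo.link("PortChannel", "Ethernet0")
    return topo


if __name__ == "__main__":
    topo = build_example()
    print(sorted(topo.walk_all_lower("vrf")))
    # Detach both VLANs from the vrf; nothing else references the vrf.
    topo.unlink("vrf", "Vlan10")
    topo.unlink("vrf", "Vlan20")
    print(topo.can_delete("vrf"))
```

Because the model never stores vrf - PortChannel or vrf - Ethernet0 entries, unlinking the two VLANs leaves nothing stale that could keep the vrf referenced. This mirrors, in a simplified way, the motivation for removing the all-adjacency list in favor of walking direct links.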