unregister_netdev observed for VRF and kernel hangs during VRF delete (#109)
This was observed with the configuration below, with ~100 VLANs
configured:
Sequence:
1. Physical ports are members of a PortChannel, which is a member of the
Vlan(s); the Vlan is in a VRF and has an IP assigned.
2. Move the PortChannel out of the Vlan(s), bind it to the VRF, and assign
an IP (Note: the config is not saved here)
3. Perform config reload
4. Repeat step 2
5. Perform VRF delete

This triggers unregister_netdev for the VRF, which then hangs because the
netdev adjacency graph maintained in the kernel is not cleaned up
properly.
Netdev adjacency is maintained between VRF-Vlan, VRF-PortChannel, and
VRF-Ethernet*, i.e. for every possible combination in the hierarchy,
which together form a graph.
During cleanup, however, unlinking the Vlan netdev adjacency (and all
adjacencies below it in the hierarchy) does not remove every adjacency
with the VRF, due to inherent Vlan netdev behavior.
Overall, building this complex netdev adjacency graph is unnecessary;
maintaining only direct adjacencies is sufficient.
The Linux kernel community identified this and fixed it in versions
>= 4.10.
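
To illustrate why direct adjacencies can be enough, here is a minimal
sketch in Python. It is a toy model, not kernel code: the Netdev class and
the link, unlink and all_upper helpers are hypothetical names introduced
for this example. Each device stores only its direct neighbours, and the
transitive set of upper devices is computed on demand by walking those
links, so unlinking one pair leaves no derived state behind.

```python
class Netdev:
    """Toy stand-in for a network device that stores only direct neighbours."""

    def __init__(self, name):
        self.name = name
        self.upper = []  # devices directly above this one
        self.lower = []  # devices directly below this one


def link(lower, upper):
    """Record a single direct adjacency in both directions."""
    lower.upper.append(upper)
    upper.lower.append(lower)


def unlink(lower, upper):
    """Drop that direct adjacency; no derived entries need updating."""
    lower.upper.remove(upper)
    upper.lower.remove(lower)


def all_upper(dev):
    """Compute the transitive upper devices on demand by walking direct links."""
    seen, stack = [], list(dev.upper)
    while stack:
        cur = stack.pop()
        if cur not in seen:
            seen.append(cur)
            stack.extend(cur.upper)
    return {d.name for d in seen}


# Topology from this commit message:
# vrf - Vlan10/Vlan20 - PortChannel - Ethernet0
vrf, vlan10, vlan20, pc, eth = (
    Netdev(n) for n in ("vrf", "Vlan10", "Vlan20", "PortChannel", "Ethernet0")
)
link(eth, pc)
link(pc, vlan10)
link(pc, vlan20)
link(vlan10, vrf)
link(vlan20, vrf)
```

In this model, moving PortChannel out of both Vlans is two unlink calls,
after which all_upper(eth) no longer reaches vrf, and nothing stale remains
to be cleaned up when the vrf is later deleted.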

For instance:

    Netdev adjacency tracking fails to create proper dependencies in certain
    configurations:
        /-- Vlan10 - PortChannel - Ethernet0
    vrf
        \-- Vlan20 - PortChannel - Ethernet0

    When the vrf is deleted, the adjacencies PortChannel - vrf, vrf -
    PortChannel, Ethernet0 - vrf, and vrf - Ethernet0 are missing.
    This happens when the PortChannel is moved out of the above hierarchy
    and added back.
    PortChannel and Ethernet0 then keep holding a refcount on the vrf
    l3mdev, which prevents it from being deleted and hangs the vrf delete.
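
The symptom described above can be sketched with a small Python model. This
is an assumption-laden illustration, not the kernel's data structure: the
flattened adjacency table and the helper names are hypothetical. Each
(lower, upper) entry stands for a reference held on the upper device, and a
cleanup that only handles the Vlan entries leaves PortChannel and Ethernet0
still referencing the vrf.

```python
# Hypothetical flattened adjacency table: one (lower, upper) entry per pair,
# including transitive pairs. Each entry stands for a reference on `upper`.
adjacency = {
    ("Vlan10", "vrf"),
    ("Vlan20", "vrf"),
    ("PortChannel", "vrf"),
    ("Ethernet0", "vrf"),
}


def unlink_vlans_only(table):
    """Incomplete cleanup: only entries whose lower device is a Vlan are removed."""
    return {(lower, upper) for lower, upper in table if not lower.startswith("Vlan")}


def holders(table, dev):
    """Names of devices that still hold a reference on `dev`."""
    return sorted(lower for lower, upper in table if upper == dev)


def can_delete(table, dev):
    """In this model a device can be freed only when nothing references it."""
    return not holders(table, dev)


after_cleanup = unlink_vlans_only(adjacency)
print(holders(after_cleanup, "vrf"))
print(can_delete(after_cleanup, "vrf"))
```

In the model, can_delete stays false for as long as the stale entries
exist, which corresponds to the vrf delete waiting on references that are
never released.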

This has been addressed by 11 commits in kernel version 4.10, as
described in
    https://lists.openwall.net/netdev/2016/10/18/8

The patch below merges these 11 commits into a single patch file.

David Ahern (11):
  net: Remove refnr arg when inserting link adjacencies
  net: Introduce new api for walking upper and lower devices
  net: bonding: Flip to the new dev walk API
  IB/core: Flip to the new dev walk API
  IB/ipoib: Flip to new dev walk API
  ixgbe: Flip to the new dev walk API
  mlxsw: Flip to the new dev walk API
  rocker: Flip to the new dev walk API
  net: Remove all_adj_list and its references
  net: Add warning if any lower device is still in adjacency list
  net: dev: Improve debug statements for adjacency tracking
preetham-singh authored and lguohan committed Oct 10, 2019
1 parent 5786674 commit 5dbf6d5
Showing 2 changed files with 1,303 additions and 0 deletions.