Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not delete IPv6 link-local route in reconciler #5483

Merged
merged 1 commit into from
Sep 20, 2023

Conversation

wenyingd
Copy link
Contributor

@wenyingd wenyingd commented Sep 13, 2023

In the existing code, the IPv6 link-local route on antrea-gw0 is deleted in
route reconcile, which results in the IPv6 Neighbor Solicitation sent from Pod's
link-local address is dropped on the Node by kenel reverse path filtering, and
Pod would mark the antrea-gw0 as a "FAILED" neighbor. Then the Pod's accross
Node traffic or the Pod-to-external traffic does not work as expected.

This change includes,

  1. Do not delete IPv6 link-local routes in the reconcile function,
  2. Restore IPv6 link-local route in syncRoute function.

Fix: #5482

// A route destined to an IPv6 link-local CIDR is always system auto-generated along with a link-local
// address, which is not configured by antrea and is supposed to be ignored in the "deletion" list.
// Such routes are helpful in some case, e.g., IPv6 NDP.
if route.Dst.IP != nil && route.Dst.IP.IsLinkLocalUnicast() && route.Dst.IP.To4() == nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IPNet won't have a nil IP, the first check can be removed. L902 never check it as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, updated.

@@ -0,0 +1,62 @@
package wfp
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove the file

@tnqn tnqn added area/transit/ipv6 Issues or PRs related to IPv6. action/backport Indicates a PR that requires backports. action/release-note Indicates a PR that should be included in release notes. labels Sep 13, 2023
@tnqn
Copy link
Member

tnqn commented Sep 13, 2023

To prevent regression in the future, can we improve testReconcileGatewayRoutesOnStartup to cover this validation of this route?

@wenyingd
Copy link
Contributor Author

/test-all
/test-ipv6-all
/test-ipv6-only-all

Comment on lines 894 to 916
// A route destined to an IPv6 link-local CIDR is always system auto-generated along with a link-local
// address, which is not configured by antrea and is supposed to be ignored in the "deletion" list.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should also add the route to

gwAutoconfRoutes = append(gwAutoconfRoutes, &netlink.Route{
LinkIndex: c.nodeConfig.GatewayConfig.LinkIndex,
Dst: c.nodeConfig.PodIPv6CIDR,
Src: c.nodeConfig.GatewayConfig.IPv6,
Scope: netlink.SCOPE_LINK,
})
given we already have logic to recover required routes and to make bumping up release can fix broken clusters. Otherwise the cluster would not recover until Nodes are rebooted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated.

@wenyingd
Copy link
Contributor Author

/test-ipv6-all
/test-ipv6-only-all
/test-all

@wenyingd
Copy link
Contributor Author

/test-conformance
/test-ipv6-only-e2e
/test-ipv6-only-conformance

@wenyingd
Copy link
Contributor Author

/test-ipv6-only-e2e
/test-ipv6-only-conformance

tnqn
tnqn previously approved these changes Sep 15, 2023
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, one question

pkg/agent/route/route_linux.go Show resolved Hide resolved
@tnqn
Copy link
Member

tnqn commented Sep 15, 2023

Please update the commit message to reflect the actual change: it will restore the link-local route if it doesn't exist.

@wenyingd
Copy link
Contributor Author

/test-all
/test-ipv6-all
/test-ipv6-only-all

pkg/agent/route/route_linux.go Outdated Show resolved Hide resolved
pkg/agent/route/route_linux.go Outdated Show resolved Hide resolved
pkg/agent/route/route_linux.go Outdated Show resolved Hide resolved
pkg/agent/route/route_linux.go Outdated Show resolved Hide resolved
@wenyingd
Copy link
Contributor Author

/test-all

@wenyingd
Copy link
Contributor Author

The IPv6 e2e failure is not related with this change, using another issue #5492 to track it.

Copy link
Contributor

@luolanzone luolanzone left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, one nit.

pkg/agent/route/route_linux.go Outdated Show resolved Hide resolved
luolanzone
luolanzone previously approved these changes Sep 19, 2023
Comment on lines 233 to 244
// The system auto-generated IPv6 link-local route always uses "fe80::/64"
// as the destination regardless of the interface's global address's mask.
llRouteDstv6 := "fe80::/64"
_, llrCIDR, _ := net.ParseCIDR(llRouteDstv6)
routeKey := func(r *netlink.Route) string {
if r.Dst.String() == llRouteDstv6 {
// Use "$Dst_$linkIndex" as the key of IPv6 link-local route because
// such route entry is auto-configured on each net interface.
return fmt.Sprintf("%s_%d", r.Dst.String(), r.LinkIndex)
}
return r.Dst.String()
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can make it more generic, instead of making a map with different kinds of keys and later compare the values, the routes could be saved in a set, item of which contains only the fields we care about:

type routeKey struct {
	LinkIndex int
	Dst       string
	Gw        string
}
routeKeys := sets.New[routeKey]()
for _, route := range routeList {
	routeKeys.Insert(routeKey{
		LinkIndex: route.LinkIndex,
		Dst:       route.Dst.String(),
		Gw:        route.Gw.String(),
	})
}

restoreRoute := func(route *netlink.Route) bool {
	if routeKeys.Has(routeKey{
		LinkIndex: route.LinkIndex,
		Dst:       route.Dst.String(),
		Gw:        route.Gw.String(),
	}) {
		return true
	}
	if err := c.netlink.RouteReplace(route); err != nil {
		klog.ErrorS(err, "Failed to sync route", "Route", route)
		return false
	}
	return true
}

Comment on lines 239 to 242
// The system auto-generated IPv6 link-local route always uses "fe80::/64"
// as the destination regardless of the interface's global address's mask.
llRouteDstv6 := "fe80::/64"
_, llrCIDR, _ := net.ParseCIDR(llRouteDstv6)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Static value could be declared as global variable to avoid repeated parse, following

globalVMAC, _ = net.ParseMAC("aa:bb:cc:dd:ee:ff")

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated

In the existing code, the IPv6 link-local route on antrea-gw0 is deleted in
route reconcile, which results in the IPv6 Neighbor Solicitation sent from Pod's
link-local address is dropped on the Node by kenel reverse path filtering, and
Pod would mark the antrea-gw0 as a "FAILED" neighbor. Then the Pod's accross
Node traffic or the Pod-to-external traffic does not work as expected.

This change includes,
1. Do not delete IPv6 link-local routes in the reconcile function,
2. Restore IPv6 link-local route in syncRoute function.

Signed-off-by: wenyingd <wenyingd@vmware.com>
Copy link
Member

@tnqn tnqn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tnqn
Copy link
Member

tnqn commented Sep 20, 2023

/test-all
/test-ipv6-all

@wenyingd
Copy link
Contributor Author

/test-ipv6-e2e

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
action/backport Indicates a PR that requires backports. action/release-note Indicates a PR that should be included in release notes. area/transit/ipv6 Issues or PRs related to IPv6.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

antrea-gw0 doesn't respond IPv6 neighbor solicitation message with a link-local source address in ipv6 env
3 participants