Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use NetworkMonitor in zedrouter #3033

Merged
merged 1 commit into from
Feb 14, 2023

Conversation

milan-zededa
Copy link
Contributor

@milan-zededa milan-zededa commented Feb 9, 2023

NetworkMonitor provides all features needed by zedrouter to watch for changes in the network stack (used mostly for policy based routing purposes). This way we can remove some code from the zedrouter and avoid duplication. Moreover, NetworkMonitor is an interface and there will be multiple implementations for different network stacks (currently there is only one for Linux).

This is just a small step in the process of zedrouter refactoring. My goal is to avoid creating large PRs and overwhelming myself and reviewers. Instead, I would like to refactor in small steps (while keeping zedrouter functional in-between). But the value of some of these smaller steps will only be apparent once everything is completed.

Signed-off-by: Milan Lenco milan@zededa.com

@rouming
Copy link
Contributor

rouming commented Feb 10, 2023

That's cool. Sorry to stick my nose into not my area, I was just curious how you implemented the NetworkMonitor and found this comment:

	// XXX is AF_UNSPEC ok?
	nlRoutes, err := netlink.RouteListFiltered(syscall.AF_UNSPEC, &filter, fflags)

This seems to be correct. I followed the following path:

func (h *Handle) RouteListFiltered(family int, filter *Route, filterMask uint64) ([]Route, error) {
	req := h.newNetlinkRequest(unix.RTM_GETROUTE, unix.NLM_F_DUMP)
	infmsg := nl.NewIfInfomsg(family)
	req.AddData(infmsg)

Creates a netlink request, which gets routes and dumps them.

The following is the callback for RTM_GETROUTE messages and PF_UNSPEC family registered in the kernel:

*** net/core/rtnetlink.c:
rtnetlink_init[6198]           rtnl_register(PF_UNSPEC, RTM_GETROUTE, NULL, rtnl_dump_all, 0);

Which registers router netlink rtnl_dump_all callback for UNSPEC family and RTM_GETROUTE requests. The rtnl_dump_all dumps all the routes (thus name: rtnl_dump_all) for all the other families iterating over all other protocols and calls dumpit callback if registered.

The dumpit callback passed to the rtbl_resiter call in the system init, for example here is the FIB for the ipv4:

*** net/ipv4/fib_frontend.c:
ip_fib_init[1630]              rtnl_register(PF_INET, RTM_GETROUTE, NULL, inet_dump_fib, 0);

and here is the ipv6:

*** net/ipv6/ip6_fib.c:
fib6_init[2456]                ret = rtnl_register_module(THIS_MODULE, PF_INET6, RTM_GETROUTE, NULL,

So '// XXX is AF_UNSPEC ok?' comment is not needed :) (if you did not mean something different of course)

LGTM (at least to my shallow understanding of what you are doing in this patchset).

@milan-zededa
Copy link
Contributor Author

@rouming Nice, thanks for sticking your nose into the networking area, that was educational :)
Added comment removal into this PR.

@@ -557,6 +557,7 @@ func (m *LinuxNetworkMonitor) linkSubscribe(doneChan chan struct{}) chan netlink
}
linkOpts := netlink.LinkSubscribeOptions{
ErrorCallback: linkErrFunc,
ListExisting: true, // XXX: currently required by zedrouter (later will be removed)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do you plan to remove the listExisting?
Can't use a get+subscription since the order would not be defined between what you retrieve with a get and when you see any additions or deletings from the subscription.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we have very small routing tables (only getting routes from DHCP or static config, no dynamic routing protocols), I would prefer to fully reconcile a routing table on change (reread the current state, update the intended state, run reconcile). In other words, use these notifications only to detect that config is out-of-sync. It should be more robust at the cost of more netlink calls made overall.
But I may rethink this once I get to it, so for now keeping there the XXX comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, such an approach can avoid race conditions as long as you notification generated after you start the read of the current causes the code to redo the full read. I don't know if you also need to make sure you discard the results of that first read - whether the set of routes you read could be inconsistent. By that I mean that a read could have e.g., a default route with a nexthop of 192.168.17.1 but you have no route for 192.168.17.1 in the set of routes you read (because that interface/route was deleted just after netlink read the default route.)

Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM; let it test.

@milan-zededa
Copy link
Contributor Author

dev_local_info test seems to be reliably failing, will investigate... (please do not merge for now)

@milan-zededa
Copy link
Contributor Author

Eden tests should be fixed, @eriknordmark please rerun when you can.

NetworkMonitor provides all features needed by zedrouter to watch for
changes in the network stack (used mostly for policy based routing
purposes). This way we can remove some code from the zedrouter and
avoid duplication. Moreover, NetworkMonitor is an interface and there
will be multiple implementations for different network stacks
(currently there is only one for Linux).

This is just a small step in the process of zedrouter refactoring.

Signed-off-by: Milan Lenco <milan@zededa.com>
Copy link
Contributor

@eriknordmark eriknordmark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-run eden

@eriknordmark eriknordmark merged commit b71a225 into lf-edge:master Feb 14, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants