
PoC scouting bugfix for IPv4LL/IPv6LL employed on multiple NICs #1598

Open · wants to merge 6 commits into main
Conversation


@psiegl psiegl commented Nov 19, 2024

This might be a niche issue, but it is nevertheless a bug in Zenoh.
Below is the simplest possible description; the issue could be somewhat bigger.

Imagine two machines. Machine A has two NICs: the first is connected to an IPv4 DHCP router, and the second is connected directly to machine B, with both ends of that link using IPv4LL/IPv6LL link-local addressing.
This could look as follows:

Machine A:
- eth0: IPv4: IPv4 address supplied by a DHCP router
        IPv6: IPv6LL address, as the DHCP router did solely serve an IPv4
- eth1: IPv4: IPv4LL address (connected to Machine B)
        IPv6: IPv6LL address (connected to Machine B)

Machine B:
- eth1: IPv4: IPv4LL address (connected to Machine A)
        IPv6: IPv6LL address (connected to Machine A)

(For simplicity later on, machine B's interface numbering starts at 1.)

The intention is to run a PUB on machine A and a SUB on machine B, so the interface that matters is eth1.
Both machines run Zenoh in peer mode, although other modes exhibit the same issue.
The transport of interest is Zenoh's default TCP.
Both machines shall be listening on IPv6:

  listen: {
		...
		endpoints: { router: ["tcp/[::]:7447"], peer: ["tcp/[::]:7447"] }
		...
  }

One could hardcode the NIC with the Linux-only #iface=eth1 endpoint parameter, but it does not change the issue.
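For reference, such pinning is expressed directly in the endpoint string. A sketch of what a listener with the #iface parameter could look like (the exact layout depends on the rest of the config):

  listen: {
		...
		endpoints: { peer: ["tcp/[::]:7447#iface=eth1"] }
		...
  }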

Now, the important point:
because only IPv4LL/IPv6LL addresses are available, Zenoh's scouting is used so the machines can find each other.
Consequently, a Zenoh config could look as follows (for IPv4 multicast):

  scouting: {
		...
		multicast: {
			...
				address: "224.0.0.224:7446"
			...
		}
		...
  }

Again, hardcoding the NIC with the Linux-only #iface=eth1 makes no difference.
I have not yet tried IPv6 multicast (something like [ff00::224]).
However, from reading the code, I assume it would not fix the issue either.

Zenoh's scouting reaches out to everyone, and a Hello comes back containing specific endpoints, as described in:
zenoh/commons/zenoh-protocol/src/scouting/hello.rs

This is what happens here: machine A obtains a Locator containing machine B's IPv6LL address and port:
See zenoh/src/net/runtime/orchestrator.rs -> Runtime::scout()

Similarly, machine B obtains a Locator with machine A's IPv6LL address and port.
As soon as both try to establish the connection, machine A fails to do so, and thus the two cannot communicate.

The reason is that link-local addresses do not come with an explicit default route.
ip -6 route on machine A shows:

fe80::/64 dev eth0 proto kernel metric 256 pref medium
fe80::/64 dev eth1 proto kernel metric 256 pref medium

Thus, even though machine B is physically connected to machine A via eth1, Zenoh tries to connect via eth0.
The reason is that Zenoh does not determine, for an incoming Hello message, which NIC the message actually arrived on.
In other words, Zenoh relies on the routing table being set up correctly, which need not be the case for link-local addresses.

While investigating zenoh/src/net/runtime/orchestrator.rs -> Runtime::scout(), the last place where the socket that received the incoming Hello message is still available is the line:

  let recvs = futures::future::select_all(sockets.iter().map(move |socket| {

As soon as this select_all(), including the surrounding loop, finishes, the information about the receiving NIC is gone.
The Locator itself carries no NIC definition either; see commons/zenoh-protocol/src/core/locator.rs -> Locator::new().
The default value for config is "", so Runtime::scout() does not retain the NIC information:

                            let res: Result<ScoutingMessage, DidntRead> = codec.read(&mut reader);
                            if let Ok(msg) = res {
                                if let ScoutingBody::Hello(hello) = &msg.body {

Therefore, my suggestion is to modify each hello.locators[] entry by appending the specific interface via #iface=<iface found>.
Zenoh already provides such a lookup: zenoh_util::net::get_interface_names_by_addr(local_addr.ip())

The only pity is that get_interface_names_by_addr() can return multiple NICs; for now, i.e. for the sake of discussion, I fixed it by taking the first one.

A similar issue should be observable with IPv4LL alone, if another machine C were connected to machine A's eth0, so that two link-local networks are in use: one between A and C, and one between A and B.

Bottom line: scouting has an issue when multiple NICs are present and the IP routing table is inconclusive (as it is with IPv4LL/IPv6LL). However, Zenoh receives every Hello response on a particular socket and therefore has the capability to determine which NIC the Hello actually arrived on. Zenoh only needs to keep this information and use it the moment a connection is to be established. On Linux, at least, the #iface= feature exists, which this fix leverages. Windows and macOS appear to lack this feature, so it would be wise to implement it properly for those OSes as well.
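A minimal sketch of the pinning idea described above (helper name and locator strings are illustrative, not the actual Zenoh API): a locator received in a Hello gets the receiving NIC appended as Zenoh's `#iface=<name>` config parameter.

```rust
/// Append an `iface=` config parameter to a locator string.
/// Assumes Zenoh's `proto/address#key=value;key=value` endpoint syntax;
/// `append_iface` is an illustrative helper, not a Zenoh function.
fn append_iface(locator: &str, iface: &str) -> String {
    if locator.contains('#') {
        // The locator already carries a config section: extend it.
        format!("{locator};iface={iface}")
    } else {
        format!("{locator}#iface={iface}")
    }
}

fn main() {
    // Machine A received machine B's Hello on eth1, so the IPv6LL locator
    // gets pinned to eth1 instead of relying on the routing table.
    let pinned = append_iface("tcp/[fe80::2]:7447", "eth1");
    println!("{pinned}"); // tcp/[fe80::2]:7447#iface=eth1
}
```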


PR missing one of the required labels: {'documentation', 'enhancement', 'bug', 'internal', 'dependencies', 'breaking-change', 'new feature'}


@psiegl psiegl changed the title PoC bugfix for link-local networks employed on multiple NICs PoC bugfix for IPv4LL/IPv6LL employed on multiple NICs Nov 19, 2024
@psiegl psiegl changed the title PoC bugfix for IPv4LL/IPv6LL employed on multiple NICs PoC scouting bugfix for IPv4LL/IPv6LL employed on multiple NICs Nov 20, 2024
@gabrik gabrik added the bug Something isn't working label Nov 20, 2024
@Mallets Mallets added enhancement Existing things could work better and removed bug Something isn't working labels Nov 22, 2024
@psiegl
Author

psiegl commented Nov 27, 2024

@Mallets would be great if you could re-start the CI status checks.

@Mallets
Member

Mallets commented Nov 29, 2024

@psiegl the testing and review of this PR is on my backlog, I didn't forget about it :)

@Mallets
Member

Mallets commented Dec 3, 2024

After investigating the code further, I believe we have a few road blockers:

Assumptions

However, Zenoh receives every Hello response on a particular socket and therefore has the capability to determine which NIC the Hello actually arrived on.

The Hello message is crafted only once, with all the locators included, and is sent untouched on the multicast group. This means that the message going out on a given interface may contain locators that belong to another interface.
So, until scouting is capable of crafting and sending ad-hoc messages per interface, the above assumption does not hold.
I.e., it is not guaranteed that a locator received on an interface is reachable on that interface.
This is clearly a limitation (if not a bug...) as of today.

Proposed approach

The proposed approach always adds the iface= config parameter to every locator. Given the previous point, it may happen that a global IP address is bound to an interface, overriding the IP routing table of the host. This is undesirable; the parameter should only be applied to link-local types of links.
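A sketch of the guard this implies (illustrative, not Zenoh code): only pin an interface when the address is actually link-local, so global addresses keep following the host's routing table. IPv4LL lives in 169.254.0.0/16 (RFC 3927) and IPv6LL in fe80::/10 (RFC 4291).

```rust
use std::net::IpAddr;

/// True iff the address is link-local: 169.254.0.0/16 for IPv4,
/// fe80::/10 for IPv6. Only such addresses would get `#iface=` appended.
fn is_link_local(addr: &IpAddr) -> bool {
    match addr {
        IpAddr::V4(v4) => {
            let o = v4.octets();
            o[0] == 169 && o[1] == 254
        }
        // fe80::/10 means the top 10 bits of the first segment are 1111111010.
        IpAddr::V6(v6) => (v6.segments()[0] & 0xffc0) == 0xfe80,
    }
}

fn main() {
    let ll: IpAddr = "fe80::2".parse().unwrap();
    let global: IpAddr = "2001:db8::1".parse().unwrap();
    // Only the link-local address would receive an `#iface=` parameter.
    println!("{} {}", is_link_local(&ll), is_link_local(&global)); // true false
}
```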

Considerations

  1. The orchestrator is agnostic to any specific type of link. Adding such information would require some extension in the link implementation, plus a way to bubble this information up in an abstract manner so that the interface binding can be handled properly.

  2. Support for interface binding is limited by tokio, where only Android, Fuchsia, or Linux are supported at the time of writing. So even if all of the above is implemented, its applicability remains limited to Linux for the time being. I hardly see Zenoh adopting anything other than tokio now.

  3. Does it make sense to support link-local in scouting at all? While reviewing this PR and analysing the current status, I've started wondering whether link-local locators in multicast scouting have any real applicability. Moreover, scouting in Zenoh is not limited to multicast scouting but also includes gossip scouting (a sort of in-band scouting). In that case it would be impossible to determine the right interface.

So, do you have a more detailed use case for link-local addresses in combination with scouting? What would be a case where only link-local addresses are configured and no global addresses are available? In the end, should link-local locators simply be removed from scouting?

@psiegl
Author

psiegl commented Dec 3, 2024

Well, as mentioned in my very first statement:

This might be a niche issue, but it is nevertheless a bug in Zenoh.

Maybe we should emphasise the "niche"; based on our discussion it seems to be more of an enhancement.

I also noticed that the Hello message gathers all available locators and sends back a single message. To me, this looks more like a performance optimisation than the limitation you describe:
Imagine a node with multiple NICs, each connected to a dedicated network segment (e.g. one via fibre, another via a mobile connection). Another node might be connected to the same set of segments. The first Hello received then gives the scouting node a hint that there may also be other routes to reach the Helloing node. (In such a setup, Zenoh could act similarly to Multipath TCP or link aggregation.)
Thus, the scouting node at least needs to determine which of the Locators given in the Hello is actually the one that came in via the current NIC. The other locators it could keep, e.g. for later redundancy or performance optimisation.

However, gossip scouting does seem to be a real problem (if I understood correctly).

I guess Zenoh is completely fine with a single pure link-local network by itself, or with a node that has two NICs where the first obtains both IPv4 and IPv6 via DHCP while the second uses IPv4LL/IPv6LL.
So one could phrase it as:

Zenoh may run into an issue when there are two (or more) NICs that all use link-local addresses, i.e. IPv4LL/IPv6LL. This also applies when a DHCP router provides one of these NICs with either an IPv4 or an IPv6 address while the sibling IP is filled with a link-local address, as long as other NICs use IPv4LL/IPv6LL.

Link-local addresses matter for zero-configuration networking, which might not be the daily driver of Zenoh workloads. In my case, however, IPv4LL/IPv6LL are quite relevant due to certain requirements (see my introductory explanation).

Considering that, as of today, Zenoh could only fix this issue on Unix derivatives due to tokio, and considering that Zeroconf might not be the typical Zenoh use case: do you have an errata page for Zenoh where the above could at least be noted as a known issue?
At least I have a workaround for now: either this patch, in the case of a mixed DHCP-supplied/link-local IP on one NIC and pure link-local (i.e. Zeroconf) on the other NIC; or ensuring that the DHCP server provides IPv4/IPv6 on the one NIC while the other runs pure link-local. And yes, I would be unhappy if you removed link-local IP support :)

What do you think?

@Mallets
Member

Mallets commented Dec 4, 2024

I agree on all the zero touch networking, it's a concept I'm pretty familiar with so it wasn't hard to convince me on that one :)

Another option is to handle this directly at link level (e.g. TCP, UDP, etc.): whenever a link-local address is provided, whether by multicast/gossip scouting or by the user, Zenoh would simply try to establish a connection on each of the available interfaces in turn. If the link-local address is not reachable on a given interface, the connection establishment at link level fails and the next interface is tried. Connection establishment may take a bit more time, but it should always work.
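This per-interface fallback can be sketched as follows (illustrative, not Zenoh code): for a link-local IPv6 target, build one candidate socket address per local interface by setting the IPv6 scope id, then try each candidate in turn until one connects. Interface indices here are assumed example values.

```rust
use std::net::{Ipv6Addr, SocketAddrV6};

/// Build one candidate address per interface for a link-local target,
/// by filling in the IPv6 scope id. Real code would then attempt
/// `TcpStream::connect(cand)` on each and stop at the first success.
fn candidates(target: Ipv6Addr, port: u16, if_indices: &[u32]) -> Vec<SocketAddrV6> {
    if_indices
        .iter()
        .map(|&idx| SocketAddrV6::new(target, port, 0, idx))
        .collect()
}

fn main() {
    let target: Ipv6Addr = "fe80::2".parse().unwrap();
    // Interface indices 2 and 3 stand in for e.g. eth0 and eth1.
    for cand in candidates(target, 7447, &[2, 3]) {
        println!("would try {cand}");
    }
}
```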

This approach would also solve the issue of being forced to provide the interface name when configuring Zenoh via a config file, or when using mDNS names resolving to IPv4LL/IPv6LL.

Clearly, this will be available only on Unix derivatives due to tokio, but still better than nothing :)

Regarding known issues and errata, we track them in GitHub or directly in the documentation.

@psiegl
Author

psiegl commented Dec 4, 2024

I like your idea because it is truly simple, and likely more maintainable in the long run.

Labels
enhancement Existing things could work better

3 participants