Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ipv6 test is failing on windows #330

Closed
samoht opened this issue Aug 1, 2017 · 7 comments
Closed

ipv6 test is failing on windows #330

samoht opened this issue Aug 1, 2017 · 7 comments

Comments

@samoht
Copy link
Member

samoht commented Aug 1, 2017

$ _build/default/test/test.exe test ipv6 0
[...]
Send a UDP packet from one IPV6 stack and check it is received by another.
_build/_tests\ipv6.000.output:
test.exe: [INFO] Connected Ethernet interface 02:50:00:00:00:01
test.exe: [INFO] IP6: Starting
test.exe: [DEBUG] ND6: Sending RS
test.exe: [DEBUG] ND6: Sending NS src=:: dst=ff02::1:ff00:1 tgt=fe80::50:ff:fe00:1
test.exe: [DEBUG] no size limit, sending
test.exe: [DEBUG] no size limit, sending
test.exe: [DEBUG] ND6: Sending NS src=:: dst=ff02::1:ff00:23 tgt=fc00::23
test.exe: [DEBUG] no size limit, sending
test.exe: [INFO] UDP interface connected on
test.exe: [INFO] Connected Ethernet interface 02:50:00:00:00:02
test.exe: [INFO] IP6: Starting
test.exe: [DEBUG] ND6: Sending RS
test.exe: [DEBUG] ND6: Sending NS src=:: dst=ff02::1:ff00:2 tgt=fe80::50:ff:fe00:2
test.exe: [DEBUG] no size limit, sending
test.exe: [DEBUG] no size limit, sending
test.exe: [DEBUG] ND6: Sending NS src=:: dst=ff02::1:ff00:45 tgt=fc00::45
test.exe: [DEBUG] no size limit, sending
test.exe: [INFO] UDP interface connected on
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
test.exe: [DEBUG] ND6: Sending NS src=:: dst=ff02::1:ff00:45 tgt=fc00::45
test.exe: [DEBUG] no size limit, sending
test.exe: [DEBUG] ND: Received NS: src=:: dst=ff02::1:ff00:45 tgt=fc00::45
test.exe: [DEBUG] SLAAC: fe80::50:ff:fe00:1 --> PREFERRED
test.exe: [DEBUG] SLAAC: fc00::23 --> PREFERRED
test.exe: [DEBUG] SLAAC: fe80::50:ff:fe00:2 --> PREFERRED
test.exe: [DEBUG] SLAAC: fc00::45 --> PREFERRED
------------------------------------------------------------------------------------------------------------------------------
ASSERT UDP packet should have been received
------------------------------------------------------------------------------------------------------------------------------
Test error: Error UDP packet should have been received.
samoht added a commit to samoht/vpnkit that referenced this issue Aug 1, 2017
because of mirage/mirage-tcpip#330 and mirage/mirage-tcpip#331

Signed-off-by: Thomas Gazagnaire <thomas@gazagnaire.org>
samoht added a commit to samoht/vpnkit that referenced this issue Aug 3, 2017
because of mirage/mirage-tcpip#330 and mirage/mirage-tcpip#331

Signed-off-by: Thomas Gazagnaire <thomas@gazagnaire.org>
@djs55
Copy link
Member

djs55 commented Aug 4, 2017

This test case seems to be pure OCaml using mirage-vnetif -- I can't see any system dependency. Yet it does fail reliably on Windows for me too. (/cc @MagnusS in case you have any ideas!)

@djs55
Copy link
Member

djs55 commented Aug 4, 2017

Here's a diff between a working trace on Linux and a broken trace on Windows:

--- working     2017-08-04 18:03:55.000000000 +0100
+++ broken      2017-08-04 18:04:09.000000000 +0100
@@ -6,7 +6,7 @@
 test.exe: [DEBUG] no size limit, sending
 test.exe: [DEBUG] ND6: Sending NS src=:: dst=ff02::1:ff00:23 tgt=fc00::23
 test.exe: [DEBUG] no size limit, sending
-test.exe: [INFO] UDP interface connected on 
+test.exe: [INFO] UDP interface connected on
 test.exe: [INFO] Connected Ethernet interface 02:50:00:00:00:02
 test.exe: [INFO] IP6: Starting
 test.exe: [DEBUG] ND6: Sending RS
@@ -15,19 +15,12 @@
 test.exe: [DEBUG] no size limit, sending
 test.exe: [DEBUG] ND6: Sending NS src=:: dst=ff02::1:ff00:45 tgt=fc00::45
 test.exe: [DEBUG] no size limit, sending
-test.exe: [INFO] UDP interface connected on 
+test.exe: [INFO] UDP interface connected on
+test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
+test.exe: [DEBUG] ND6: Sending NS src=:: dst=ff02::1:ff00:45 tgt=fc00::45
+test.exe: [DEBUG] no size limit, sending
+test.exe: [DEBUG] ND: Received NS: src=:: dst=ff02::1:ff00:45 tgt=fc00::45
 test.exe: [DEBUG] SLAAC: fe80::50:ff:fe00:1 --> PREFERRED
 test.exe: [DEBUG] SLAAC: fc00::23 --> PREFERRED
 test.exe: [DEBUG] SLAAC: fe80::50:ff:fe00:2 --> PREFERRED
 test.exe: [DEBUG] SLAAC: fc00::45 --> PREFERRED
-test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
-test.exe: [DEBUG] ND6: Sending NS src=fc00::23 dst=ff02::1:ff00:45 tgt=fc00::45
-test.exe: [DEBUG] no size limit, sending
-test.exe: [DEBUG] ND: Received NS: src=fc00::23 dst=ff02::1:ff00:45 tgt=fc00::45
-test.exe: [DEBUG] ND6: Sending NA: src=fc00::45 dst=fc00::23 tgt=fc00::45 sol=true
-test.exe: [DEBUG] IP6: Sending packet: dst=fc00::23 mac=02:50:00:00:00:01
-test.exe: [DEBUG] no size limit, sending
-test.exe: [DEBUG] ND: Received NA: src=fc00::45 dst=fc00::23 tgt=fc00::45
-test.exe: [INFO] NUD: fc00::45 --> REACHABLE
-test.exe: [DEBUG] IP6: Releasing queued packets: dst=fc00::45 mac=02:50:00:00:00:02
-test.exe: [DEBUG] no size limit, sending

With my completely untrained eye I observe

  • the broken trace Queues the packet earlier, before the SLAAC stuff
  • the working trace has ND6: Sending NA: src=fc00::45 dst=fc00::23 tgt=fc00::45 sol=true and ten IP6: Sending packet

So it looks like (again I'm guessing here, not understanding IPv6 at all) that the broken trace has the packet sent before some internal routes are set up and the packet gets dropped? The working trace where the packet is sent a bit later after the SLAAC stuff seems to work better.

Perhaps the delays can be tweaked to work around this. It would be nice to have a way of waiting for the system to be stable rather than sleep :/

djs55 added a commit to djs55/mirage-tcpip that referenced this issue Aug 4, 2017
There's a startup delay where the stack isn't quite listening/connected/
ready to send traffic which we can't know in advance. Rather than guess
a delay, this patch makes the test case send UDP every 50s and then stop
when the first packet is received successfully.

Fixes mirage#330

Signed-off-by: David Scott <dave@recoil.org>
@djs55
Copy link
Member

djs55 commented Aug 4, 2017

I tried to make the test more reliable by

  • removing the initial delay (the one with the comment (* Duration.of_ms 500 makes this test fail - why? *))
  • sending UDP every 50ms
  • arranging for the test to succeed when the first packet is received

The idea is that we don't really know what the initial startup time needed is, before which traffic is presumably being dropped.

However this still doesn't work -- possibly related to the comment referenced above -- it seems that if one packet is sent early, no traffic will ever be received. Check out this trace:

s$ _build/default/test/test.exe test ipv6 -v
test.exe: [INFO] Connected Ethernet interface 02:50:00:00:00:01
test.exe: [INFO] IP6: Starting
test.exe: [DEBUG] ND6: Sending RS
test.exe: [DEBUG] ND6: Sending NS src=:: dst=ff02::1:ff00:1 tgt=fe80::50:ff:fe00:1
test.exe: [DEBUG] no size limit, sending
test.exe: [DEBUG] no size limit, sending
test.exe: [DEBUG] ND6: Sending NS src=:: dst=ff02::1:ff00:23 tgt=fc00::23
test.exe: [DEBUG] no size limit, sending
test.exe: [INFO] UDP interface connected on 
test.exe: [INFO] Connected Ethernet interface 02:50:00:00:00:02
test.exe: [INFO] IP6: Starting
test.exe: [DEBUG] ND6: Sending RS
test.exe: [DEBUG] ND6: Sending NS src=:: dst=ff02::1:ff00:2 tgt=fe80::50:ff:fe00:2
test.exe: [DEBUG] no size limit, sending
test.exe: [DEBUG] no size limit, sending
test.exe: [DEBUG] ND6: Sending NS src=:: dst=ff02::1:ff00:45 tgt=fc00::45
test.exe: [DEBUG] no size limit, sending
test.exe: [INFO] UDP interface connected on 
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
test.exe: [DEBUG] ND6: Sending NS src=:: dst=ff02::1:ff00:45 tgt=fc00::45
test.exe: [DEBUG] no size limit, sending
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
test.exe: [DEBUG] SLAAC: fe80::50:ff:fe00:1 --> PREFERRED
test.exe: [DEBUG] SLAAC: fc00::23 --> PREFERRED
test.exe: [DEBUG] SLAAC: fe80::50:ff:fe00:2 --> PREFERRED
test.exe: [DEBUG] SLAAC: fc00::45 --> PREFERRED
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45
Udp.write
test.exe: [DEBUG] IP6: Queueing packet: dst=fc00::45

If I put the initial delay back in, then it works as expected. I suspect the stack gets into a stuck state if traffic is sent too early.

@MagnusS
Copy link
Member

MagnusS commented Aug 4, 2017

@djs55 had a quick look at the mirage-vnetif setup. The test is using the default backend which will just deliver all sent packets in async -- so even if the callback function blocks the next packet should be delivered.

There's a race in Lwt.pick between the receiver and sender, afaik we don't have a mechanism to reliably see when the receiver is up - but 500ms seems like it should be more than enough... We use a delay in the ipv4 tcp connect test too (but a shorter one): https://github.com/mirage/mirage-tcpip/blob/master/test/test_connect.ml#L69.

djs55 added a commit to djs55/mirage-tcpip that referenced this issue Aug 4, 2017
There's a startup delay where the stack isn't quite listening/connected/
ready to send traffic which we can't know in advance. Rather than guess
a delay, this patch makes the test case send UDP every 50s and then stop
when the first packet is received successfully.

Fixes mirage#330

Signed-off-by: David Scott <dave@recoil.org>
@djs55
Copy link
Member

djs55 commented Aug 4, 2017

Thanks for taking a look @MagnusS .

When I improved the unit test to send packets in a loop (every 50ms) it turned out that, if I send a packet early, the stack never works. It seems there's a logic bug which prevents outgoing traffic ever being sent (to that address) as soon as the first packet is queued.

I've made a candidate "fix" in the linked PR, but I don't really know what I'm doing with v6 so it might be nonsense! The test case improvement patch hits the bug reliably though.

@djs55
Copy link
Member

djs55 commented Aug 5, 2017

OK I think I've figured this out. When sending packets to a local address, we use RFC2461's neighbor discovery. A "Neighbor Solicitation" message is sent too early and is dropped -- this is perfectly ok. It is supposed to be resent after a "Retrans Timer" timeout but the initial INCOMPLETE state is accidentally set with a timeout based on the "Reachable Time" timeout. The "Retrans Timer" is the time between probes and is about 1s; the "Reachable Time" timeout is the time for which to assume a confirmed neighbor cache entry is still valid and is about 30s. The code did correctly use the "Retrans Timer" timeout for second and subsequent probes, it's just the initial timeout which looks wrong (probably a typo).

@nojb
Copy link
Contributor

nojb commented Aug 6, 2017

It's been a while since I worked on this, but this sounds about right to me.

djs55 added a commit to djs55/mirage-tcpip that referenced this issue Feb 14, 2018
There's a startup delay where the stack isn't quite listening/connected/
ready to send traffic which we can't know in advance. Rather than guess
a delay, this patch makes the test case send UDP every 50s and then stop
when the first packet is received successfully.

Fixes mirage#330

Signed-off-by: David Scott <dave@recoil.org>
djs55 added a commit to djs55/mirage-tcpip that referenced this issue Feb 14, 2018
There's a startup delay where the stack isn't quite listening/connected/
ready to send traffic which we can't know in advance. Rather than guess
a delay, this patch makes the test case send UDP every 50s and then stop
when the first packet is received successfully.

Fixes mirage#330

Signed-off-by: David Scott <dave@recoil.org>
@avsm avsm closed this as completed in #334 Dec 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants