Don't drop Binding Requests in Controlled Agent #426
Codecov Report
@@            Coverage Diff             @@
##           master     #426      +/-   ##
==========================================
- Coverage   78.79%   78.43%   -0.37%
==========================================
  Files          34       34
  Lines        3882     3882
==========================================
- Hits         3059     3045      -14
- Misses        636      648      +12
- Partials      187      189       +2
@jech I have it now! Mind trying again? The problem would show up if Firefox made another request before it responded to Pion. I was able to reproduce it by adding some arbitrary delays, then debugged on the Firefox side.
A controlled Agent would discard incoming Binding Requests if they didn't cause the pair to be selected. For UDP Candidates this would be interpreted as packet loss. For TCP Candidates, not responding with a Binding Success could be interpreted as a failure. Firefox's ICE Agent would disconnect TCP Candidates because of this behavior.
Resolves to pion/webrtc#2125
Resolves to pion/webrtc#1356
See https://bugzilla.mozilla.org/show_bug.cgi?id=1756460
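The difference described in the commit message can be illustrated with a small model. This is only a sketch under stated assumptions: the `agent` and `bindingRequest` types and both handler functions are hypothetical and do not mirror pion/ice's actual API or selection logic. It only shows why a peer that repeats its check gets no answer in the old behavior.

```go
package main

import "fmt"

// bindingRequest is a hypothetical, simplified inbound STUN Binding Request.
type bindingRequest struct {
	pairID       string // identifies the candidate pair the request arrived on
	useCandidate bool   // true if the USE-CANDIDATE attribute was present
}

// agent is a hypothetical controlled ICE agent; selected is empty until a
// pair has been chosen.
type agent struct {
	selected string
}

// trySelect selects the pair if the request nominates it and nothing is
// selected yet. It reports whether this request caused the selection.
func (a *agent) trySelect(req bindingRequest) bool {
	if req.useCandidate && a.selected == "" {
		a.selected = req.pairID
		return true
	}
	return false
}

// handleDropping models the old behavior: a Binding Success is only sent
// when the request caused the pair to be selected. The return value reports
// whether a response was sent.
func (a *agent) handleDropping(req bindingRequest) bool {
	return a.trySelect(req)
}

// handleResponding models the fixed behavior: selection is still attempted,
// but every Binding Request is answered.
func (a *agent) handleResponding(req bindingRequest) bool {
	a.trySelect(req)
	return true
}

func main() {
	req := bindingRequest{pairID: "tcp-1", useCandidate: true}

	oldAgent, newAgent := &agent{}, &agent{}
	oldAnswered, newAnswered := 0, 0
	// The remote peer repeats the same check three times.
	for i := 0; i < 3; i++ {
		if oldAgent.handleDropping(req) {
			oldAnswered++
		}
		if newAgent.handleResponding(req) {
			newAnswered++
		}
	}
	fmt.Printf("old behavior answered %d of 3 requests\n", oldAnswered)
	fmt.Printf("new behavior answered %d of 3 requests\n", newAnswered)
}
```

Over UDP the unanswered requests in the first model look like ordinary packet loss; over TCP, where loss is not expected, a peer may treat them as a failed candidate.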
4bb5d2c to fca128a
Very slight improvement:
@jech does http://18.216.219.191:8080/ work for you on Firefox (Pion is the answerer)? I will try the other configurations now!
Yes, it works fine. What I suspect is going on is that I'm not disabling the other traffic types, I'm just dropping traffic so that ICE has to fall back to TCP after a timeout. If the timeouts are wrong, then ICE will switch to failed before it has had time to go through all of the candidates.
Oh yes that makes sense! Let me try that now also. In my example I am only allowing TCP.
Yes.
Both. Another element: if I look in Firefox's |
@jech To make this easier to reproduce I started https://github.com/Sean-Der/ice-tcp-test. It allows the browser to act as either offerer or answerer. Right now it only does non-Trickle ICE. It is available at http://18.216.219.191:8080/ again.
This example works for me in Firefox/Brave/Chrome. I will add Trickle tomorrow and hopefully I will be able to reproduce! Thank you so much for your patience on this. I am really excited by all the progress we have made on it so far. Two very long-standing bugs fixed.
lgtm!
I'm not seeing any IPv6 candidates on that server.
@jech oh good point! I moved it to http://143.198.72.93:8080/. All tests were done with Firefox and Chrome on macOS. I have IPv4 and IPv6 connectivity working.
Everything seems to be working for me. Mind testing again?
That's bizarre... I've successfully blocked UDP, and everything works as expected (host-reflexive TCP candidate). Let me compare with Galene.
Glad to hear it works! Thank you so much for all your patience/debugging help on this. High point of my week to have these longstanding issues finally figured out. I have been a bit hesitant to take these on...
Bizarrer and bizarrer. Your code works fine even when run on my server. I've tried copying all the details from your code into Galene, I've tried disabling trickle ICE in Galene, I've tried disabling ICE restarts in Galene, and it still fails. I'm at a loss.
@jech I am going to merge + pin, and then will deploy Galene to my server! I don't have any good debugging ideas. I spent a lot of time printf debugging Firefox; there is no good single source of info for this stuff :/
It actually starts failing after a while: the first one or two TCP candidates work, then they start failing. Apparently, the only way to fix the issue is to restart the server. The sockets are properly closed, the new connections are accepted, but no data flows. For now, you may test at https://galene.org:8444/group/public/ (note the non-standard port; port 8443 is running mainline Galene). To test on your own server, do
@jech Does adding a replace statement make a difference for you?
I tried locally and got the same thing, but I believe the replace statement fixed it! I don't know Go modules well enough to dump what exact versions are being used/dependency tree.
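One way to check which module versions actually ended up in a built binary, including any replace directives that took effect, is the standard library's `runtime/debug.ReadBuildInfo`. The following is a minimal sketch, not something taken from this thread; `formatModule` is a hypothetical helper introduced only for illustration.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// formatModule renders one dependency as "path version". When a replace
// directive is in effect, the replacement is appended after " => ".
// replaceVersion may be empty, e.g. for a local directory replacement.
func formatModule(path, version, replacePath, replaceVersion string) string {
	line := path + " " + version
	if replacePath == "" {
		return line
	}
	line += " => " + replacePath
	if replaceVersion != "" {
		line += " " + replaceVersion
	}
	return line
}

func main() {
	// ReadBuildInfo reports the module information embedded in the running
	// binary; ok is false if the binary was built without module support.
	info, ok := debug.ReadBuildInfo()
	if !ok {
		fmt.Println("no build info available")
		return
	}
	for _, dep := range info.Deps {
		replacePath, replaceVersion := "", ""
		if dep.Replace != nil {
			replacePath, replaceVersion = dep.Replace.Path, dep.Replace.Version
		}
		fmt.Println(formatModule(dep.Path, dep.Version, replacePath, replaceVersion))
	}
}
```

From the command line, `go list -m all` prints the resolved build list for the current module and `go mod graph` prints the dependency graph.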
Not for me, unfortunately. I'm seeing the same results as above: good behaviour for the first few connections after server restart, then ICE failures until I restart the server. (Which is actually a good thing: an ICE failure is something we can easily report to the user, unlike the erratic behaviour we were suffering from before your patches.) Very weird. Perhaps a data structure somewhere that accumulates obsolete data?
@jech I just tested 100 PeerConnections with Will test with Galene now |