IPE when testing connection timeout on Windows #468
Comments
It looks like nobody uses SRT in async mode on Windows :)
I remember which change caused this message to be printed. We check the status code returned by the UDP receive call. If you turn on debug logging, you should see the line printed by the following instruction, which should show the exact return code:
It would be nice if you could capture this error code; treating it as fatal is probably excessive. This is the list of non-success codes treated as fatal errors:
The error is 10054 - WSAECONNRESET.
BTW, there is a note for WSAECONNRESET on the https://docs.microsoft.com/en-us/windows/desktop/api/winsock2/nf-winsock2-wsarecvfrom page: "For a UDP datagram socket, this error would indicate that a previous send operation resulted in an ICMP "Port Unreachable" message." This may be related to the issue.
OK, if you can, please test commenting this out of the array above (it's Looks weird, really. "Previous send operation resulted in (...)" as the reason for an error in a receiving operation... It might be that this list should be seriously revised.
Commenting out channel.cpp:547 (the WSAECONNRESET entry in the "fatals" array) fixes the issue for me (I actually mentioned it in the first comment).
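For readers following along, the idea of a "fatals" list can be sketched roughly as below. This is only an illustration, not the code in channel.cpp: the enum, the function name `classify_recv_error`, and the choice of error codes in both lists are assumptions (only the name RST_AGAIN and the WSAECONNRESET entry come from this thread). Any code not in the list is treated as "try again later" instead of breaking the connection.

```cpp
#include <algorithm>
#include <cerrno>
#include <iterator>

#ifdef _WIN32
#include <winsock2.h>
#endif

// Hypothetical status values; only the name RST_AGAIN appears in this thread.
enum ReadStatus
{
    RST_OK,    // a datagram was read
    RST_AGAIN, // nothing usable right now: keep the connection, retry later
    RST_ERROR  // unrecoverable: the caller should break the connection
};

// Hypothetical helper: map the error code of a failed UDP receive call to a
// status. The lists below are illustrative and are NOT SRT's actual lists.
inline ReadStatus classify_recv_error(int err)
{
#ifdef _WIN32
    // WSAECONNRESET (10054) is deliberately absent: on a UDP socket it only
    // says that an earlier send provoked an ICMP "Port Unreachable".
    static const int fatals[] = { WSAEFAULT, WSAEINVAL, WSAENETDOWN,
                                  WSANOTINITIALISED };
#else
    // ECONNREFUSED is deliberately absent for the same reason.
    static const int fatals[] = { EBADF, EFAULT, EINVAL, ENOTSOCK };
#endif
    const int* first = std::begin(fatals);
    const int* last = std::end(fatals);
    return std::find(first, last, err) != last ? RST_ERROR : RST_AGAIN;
}
```

With this shape, removing an entry from the list (as done for WSAECONNRESET here) is enough to turn a connection-breaking error into a retry.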
Thanks for the reply, @ethouris.
@maxtomilov has already checked this and it works as expected after the fix. Do you think we need to push this fix?
Maybe it's a deferred error that the Windows kernel can report only on the next recv operation? I'm just guessing :)
We want to release the new 1.3.1 version of Nimble Streamer soon. When do you plan to inspect this code? We have regression and unit tests for SRT, and they found the problem.
OK, so push the fix, please. And many thanks for finding it. Please note that whatever is visible in the epoll flags, the call to the function that reads a UDP packet is completely asynchronous to it. Packets in SRT are first received from the UDP socket (and stored in the receiver buffer); they become ready for extraction once the ACK is sent for them (this is the original UDT design, and not even the most desirable thing in live mode, but we haven't planned any changes here yet); and then the TsbPd thread, when the time comes, makes them ready for delivery. The ACK action is performed in the RcvQ:worker thread (the same one that reads from the UDP socket) just once per timed event, and TsbPd is its own thread. But once an error is reported from the UDP receiving function, the connection is immediately broken, no matter what is currently happening in other contexts.
@ethouris, we're always ready to find, fix and contribute!
Thanks :) Well, the way the Windows API functions behave probably makes sense; there's just no clear statement of exactly which errors signal an inability to continue (retrying makes no sense, or will result in the same error anyway) and which are merely informative about the current situation, which may or may not be crucial to your program. The Windows API lacks that clear information, or maybe I wasn't smart enough to find it.
Maybe it's just my view of the Windows API. Could you please bless our commit :)
It is blessed.
I am not sure this is the correct conclusion: returning an error from recv() because of an incoming ICMP error packet that resulted from a send() on the same socket is basically correct behavior. Linux should be doing the same thing; see https://tools.ietf.org/html/rfc1122, section 4.1.3.3. The difference is that Windows returns WSAECONNRESET and Linux should return ECONNREFUSED, but the handling in SRT code is currently the same for either. Have you confirmed that in your Linux runs you get an ICMP error back when running this test?
Yes, I see ICMP messages with "Destination unreachable (Port unreachable)" info after each sendmsg() call while running the test, but recvmsg() always returns EAGAIN, not ECONNREFUSED.
Hello @alexpokotilo, here is a quick test app that demonstrates that ECONNREFUSED can be returned from recvmsg() on Linux. The app attempts to replicate the socket calls made by the SRT stack when your test case is executed: Somehow we do not (usually) get to that condition in SRT on Linux, but the possibility of this return code remains, as demonstrated by the above sample, and it should be handled in the same way as WSAECONNRESET on Windows. Could you please add similar handling for ECONNREFUSED to your PR? It should result in an RST_AGAIN status return on Linux/OSX, same as for WSAECONNRESET on Windows. Thank you!
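The test app from the comment above is not preserved in this thread. As a rough stand-in, here is a minimal POSIX sketch of one way a receive call on a UDP socket can report ECONNREFUSED on Linux. It is not the original app and makes its own assumptions: the helper name `probe_closed_udp_port` is hypothetical, it uses a connected UDP socket on loopback with send()/recv() rather than SRT's actual sendmsg()/recvmsg() sequence, and on an unconnected socket the outcome may differ (which may be why EAGAIN was observed in the SRT test).

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#include <cerrno>
#include <cstring>

// Hypothetical helper: send one datagram to a closed loopback port and report
// what the following recv() says. Returns 0 if recv() unexpectedly succeeded,
// the errno of recv() if it failed, or -1 if the setup itself failed.
int probe_closed_udp_port()
{
    // Let the kernel pick a free port and remember it.
    int tmp = socket(AF_INET, SOCK_DGRAM, 0);
    if (tmp < 0)
        return -1;

    sockaddr_in addr;
    std::memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;
    socklen_t len = sizeof addr;
    if (bind(tmp, (sockaddr*)&addr, sizeof addr) < 0 ||
        getsockname(tmp, (sockaddr*)&addr, &len) < 0)
    {
        close(tmp);
        return -1;
    }

    // Connect the probing socket while the port is still held, so that it
    // cannot be given the same local port; UDP connect() sends nothing.
    int s = socket(AF_INET, SOCK_DGRAM, 0);
    if (s < 0 || connect(s, (sockaddr*)&addr, sizeof addr) < 0)
    {
        if (s >= 0)
            close(s);
        close(tmp);
        return -1;
    }
    close(tmp); // from now on nothing listens on the target port

    const char payload[] = "ping";
    if (send(s, payload, sizeof payload, 0) < 0)
    {
        close(s);
        return -1;
    }

    // Give the ICMP "Port Unreachable" up to 500 ms to arrive.
    pollfd pfd;
    pfd.fd = s;
    pfd.events = POLLIN;
    pfd.revents = 0;
    poll(&pfd, 1, 500);

    char buf[64];
    ssize_t n = recv(s, buf, sizeof buf, MSG_DONTWAIT);
    int err = (n < 0) ? errno : 0;
    close(s);
    return err;
}
```

On a typical Linux host the returned value is expected to be ECONNREFUSED; if the ICMP message is filtered or never generated, recv() reports EAGAIN instead. Either way the receiving side gets no datagram, which is the argument for mapping ECONNREFUSED to a retry status instead of a fatal one.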
Hello @rndi, the pull request has been updated with appropriate changes for handling ECONNREFUSED.
connection_timeout_test.zip
Trying to test the SRT connection timeout on Windows results in the following output after the srt_connect call:
Code from connection_timeout_test.cpp (sample is attached):
Removing the handling of WSAECONNRESET as a fatal error after the WSARecvFrom call in srtcore/channel.cpp seems to fix the issue, but I'm not sure whether that is correct for all cases.