gnrc_tcp: gnrc_tcp_recv() never generates -ECONNABORTED #17896
Comments
I'll take a look into this.
Hi @benpicco, after taking a look into the code: this behavior works as intended and is in line with the TCP RFC. In TCP, it is valid for the wire to be silent for long periods of time. There is no mechanism intended to check whether the endpoints are available outside of the execution time of the given TCP functions. Receive acts only on incoming data; it does not send anything. For a receive call, it is therefore not possible to determine whether the remote endpoint is available. Each time a user calls receive, it is implied that there was or will be a data transmission in the near future. This leads to the following behaviors:
This basically means: if you receive either non-blocking or with a user timeout, it is the user's responsibility to figure out whether the remote endpoint is still there. One simple way to figure this out is a call to send() with an empty payload. This causes the remote endpoint to send an acknowledgment in reply. If this acknowledgment is received (and send therefore does not time out), we know that the endpoint is available. That being said, after taking a look into the gnrc_tcp code base, send returns immediately if the payload is empty. From my point of view, this is the real issue here and it should be solved. Any thoughts on that?
I tried with LWIP (#17899) and it manages to detect that the remote disconnected after ~10s.
It might be possible to accumulate timeouts between multiple receive attempts and to check whether the accumulated value is greater than the connect timeout, but I need to think through whether this is a legitimate way to solve this. @benpicco Can LWIP detect a disconnect if you read non-blocking? Might be interesting.
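A minimal sketch of the accumulation idea, written as a standalone helper so it can be tried outside of RIOT. The names `idle_tracker_t`, `idle_tracker_init()` and `idle_tracker_update()` are hypothetical and not part of gnrc_tcp or the sock API; the sketch only assumes that the receive call reports a user timeout as `-ETIMEDOUT` and a successful read as a positive byte count.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of gnrc_tcp): tracks how long a connection
 * has been silent across several short receive attempts. */
typedef struct {
    uint32_t idle_ms;          /* silence accumulated since the last data */
    uint32_t conn_timeout_ms;  /* e.g. the configured connection timeout */
} idle_tracker_t;

static void idle_tracker_init(idle_tracker_t *t, uint32_t conn_timeout_ms)
{
    assert(t != NULL);
    t->idle_ms = 0;
    t->conn_timeout_ms = conn_timeout_ms;
}

/* Feed the result of one receive attempt into the tracker.
 *   res:             return value of the receive call
 *   user_timeout_ms: the user timeout that call was given
 * Returns res unchanged, except that -ETIMEDOUT is turned into
 * -ECONNABORTED once the accumulated silence reaches conn_timeout_ms. */
static int idle_tracker_update(idle_tracker_t *t, int res,
                               uint32_t user_timeout_ms)
{
    assert(t != NULL);
    if (res > 0) {
        /* Data arrived, so the remote was reachable: start over. */
        t->idle_ms = 0;
        return res;
    }
    if (res == -ETIMEDOUT) {
        /* Saturating add so a long-lived tracker cannot wrap around. */
        uint32_t room = UINT32_MAX - t->idle_ms;
        t->idle_ms += (user_timeout_ms < room) ? user_timeout_ms : room;
        if (t->idle_ms >= t->conn_timeout_ms) {
            return -ECONNABORTED;
        }
    }
    return res;
}
```

With a 10 s connection timeout and 4 s user timeouts, the third consecutive silent receive would be reported as `-ECONNABORTED`. Note that this counts only silence seen by receive, so it would also abort a connection that is alive but idle, which is the concern raised above about silence being valid in TCP.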
Yes.
--- a/sys/net/application_layer/telnet/telnet_server.c
+++ b/sys/net/application_layer/telnet/telnet_server.c
@@ -204,7 +204,7 @@ static void *telnet_thread(void *arg)
uint8_t is_option = 0;
while (1) {
_acquire();
- res = sock_tcp_read(client, rx_buf, sizeof(rx_buf), SOCK_TCP_TIMEOUT_MS);
+ res = sock_tcp_read(client, rx_buf, sizeof(rx_buf), SOCK_NO_TIMEOUT);
_release();
if (res == -ETIMEDOUT) {
continue;
Interesting. I am reading the TCP RFC and, as far as I can see, the timeout handling here is up to the implementation. Therefore the above approach seems to be valid. I'll try to come up with a solution, although the next weeks are busy :/. Thanks for finding this.
Looks like TCP Keep-Alive packets are a thing.
Description
The connection timeout (_sched_connection_timeout()) is only scheduled inside functions and removed when exiting the function, e.g. in gnrc_tcp_recv().
That means that if the user timeout supplied is smaller than the connection timeout, the connection timeout will never be generated.
This is a problem if the user timeout is set to a low value for interactive applications.
Those can never detect if the connection is lost.
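The effect can be illustrated with a toy model. This is not the gnrc_tcp code: `model_recv_silent()` and `model_poll_detects_loss()` are hypothetical names, and the model only assumes what is described above, namely that the connection timer is armed when the receive call is entered and removed when it returns, so it starts from zero on every call. How a tie between the two timeouts is resolved is an arbitrary choice of the model.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Toy model of ONE receive call on a connection where no packets arrive.
 * Both timers start when the call is entered, so whichever is shorter
 * fires first. Ties are resolved in favour of the connection timeout,
 * which is an assumption of this model only. */
static int model_recv_silent(uint32_t user_timeout_ms,
                             uint32_t conn_timeout_ms)
{
    return (user_timeout_ms < conn_timeout_ms) ? -ETIMEDOUT : -ECONNABORTED;
}

/* Polls the silent connection max_calls times in a row.
 * Returns 1 if any call reported -ECONNABORTED, 0 otherwise. */
static int model_poll_detects_loss(uint32_t user_timeout_ms,
                                   uint32_t conn_timeout_ms,
                                   unsigned max_calls)
{
    assert(max_calls > 0);
    for (unsigned i = 0; i < max_calls; i++) {
        /* The connection timer restarts here on every iteration. */
        if (model_recv_silent(user_timeout_ms, conn_timeout_ms)
            == -ECONNABORTED) {
            return 1;
        }
    }
    return 0;
}
```

In this model, `model_poll_detects_loss(100, 10000, 1000)` returns 0: 1000 polls of 100 ms each cover 100 s of silence, ten times the connection timeout, yet no call ever reports the loss. Only a user timeout at least as long as the connection timeout lets a single call run long enough for the connection timer to fire.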
Expected results
If there was no interaction with the remote within CONFIG_GNRC_TCP_CONNECTION_TIMEOUT_DURATION_MS, the connection must be terminated.
Actual results
The connection is kept open indefinitely.
Steps to reproduce the issue
Consider the telnet server:
To ensure interactivity there is a low user timeout (otherwise the sock_tcp_read() call will block until the whole rx_buf is filled). If the remote connection is lost, the connection will be terminated to listen for a new connection, otherwise try to read more bytes.
For easier reproducibility I lowered CONFIG_GNRC_TCP_CONNECTION_TIMEOUT_DURATION_MS to 10s.
I ran examples/telnet_server on same54-xpro. The board has an Ethernet interface, so loss of network connection can be forced by disconnecting the cable.
telnet fe80::fec2:3dff:fe23:22df%eno1
CONFIG_GNRC_TCP_CONNECTION_TIMEOUT_DURATION_MS)
telnet fe80::fec2:3dff:fe23:22df%eno1
Versions