Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No TCP backpressure relief on Linux? #1539

Closed
pdtwonotes opened this issue Jan 28, 2017 · 10 comments
Closed

No TCP backpressure relief on Linux? #1539

pdtwonotes opened this issue Jan 28, 2017 · 10 comments

Comments

@pdtwonotes
Copy link
Contributor

My program gets 'throttled' by TCP, but never 'unthrottled'. In tcp_connection.pony there is a function _pending_writes that is the only place the _release_backpressure function gets called.

  1. unthrottled notification is only called from _release_backpressure
  2. _release_backpressure is only called form two places, lines 501 and 529. Line 501 is inside an ifdef windows so will never be called. Line 529 is inside _pending_writes.
  3. There are two calls to pending_writes, at lines 373 and 393. Line 373 is inside a not connected test and would seem not applicable. That leaves line 393.

I put in some Debug.out calls at line 393 (just as the event is decided to be 'ours') and inside _pending_writes. Neither one is called, so my program never gets the unthrottled notifcation, until I abort the transfer from the other end. The program at the other end is wget.

I am trying to track down if this changed recently.

@SeanTAllen
Copy link
Member

It shouldn't unthrottle until it empties it's buffer. If you continue to send data while its being throttled, ti will continue to report that its throttled. There's nothing in the current code to prevent you from continuing to send data to a TCPConnection that is being throttled. I suspect that is what you are doing. Throttling on writes requires you do communicate back to senders that throttling is occurring and have them pause.

@pdtwonotes
Copy link
Contributor Author

Yes I realize that, and I do stop sending when I get the 'throttled' notification. But the trouble is that I never get the 'unthrottled' notification when it does empty the buffer. Well, I assume it empties the buffer; the wget program that is receiving the data stalls, waiting for more data.

I put in more Debug calls in tcp_connection. The original throttled notifcation arises because the returned count of bytes written is less than those submitted. But after that it is not receiving any "events" at all, until I terminate the connection from the receiving side - it does see that.

@SeanTAllen
Copy link
Member

So Epoll doesn't send another writeable event after throttling until disconnect?

If you don't get a writeable event then you aren't going to send more data and its not going to ever be unthrottled. Can you verify that when you get throttled, that the write event is resubscribed?

It's possible there is a race condition with one shot, we've made additional changes to one shot handling at Sendence to address possible race conditions. We haven't yet submitted them back upstream yet as they are still undergoing testing.

Can you clarify what "buffer" you are referring to in:

"But the trouble is that I never get the 'unthrottled' notification when it does empty the buffer"

@pdtwonotes
Copy link
Contributor Author

I have tracing on the event arrivals, and I do not see a 'wriateable' showing up. I will put tracing in on the re-subscription calls too. The "buffer" I was referring to there is the _pending queue of unsent data. Once its backlog drops below a threshold (16 I think) there is supposed to be a unthrottle call, but that never happens. It might be that the pending queue is in fact not emptying.

The symptom on the receiving side is that some of the data arrives, then it hangs. The data is a PNG file of about 64KB. I send it 15KB at a time.

@pdtwonotes
Copy link
Contributor Author

When the connection first comes in there is a resub call with flags=0x0101 (read + oneshot). After I write a few chunks I see the "length written was less that submitted" conidtion, which triggers a 'throttled'. No more resubscriptions until I break the connection. I never see it set the writable ASIO bit.

Then when I break the connection, and it is processing that event, it does some more resubs but by then it is too late.

@SeanTAllen
Copy link
Member

SeanTAllen commented Jan 29, 2017

The threshold on Linux is 0. Once the buffer is cleared. Windows is different. You appear to be mixing Linux and Windows code together in your analysis.

That said, I have figured out where the bug is.
We need to consider resubscribing after any write event as you might have gone from writeable to unwriteable. As a short term fix you can call _resubscribe_event at the end of write and writev.
That should take care of the problem.

We have several improvements to one shot handling that should be landing soon that would make that change not needed.

@pdtwonotes
Copy link
Contributor Author

Yes, I was looking at the wrong place - Windows and Linux use different unthrottle thresholds. But since the events were not arriving in the first place, it doesn't matter.
Inserting the extra calls to _resubscribe_event did fix the problem.

@jemc
Copy link
Member

jemc commented Jan 29, 2017

In addition to fixing this bug with the added call to _resubscribe_event, can a test be added to the standard library test suite that opens a pair of TCP sockets and proves the correct behaviour? This seems too subtle (varying across platforms) and too important to not have some kind of automated test.

@SeanTAllen
Copy link
Member

you would need to force a throttling scenario which is already quite hard to do. its going to be an unreliable test. very unreliable.

@pdtwonotes
Copy link
Contributor Author

The complication is that a TCP connection with both ends on the same machine is so fast that throttling hardly ever happens. I test across multiple machines of varying speed, over a wifi link, which is how I found this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants