Improve fairness of endpoint task #1119

Merged
merged 2 commits into quinn-rs:main from Matthias247:hog on May 17, 2021

Conversation

Matthias247
Contributor

This contains 2 commits which in combination try to prevent the endpoint task from using too many executor resources while keeping performance the same or even improving on it.

The performance difference varies a bit between the machines I tested. On one there is no change, on another an increase in throughput:

Before:

Sent 10737418240 bytes on 1 streams in 22.50s (455.04 MiB/s)

After:

Sent 10737418240 bytes on 1 streams in 21.92s (467.06 MiB/s)

Ralith previously approved these changes May 13, 2021
Review comment on quinn/src/lib.rs (outdated, resolved)
The endpoint driver currently uses different behavior for transmits and receives:
- For receives, it allows a certain number of datagrams to be received per Endpoint iteration.
- For transmits, it allows a certain number of transmit calls to be issued per Endpoint iteration.
  Each transmit call can carry up to `BATCH_SIZE` transmits, which might contain even more datagrams due to GSO.

This change unifies the behavior: the bound now always limits the number of datagrams rather than the number of `sendmsg`/`sendmmsg`/`recvmsg`/`recvmmsg` calls. Given that a `sendmmsg` call for N datagrams is still roughly N times as expensive as sending a single datagram, this makes sense.

The overall `IO_LOOP_BOUND` was adjusted to accommodate the new behavior.
`BATCH_SIZE` was modified to
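A minimal sketch of the unified accounting described above, under assumed names and values (`process_datagrams` and the `IO_LOOP_BOUND` value here are placeholders, not quinn's actual internals). The budget counts individual datagrams, so a single `recvmmsg`/`sendmmsg` call that moves many datagrams consumes many units of the budget:

```rust
const IO_LOOP_BOUND: usize = 160; // illustrative value, not quinn's actual constant

/// Processes batches of datagrams (one batch per sendmmsg/recvmmsg call) until
/// either the socket is drained or the per-iteration datagram budget is spent.
/// Returns `true` if the budget was exhausted and more work may remain.
fn process_datagrams(mut next_batch: impl FnMut() -> Option<Vec<Vec<u8>>>) -> bool {
    let mut processed = 0;
    while let Some(batch) = next_batch() {
        // Count every datagram in the batch, not just the syscall itself.
        processed += batch.len();
        for _datagram in &batch {
            // ... hand the datagram to the endpoint ...
        }
        if processed >= IO_LOOP_BOUND {
            return true; // budget spent; yield so other tasks can run
        }
    }
    false // socket drained before the budget was hit
}
```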
As long as there is data to send or receive, the `Endpoint` `Future` currently continues to execute and thereby retains the event loop.

In order to allow other code to execute in between, we yield back to the executor after each iteration. Instead of doing multiple iterations in one `EndpointDriver::poll` call, we can just increase the maximum number of packets to send or receive in a single iteration and pick a number which optimizes performance. Therefore the loop gets completely removed.
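A hedged sketch of that pattern, with the actual send/receive work stubbed out (the `drive_io` helper is a placeholder, not quinn's API): the driver performs a single bounded iteration per `poll`, and if the budget was hit it wakes itself and returns `Pending` so the executor can run other tasks before the next iteration.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

struct EndpointDriver; // endpoint state elided

impl EndpointDriver {
    /// Stand-in for the real send/receive work; returns `true` if the
    /// per-iteration datagram budget was exhausted and more work remains.
    fn drive_io(&mut self, _cx: &mut Context<'_>) -> bool {
        false
    }
}

impl Future for EndpointDriver {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Self::Output> {
        // One bounded iteration per poll; no inner loop.
        let more_work = self.drive_io(cx);
        if more_work {
            // Re-schedule ourselves and yield back to the executor so other
            // tasks get a chance to run before the next iteration.
            cx.waker().wake_by_ref();
        }
        Poll::Pending
    }
}
```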
Ralith merged commit fa0ac1f into quinn-rs:main May 17, 2021
@djc
Member

djc commented May 17, 2021

Sorry for not reviewing this sooner -- I was on holiday for two weeks. The reason I hadn't approved yet was that I was wondering about handle_events(). As I understand it, it is not currently limited by IO_LOOP_BOUND. Should we worry about a scenario where there are a lot of events to handle and we get stuck inside handle_events() for too long? I guess that's pretty much orthogonal to this PR, but still wanted to mention it.

(Otherwise, and of course especially since this is orthogonal to the PR, this makes a lot of sense!)

Matthias247 deleted the hog branch May 17, 2021 20:03
@Matthias247
Contributor Author

> Should we worry about a scenario where there are a lot of events to handle and we get stuck inside handle_events() for too long? I guess that's pretty much orthogonal to this PR, but still wanted to mention it.

Yes, I think it also makes sense to limit those interactions. However, it won't be an issue with a single-threaded runtime, since connections can't produce new events while the endpoint task is running, so it's guaranteed to terminate. The IO part was riskier since it gets events from the outside, and a peer sending data fast enough could prevent the endpoint task from ever yielding.

However, for better performance with multithreaded runtimes it could make sense to limit the number of events handled in handle_events(). As you said, it's kind of orthogonal.
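A minimal sketch of what such a limit could look like, under assumed names (`handle_events_bounded`, `MAX_EVENTS_PER_POLL`, and the placeholder event type are illustrative, not quinn's actual code):

```rust
use std::collections::VecDeque;

struct EndpointEvent; // placeholder for the connection-to-endpoint event type

const MAX_EVENTS_PER_POLL: usize = 64; // illustrative budget

/// Drains at most `MAX_EVENTS_PER_POLL` events per call and returns `true` if
/// more remain, so the caller can wake itself and yield instead of looping on.
fn handle_events_bounded(events: &mut VecDeque<EndpointEvent>) -> bool {
    for _ in 0..MAX_EVENTS_PER_POLL {
        match events.pop_front() {
            Some(_event) => {
                // ... apply the event to the endpoint state ...
            }
            None => return false, // queue drained; nothing left to do
        }
    }
    !events.is_empty()
}
```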

@Ralith
Collaborator

Ralith commented May 17, 2021

The expected rate of endpoint events is much lower than the expected rate of incoming/outgoing packets, FWIW. They're only raised during connection lifecycle events like handshaking, migration, and loss.

@Matthias247
Contributor Author

Endpoint::handle_events does not only handle endpoint-related events. It also receives all the outgoing messages from connections and enqueues them for transmission. This could theoretically lead to an unbounded outgoing queue.

Another potential issue I noticed: in drive_send, endpoint-related datagrams are only enqueued if not enough other packets are already enqueued. That could mean that on a busy endpoint those might never get sent. Probably that condition should just be dropped.
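A hypothetical illustration of that starvation risk (the types and function here are placeholders, not quinn's actual drive_send code):

```rust
use std::collections::VecDeque;

struct Datagram; // placeholder for an outgoing UDP datagram

/// Illustrates why the condition can starve endpoint datagrams: if connection
/// traffic keeps the queue at or above `batch_size`, the endpoint's own
/// datagrams are never added.
fn queue_transmits(
    outgoing: &mut VecDeque<Datagram>,
    connection_transmits: impl Iterator<Item = Datagram>,
    endpoint_transmits: impl Iterator<Item = Datagram>,
    batch_size: usize,
) {
    outgoing.extend(connection_transmits);

    // Starvation-prone variant: endpoint datagrams are only queued when the
    // buffer still has spare room.
    if outgoing.len() < batch_size {
        outgoing.extend(endpoint_transmits);
    }
    // Suggested fix per the comment above: drop the condition and always
    // extend `outgoing` with `endpoint_transmits`.
}
```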
