[RFC] Deadline setting performance #1234

Closed
Tronic wants to merge 2 commits

Conversation

@Tronic (Contributor) commented Oct 6, 2019

Experimenting with optimised deadline setting. The initial commit increased HTTP server performance from 7700 req/s to 8800 req/s, but it is also likely to break something and may add latency elsewhere.

The approach taken is to no longer keep deadlines in a sorted dict, but to only track the earliest deadline. The earliest deadline is recalculated by traversing all deadlined cancel scopes whenever the old deadline expires, which usually also involves cancelling a scope.

Not ready for merging; I intend to keep working on this. Posted here for comments.
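
For illustration only, here is a rough sketch of the idea described above, not the actual diff in this PR: the DeadlineTracker class and its method names are made up for this example, and scopes are assumed to expose a deadline attribute and a cancel() method.

import math

class DeadlineTracker:
    """Illustrative sketch: track only the earliest deadline and rescan
    every deadlined cancel scope when it expires."""

    def __init__(self):
        self._scopes = set()           # all cancel scopes with a finite deadline
        self._next_deadline = math.inf

    def add(self, scope):
        # Registering or tightening a deadline is a single comparison, O(1).
        self._scopes.add(scope)
        self._next_deadline = min(self._next_deadline, scope.deadline)

    def remove(self, scope):
        # The cached earliest deadline may become stale, but it remains a
        # lower bound, so it is safe to keep until the next expiry.
        self._scopes.discard(scope)

    def expire(self, now):
        # Called when `now` reaches the cached earliest deadline: cancel
        # expired scopes and recompute the minimum with an O(n) scan.
        if now < self._next_deadline:
            return
        next_deadline = math.inf
        for scope in list(self._scopes):
            if scope.deadline <= now:
                scope.cancel()
                self._scopes.discard(scope)
            else:
                next_deadline = min(next_deadline, scope.deadline)
        self._next_deadline = next_deadline

With this scheme, setting or changing a deadline becomes O(1), at the cost of an O(n) rescan every time a deadline actually expires.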

codecov bot commented Oct 6, 2019

Codecov Report

Merging #1234 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #1234      +/-   ##
==========================================
+ Coverage   99.57%   99.57%   +<.01%     
==========================================
  Files         106      106              
  Lines       12823    12832       +9     
  Branches      984      989       +5     
==========================================
+ Hits        12768    12777       +9     
  Misses         35       35              
  Partials       20       20
Impacted Files       Coverage Δ
trio/_core/_run.py   99.73% <100%> (ø) ⬆️

@njsmith (Member) commented Oct 6, 2019

I'm really excited you're working on this :-). And that speedup sounds impressive!

As I mentioned on Gitter, the main thing I want to watch out for here is to make sure we don't introduce any new pathological cases that people might hit if they're unlucky. The algorithm here does great if deadlines never expire; but every time a deadline does expire, it takes O(total deadlines) time to re-scan the list of pending deadlines. So if you have a program with lots of deadlines that do expire, the overall cost becomes O(total deadlines ** 2), which can get very slow.

Here's a little test program to demonstrate – it spawns 10k background tasks that sleep for a long time, plus 1 task that does lots of short sleeps:

import trio
import trio.testing
import time

print(trio.__file__)

LONG_SLEEP_COUNT = 10000
SHORT_SLEEP_COUNT = 10000

async def deep_sleeper():
    # Background task that parks on a single far-off deadline.
    await trio.sleep(99999)

async def heavy_sleeper():
    # The one task whose deadlines actually expire, over and over.
    for _ in range(SHORT_SLEEP_COUNT):
        await trio.sleep(1e-6)

async def main():
    async with trio.open_nursery() as nursery:
        print("Starting up")
        for _ in range(LONG_SLEEP_COUNT):
            nursery.start_soon(deep_sleeper)
        print("Running")
        start = time.process_time()
        await heavy_sleeper()
        end = time.process_time()
        print(f"Running portion took: {end - start:.2f} s CPU time")
        print("Shutting down")
        nursery.cancel_scope.cancel()

# Use MockClock to make things deterministic
clock = trio.testing.MockClock(autojump_threshold=0)
trio.run(main, clock=clock)

With Trio v0.12.1, I get:

~/trio$ time python /tmp/s.py        
/home/njs/.user-python3.7/lib/python3.7/site-packages/trio/__init__.py
Starting up
Running
Running portion took: 1.44 s CPU time
Shutting down
python /tmp/s.py  2.61s user 0.05s system 99% cpu 2.666 total

With this branch, I get:

~/trio$ time PYTHONPATH=. python /tmp/s.py
/home/njs/trio/trio/__init__.py
Starting up
Running
Running portion took: 12.94 s CPU time
Shutting down
PYTHONPATH=. python /tmp/s.py  13.94s user 0.08s system 99% cpu 14.018 total

So it's about 5-10x slower, and with more background tasks the difference would get more extreme.
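
As a rough back-of-envelope check: each of the 10,000 expiring short sleeps triggers a rescan of the roughly 10,000 long-lived deadlines, i.e. on the order of 10^8 scope visits in total, which is in the right ballpark for the extra ~11 s of CPU time seen here.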

This case isn't necessarily common, but it's not totally implausible either. (E.g., imagine a sanic server holding a bunch of idle websocket connections that each have some far-distant timeout, plus some task somewhere in the same process that does short sleeps for some reason.) So I could be convinced that it's a good tradeoff to optimize for more common cases even if they make a case like this slower... but usually that should be like, a 20% slowdown, not a 10x slowdown :-).

@njsmith (Member) commented Oct 6, 2019

Hmm, that benchmark might somewhat overestimate the impact, because the autojump clock causes more lookups of the next-expiring-deadline than you would have otherwise. But it's only a factor of 2 at most.

@Tronic (Contributor, Author) commented Dec 28, 2019

Closing in favour of a cleaner implementation in #1351 (which, however, does not use this method and does not yet achieve any speedup).

@Tronic closed this Dec 28, 2019