
Generalized concurrency backend usage #217

Closed
wants to merge 9 commits into from
Conversation

florimondmanca
Member

@florimondmanca florimondmanca commented Aug 15, 2019

Base for #214

@sethmlarson
Contributor

We definitely need to have this before we can start implementing additional backends. Thanks for opening this @florimondmanca!

Contributor

@sethmlarson sethmlarson left a comment

Only had one comment on the current changes.

httpx/concurrency.py (review thread; outdated, resolved)
tests/conftest.py (review thread; resolved)
@sethmlarson
Contributor

So this is great, I've got a couple thoughts in my head right now:

Can we somehow create a backend that detects the async framework we're currently running in and then just uses the matching AsyncioBackend or TrioBackend? Then there's no need for users to even think about that. I know there's a library under the trio org that can do this.

Second, I'm feeling really strongly that we should package trio backend support in a contrib module and then have an extra for it. Easier for us to test, and easier for users since they won't have to track down another package to use trio: just python -m pip install httpx[trio], or, if they already have trio installed, it'll just work.

@florimondmanca
Member Author

@sethmlarson Agreed with all your points.

Backend auto-detection is a really nice feature to add to the backlog.

One thing I'm thinking about just now — I suppose the httpx API would also use the detected backend by default? If so, we'll probably still need a way to manually set the backend used in that case, e.g.

import httpx

httpx.backend = "asyncio"

I know there's a library under the trio org that can do this.

👉 sniffio
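For reference, a minimal stdlib-only sketch of the kind of detection sniffio performs (the function name mirrors sniffio's, but this stand-in only recognizes asyncio; sniffio itself also detects trio and curio):

```python
import asyncio

def current_async_library() -> str:
    """Best-effort, stdlib-only stand-in for sniffio's detection.

    Reports "asyncio" when called from inside a running asyncio event
    loop; sniffio additionally recognizes trio and curio.
    """
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        raise RuntimeError("not running inside a recognized async library")
    return "asyncio"

async def demo() -> str:
    # Called from within a running event loop, so detection succeeds.
    return current_async_library()
```

A backend-selection layer could call such a function once per request context and pick the matching backend class.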

… we should package trio backend support in a contrib module …

I'm inclined towards this as well, partly for increased control and ease of change on the backend interface.
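As a rough illustration of how a string setting like the `httpx.backend = "asyncio"` idea above could resolve to a backend, here is a hypothetical registry sketch (the class names mirror those mentioned in this thread; the registry and `load_backend` are assumptions, not httpx's actual API):

```python
class AsyncioBackend:
    """Placeholder for the asyncio-based concurrency backend."""

class TrioBackend:
    """Placeholder for a future trio-based concurrency backend."""

# Hypothetical registry mapping setting strings to backend classes.
BACKENDS = {"asyncio": AsyncioBackend, "trio": TrioBackend}

def load_backend(name: str):
    """Resolve a backend name to an instance, with a helpful error."""
    try:
        return BACKENDS[name]()
    except KeyError:
        valid = ", ".join(sorted(BACKENDS))
        raise ValueError(f"unknown backend {name!r}; expected one of: {valid}")
```

With something like this in place, a user-facing setting would only ever deal in strings, keeping backend classes out of the public API surface.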

@florimondmanca florimondmanca changed the title Use concurrency backend everywhere Generalized concurrency backend usage Aug 18, 2019
@florimondmanca
Member Author

@sethmlarson Alright, I completely reworked this PR.

The commit history is clean(er) now, so feel free to go through each commit diff. The main difference from the previous state is that concurrency has been turned into a sub-package (ready to welcome trio.py & others 🎉).

(This doesn't have backend auto-detection because it seems premature at this stage; we can easily add it once we have TrioBackend in.)

@florimondmanca florimondmanca marked this pull request as ready for review August 18, 2019 09:46
Contributor

@sethmlarson sethmlarson left a comment

This looks great, one tiny comment!

httpx/concurrency/asyncio.py (review thread; outdated, resolved)
@florimondmanca
Member Author

florimondmanca commented Aug 18, 2019

@sethmlarson I fixed the type hint. I also added a test for passing an ASGI app to AsyncClient, which ended up exposing a bug in how pending tasks were cleaned up in the AsyncioBackend background manager. :-)

@sethmlarson
Contributor

Nice! I'm taking a look now. :)

Contributor

@sethmlarson sethmlarson left a comment

One comment I had then this looks good to merge! 🎉

for task in pending:
try:
# Give it a chance to finish soon
await asyncio.wait_for(task, timeout=0.01)
Contributor

@sethmlarson sethmlarson Aug 18, 2019

If it actually does finish should we move the task to done?

Member Author

@florimondmanca florimondmanca Aug 18, 2019

Yup. Which made me realize we can simplify this by passing the small timeout to the first .wait() call instead. 👍 Edited my last commit.
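The simplification described here looks roughly like this (a sketch, not the exact httpx code; the grace-period value is illustrative):

```python
import asyncio

async def finish_or_cancel(tasks, grace: float = 0.01):
    # A single wait() with a short timeout replaces the per-task
    # wait_for() loop: tasks that finish within the grace period land
    # in `done`; whatever is still pending afterwards gets cancelled.
    done, pending = await asyncio.wait(tasks, timeout=grace)
    for task in pending:
        task.cancel()
    return done, pending
```

This avoids awaiting each task individually and naturally moves finished tasks into the `done` set, which was the concern raised in the comment above.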

@sethmlarson sethmlarson reopened this Aug 19, 2019
@sethmlarson
Contributor

No idea what's going on with our CI, it's building what looks to be the completely wrong commit and I can't get it to do anything else? Weird.

@florimondmanca
Member Author

florimondmanca commented Aug 19, 2019

Flake8 was reporting a NameError in a test file, and I believe that was the cause of this. (Still strange that Travis is not displaying the branch and latest commit on the build!)

@tomchristie
Member

tomchristie commented Aug 19, 2019

Great stuff!!!

I've got some thoughts on this:

  • I think it might not be a bad idea for us to start with a smaller PR that doesn't include the test parameterization (which is essentially stubbed out at this point in time, since it's all vs. asyncio in this pull request)
  • Rather than BodyIterator, the concurrency interface should just expose a Queue primitive, with the necessary get/put operations.
  • When we're using __enter__ and __exit__ we should stub out async start() and async close() methods, and have concrete pre-existing __enter__ and __exit__ methods (eg. see the dispatcher interface)

I've got other thoughts on how we might want to finesse, but I think we'd look at those as a follow on:

  • Improving the semaphore interface to be more of a standard "give me a semaphore" rather than being coupled into our PoolLimits configuration.
  • Simplifying the stream interface. (I reckon we probably want to merge the reader/writer stuff into a single Stream class.)
  • We might want to think about pulling the concurrency interfaces into eg. concurrency/interfaces.py, given the highly specific technical nature of why we have it, and the fairly large API surface area it exposes. (Ie. we don't really want general user implementations here, except for very specific cases eg. a trio backend, a curio backend. Which is pretty different to eg. the dispatcher interface, which is a very public bit of API.)
  • Review any places we're using timeouts and see if we ought to be doing a better job with how those fit in to the concurrency backend interfaces.
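The first follow-on bullet would amount to something like the following for the asyncio case (a hypothetical sketch; the function name is an assumption):

```python
import asyncio

def create_semaphore(max_value: int) -> asyncio.Semaphore:
    # Decoupled "give me a semaphore" interface: callers pass a plain
    # integer bound instead of a PoolLimits configuration object.
    return asyncio.Semaphore(max_value)

async def demo() -> bool:
    sem = create_semaphore(1)
    async with sem:
        # With a bound of 1 and the permit held, the semaphore
        # reports itself as locked.
        return sem.locked()
```

Keeping configuration objects out of the primitive-creation API makes each backend's surface smaller and easier to port.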

@tomchristie
Member

Also aware that I'm being a bit rigorous w.r.t. asking for a smaller-footprint PR here first, with the parameterization coming afterwards.

If y'all think I'm being overzealous there, then potentially happy to re-consider there. 😃

@florimondmanca
Member Author

@tomchristie I’m in favor of small PRs as well, and you’re right that test parametrization is very much YAGNI-worthy here. We could add it along with the trio backend (although that particular PR will be >400 LOC already).

I’ll look into your other suggestions soon and update this PR. :-)

@tomchristie
Member

We could add it along with the trio backend

Either that, or add it incrementally so that it's nicely isolated in its own PR.

I think we're probably still undecided on whether trio support should be built-in or third-party, right? (Both are feasible avenues to my mind.) In either case, the main thing right now is nailing down a decent concurrency-backend API and making sure that we're actually using it everywhere.

@sethmlarson
Contributor

sethmlarson commented Aug 19, 2019

Here are my thoughts for this PR, let me know what you think:

  • Keep the move from concurrency.py -> concurrency/asyncio.py
  • Split the interfaces.py related to concurrency into concurrency/interfaces.py as mentioned.
  • Keep the changes to the concurrency interface that don't have comments (create_event(), etc)

Future PRs (in order of priority?):

  • Change create_semaphore() to not use PoolLimits but instead be a regular semaphore
  • Change from a reader/writer to a stream interface (This change would benefit my start_tls() work within proxies!)
  • Changes to allow for parametrize testing, add backend to A/WSGIDispatchers
  • Add Trio backend implementation, implement parameterized testing in the same PR cuz it makes sense.

Thoughts?

@florimondmanca
Member Author

florimondmanca commented Aug 19, 2019

I’d be okay with reworking the PR towards that, except:

add backend to A/WSGIDispatchers

This needs to stick in here IMO, assuming the goal of this PR is to finalize the asyncio abstraction (i.e. use backends everywhere, except in tests for now).

Fun fact about the stream API bit: this is actually how trio does things, i.e. opening a connection exposes a single stream object (instead of a reader/writer pair) that allows sending and receiving data. Overall I think it would indeed be a simpler abstraction to deal with.
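A rough sketch of what such a single Stream could look like when wrapped over asyncio's reader/writer pair (method names loosely follow trio's conventions and are assumptions, not the eventual httpx API):

```python
import asyncio

class Stream:
    """One object exposing both directions of a connection, instead of
    a separate (reader, writer) pair. Sketch for the asyncio case."""

    def __init__(self, reader: asyncio.StreamReader,
                 writer: asyncio.StreamWriter) -> None:
        self._reader = reader
        self._writer = writer

    async def receive(self, max_bytes: int = 65536) -> bytes:
        # Returns b"" at EOF, mirroring StreamReader.read().
        return await self._reader.read(max_bytes)

    async def send(self, data: bytes) -> None:
        self._writer.write(data)
        await self._writer.drain()

    async def close(self) -> None:
        self._writer.close()
        await self._writer.wait_closed()
```

A trio backend would implement the same three methods directly on top of a `trio.abc.Stream`, which is what makes the single-object shape attractive.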

@tomchristie
Member

This needs to stick in here IMO

Indeed, yup

@sethmlarson
Contributor

Oh yeah leave that in then!

@florimondmanca
Member Author

florimondmanca commented Aug 20, 2019

I think each item listed by both of you could be its own PR, which would be much easier to review and discuss independently, so I started rolling out incremental PRs, e.g. with #244.

I'm closing this if that's OK. :-)
