
Is anyone maintaining this project? #6

Closed
goodboy opened this issue Feb 18, 2020 · 10 comments

Comments

goodboy commented Feb 18, 2020

I've created a couple of issues now without any feedback, and I'm wondering if I should just fork and move forward.

I would much prefer to work with the original authors if possible 👍

@pipermerriam (Member) commented:

My apologies for not getting back to your issues quickly. I've seen them coming in but kept not making the time.

Some background on this library.

  1. It is something I want and intend to take out of the alpha stage.
  2. It was my first stab at this pattern, doing process isolation in an async-friendly way (and explicitly not using multiprocessing).
  3. It was developed to directly serve a need we have in a separate codebase which straddles asyncio and trio but is trying to migrate much more heavily towards trio if not entirely. It is currently dominated by asyncio.

This library has a sister, asyncio-run-in-process, which ended up getting more of my attention. It has a lot more polish and generally addresses a bunch of things that I didn't get quite right in this library.

My goal is to bring the two together into a single library. I only just barely got started on that effort here: https://github.com/ethereum/aio-run-in-process

So, here is what I have to offer.

  1. I'll review your pull requests here, but probably won't be able to do anything for another 1.5 weeks due to some holiday scheduling and general busyness. I'm likely to be conservative about adding features, as I don't want this library to diverge very far from its asyncio-run-in-process twin, because I still want them to be brought together (see point 3).
  2. You can fork this and run with it on your own, and I'll hand over the PyPI name, since I don't intend to use it going forward and I don't believe we're using it in any capacity today (I need to confirm this, and it will likely take me a few weeks to do my due diligence here).
  3. You can take up the torch in aio-run-in-process and help bring these two things together into a cohesive library that the broader Python async community can use to get off of multiprocessing. This is the route where I'll likely be the most engaged, because our interests will be aligned. It can begin as some copy/paste between the two libraries, just bringing the two codebases together into the same place and under the same package namespace, after which we can work together on deduplicating logic, etc.

Let me know what you'd like to do.

goodboy commented Feb 20, 2020

@pipermerriam no worries and thanks for getting back to me!

It was developed to directly serve a need we have in a separate codebase which straddles asyncio and trio but is trying to migrate much more heavily towards trio if not entirely. It is currently dominated by asyncio.

Which library? Just so I can keep an eye on it.

Option 3 sounds good to me, and I'd be interested in maintaining, testing, and refactoring where I can. I have an immediate need, which is good incentive. I'd be more than happy to take ownership of the trio half, as it were. Learning from what you have for asyncio will be very useful to me as well, because I was hoping to have some integration between the two frameworks, possibly via a streaming IPC protocol (instead of fighting with compatibility systems for event loops).

I do want to ping @agronholm because he has already done some serious work to bring asyncio and trio together in anyio. @agronholm has also mentioned to me that I should try porting tractor to anyio, but one holdup has been proper subprocess support across anyio's supported backends; I guess this aio-run-in-process endeavor might eventually satisfy that need as well.

One question: are you dead set on the run-in-process name? Not that I have anything better at the moment but I was thinking something a little more terse might be appealing.

I look forward to further discussion 👍

@chebee7i commented:

Hey all, has there been additional development anywhere?

goodboy commented May 27, 2020

Hey @chebee7i, unfortunately not quite yet, but I was just thinking about doing something today 😸.

I actually want to make sure we aren't duplicating work that's already included in anyio's new subprocess support on the v2 branch. @agronholm has already started building a cross-async-lib Process API.
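For reference, that API already makes backend-agnostic process launching pretty terse. Here's a minimal sketch assuming anyio's run_process helper; details may differ slightly on the v2 branch:

```python
import sys

import anyio


async def main():
    # the same call works whether anyio is driven by asyncio or trio
    result = await anyio.run_process(
        [sys.executable, "-c", "print('hello from the child')"]
    )
    print(result.stdout.decode())


anyio.run(main)                    # asyncio backend by default
anyio.run(main, backend="trio")    # identical code on trio
```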

I think the easiest path forward is:

  1. get this code merged into aio-run-in-process, as per option 3 above from @pipermerriam
  2. move the issues from this repo over as well
  3. start the merge/de-duplicate process

@gsalgado (Contributor) commented:

@goodboy recently asyncio-run-in-process gained the ability to run trio code in the subprocess, so another alternative would be to use that as a base for aio-run-in-process.

goodboy commented May 27, 2020

@gsalgado good to know :)

I wonder, then, where y'all want to organize this?
I feel like there's a bunch of duplication between the parent-side process launching in this <blah>-run-in-process stuff and what anyio added for subproc support.

I really only need the trio option at the moment, with #4 fixed. I do eventually want support for both asyncio and trio in the child, but it's unclear to me whether this needs to duplicate anyio's existing, well-built-out compat layer. It also seems like the situation here, with what is now three repos, is a bit hard to follow.

goodboy commented May 27, 2020

@gsalgado @pipermerriam yeah, I gotta be honest: taking a brief look at the code in all three projects (trio-run-in-process, asyncio-run-in-process, the anyio v2 branch), it seems all that's needed is your child code (pipe messaging + pickling) plus a little startup protocol to tell the parent when the child is ready for signals and a payload. You could spawn the subprocs from anyio's generic process-launching API and save yourselves a ton of code while gaining portability.
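To make that concrete, here's roughly the child side I have in mind. The message framing and helper names are made up for illustration and aren't taken from either library:

```python
# hypothetical child bootstrap - illustrative only, not code from either repo
import os
import pickle
import struct
import sys


def read_msg(stream):
    # length-prefixed pickle frames over the inherited pipe / stdin
    size, = struct.unpack(">I", stream.read(4))
    return pickle.loads(stream.read(size))


def write_msg(stream, obj):
    data = pickle.dumps(obj)
    stream.write(struct.pack(">I", len(data)) + data)
    stream.flush()


def main():
    stdin, stdout = sys.stdin.buffer, sys.stdout.buffer

    # 1. startup protocol: tell the parent we're alive and ready for signals
    write_msg(stdout, {"status": "ready", "pid": os.getpid()})

    # 2. receive the pickled payload: an async function plus its arguments
    func, args = read_msg(stdin)

    # 3. run it under the requested backend and ship the pickled result back
    import trio  # or asyncio, depending on what the parent asked for
    result = trio.run(func, *args)
    write_msg(stdout, {"status": "done", "result": result})


if __name__ == "__main__":
    main()
```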

The work in asyncio-rip is already looking and feeling like a less rigorous version of anyio's compat layer. Using the anyio APIs would likely simplify _open_in_process(): the state tracking (which I'm not even sure I really get), the signal handling, and it would mean you don't have to implement your own version of anyio's task group, which already works on asyncio.

This whole state-tracking thing that gets relayed through the Process is very convoluted to me. Why not just use a small startup protocol over the pipes (or heck, the std streams, like anyio jokingly does in a test) where you pass a couple of messages back and forth before starting the wait/signal-relaying period(s) in the parent? It's sections like this that make me think the state tracking is probably unnecessary and, on the whole, more confusing to debug and read. Just looking at the asyncio state passing to the parent makes me very confused. Why does there need to be so much synchronization when the child is already waiting on pickled data to arrive? Considering the logic in the parent, there seems to be only one state message that's actually needed; the code between subproc invocation and receiving that one important child message seems superfluous. That is, you don't need:

  • a task to monitor a state sequence: just wait once on a message with a pid (oh look, you already do ;)) and on a final result
  • to relay signals via an asyncio.Queue later: just set up a regular signal handler that relays once the child has sent its startup "I'm ready" message to the parent
  • a separate task to send the child's payload over a pipe: just do that in _open_in_process() after you've received the child's pid message

Honestly, the whole state system feels like it's trying to do async subproc spawning while still synchronously waiting for startup: it's doing both the hard way, when you could just spawn the subproc synchronously.
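To be concrete, here's a parent-side sketch of that simpler flow, paired with the child sketch above and using anyio for the launching; again, the framing and helper names are hypothetical rather than anything from the existing codebases:

```python
# hypothetical parent side, paired with the child sketch above
import pickle
import signal
import struct
import sys

import anyio
from anyio.streams.buffered import BufferedByteReceiveStream


async def receive_msg(stream):
    size, = struct.unpack(">I", await stream.receive_exactly(4))
    return pickle.loads(await stream.receive_exactly(size))


async def send_msg(stream, obj):
    data = pickle.dumps(obj)
    await stream.send(struct.pack(">I", len(data)) + data)


async def open_in_process(func, *args):
    async with await anyio.open_process([sys.executable, "child.py"]) as proc:
        stdout = BufferedByteReceiveStream(proc.stdout)

        # one startup message instead of a tracked state sequence
        ready = await receive_msg(stdout)
        assert ready["status"] == "ready"

        # a plain signal handler that relays to the now-ready child
        signal.signal(signal.SIGINT, lambda signum, frame: proc.send_signal(signum))

        # send the payload and wait on the only message that matters
        await send_msg(proc.stdin, (func, args))
        done = await receive_msg(stdout)
        return done["result"]
```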

If I were to take on this "merging" of the two libs (asyncio/trio-run-in-process), it would probably go like this:

  • get rid of your Process API; anyio already has one, and the child state tracking shouldn't be coupled to it
  • use anyio for the async-lib compat layer and subprocess launching
  • port all the necessary subproc wrapper code (signal relays, pickling routines, "monitoring") to the anyio APIs; a signal-relay sketch follows this list. Note that you can also run whatever process-launching wrapper code is necessary from the respective async lib's .run()
  • get rid of (or at least vastly simplify) the child state-tracking system and just have the parent read from either pipes or std streams and wait on one (or two) startup messages to be notified of the child's readiness
  • the child program scripts can mostly be used as is, though they could probably also be simplified
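For the signal-relay piece specifically, porting to anyio might look something like this sketch, assuming a current anyio release's open_signal_receiver and task groups (the supervision wiring around it is hypothetical):

```python
import signal

import anyio


async def relay_signals(proc):
    # forward the usual termination signals to the child rather than
    # queueing them through a custom state-tracking layer
    with anyio.open_signal_receiver(signal.SIGINT, signal.SIGTERM) as signals:
        async for signum in signals:
            proc.send_signal(signum)


async def supervise(proc):
    async with anyio.create_task_group() as tg:
        tg.start_soon(relay_signals, proc)
        await proc.wait()
        tg.cancel_scope.cancel()  # stop relaying once the child has exited
```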

One more quibble with asyncio-run-in-process: why aren't you using asyncio.run() in the child executor? It might simplify your _handle_coro() func.
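Something along these lines, i.e. a hypothetical reduction rather than the actual _handle_coro() code, assuming the parent writes a pickled (func, args) pair to the child's stdin and then closes its end:

```python
import asyncio
import pickle
import sys


def child_main():
    # read the pickled async function and its arguments from the parent
    func, args = pickle.loads(sys.stdin.buffer.read())

    # asyncio.run() creates the loop, runs the coroutine to completion and
    # tears the loop down, which is most of what a hand-rolled helper
    # otherwise has to do itself
    result = asyncio.run(func(*args))

    sys.stdout.buffer.write(pickle.dumps(result))


if __name__ == "__main__":
    child_main()
```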

@pipermerriam (Member) commented:

It's very likely that asyncio-rip and trio-rip will end up getting combined into a single async-rip project that supports both asyncio and trio from both directions. The code is already all written for both; it just needs to be massaged into a single library.

I'm unlikely to contribute towards anyio, simply because the project is large and much more mature than the run-in-process codebases, and I still want room to explore and experiment without being subject to the development practices of a more mature project. That said, the code here is permissively licensed and I won't be bothered if someone wants to copy/pasta it.

One area of development that is still on the horizon is implementing something like a WorkerProcess, which keeps the process open and allows submission of multiple units of work, as well as a WorkerProcessPool, which manages multiple worker processes and distributes work among them. These would be focused on executing many smaller, shorter-lived units of work in separate processes.

goodboy commented May 27, 2020

I'm unlikely to contribute towards anyio simply because the project is large and much more mature than the run-in-process codebases and I still want room to explore and experiment without being subject to the development practices of a more mature project.

That's unfortunate, especially since all the hard work has been done. I won't personally be able to spend time on yet another async-lib compat layer when I need SC (structured concurrency) compatible process launching ASAP. anyio is already doing this using the async-lib-specific APIs, so I'll likely make a move towards that effort.

That said, the code here is permissively licensed and I won't be bothered if someone wants to copy/pasta it.

Great, I'll probably start a conversation with @agronholm, since he already has CPU-bound function support as a goal of agronholm/anyio#9.

One area of development that is still on the horizon is implementing something like a WorkerProcess...WorkerProcessPool

This is exactly the design pattern I'm trying to avoid at this abstraction layer. These goals likely don't align with my project either, since IMO this kind of work delivery should be built on top of a scalability protocol. On top of that, trio is already making design moves toward this end. I'll likely take this code, make the simplifications I've suggested (also somewhat repeated here), and then try my hand at using anyio's spawning abstractions to get moving.

Thanks for the heads up 👍

goodboy commented Jul 24, 2020

We've removed this as a dependency in goodboy/tractor#128

goodboy closed this as completed on Jul 24, 2020