Replies: 1 comment 2 replies
-
I think it's a good idea, but it requires a more complex solution, which is why I haven't added it yet. Your implementation would chunk the processes into equal parts and run them part by part. Say you chunk by 10: you have to wait until all 10 processes finish before the next 10 are scheduled. So instead of chunking, we'd need to monitor how many active processes there are, and create new ones based on that number. In that case, though, why not use https://github.com/spatie/async, which has all of this functionality built in?
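To make the scheduling difference concrete, here is a small simulation (a language-agnostic sketch in Python; the function names are hypothetical, not part of any library discussed here). It compares the total wall time of chunked scheduling, where each chunk waits for its slowest task, against a pool that starts a new task the moment any running one finishes:

```python
import heapq

def chunked_makespan(durations, limit):
    """Chunked scheduling: run `limit` tasks, wait for ALL of them to
    finish, then start the next chunk. Each chunk costs its slowest task."""
    total = 0
    for i in range(0, len(durations), limit):
        total += max(durations[i:i + limit])
    return total

def pool_makespan(durations, limit):
    """Pool scheduling: keep `limit` tasks active; whenever one finishes,
    immediately start the next. Simulated with a min-heap of finish times."""
    active = []  # finish times of the currently running tasks
    for d in durations:
        if len(active) == limit:
            freed_at = heapq.heappop(active)  # earliest-finishing task frees a slot
            d += freed_at                     # new task starts at that moment
        heapq.heappush(active, d)
    return max(active)

# One slow task per chunk of four: chunking pays for the slow task twice in
# a row, while the pool keeps the other slots busy alongside it.
durations = [5, 1, 1, 1, 5, 1, 1, 1]
print(chunked_makespan(durations, 4))  # 10
print(pool_makespan(durations, 4))     # 6
```

The gap grows with the variance of task durations: in the chunked scheme every chunk is only as fast as its slowest member.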
-
I am experimenting with implementing this in a few of our data processing pipelines. This occasionally means processing 12+ million records. Obviously I can't just fork 12 million processes. This led me to send a PR (#5) adding support for a concurrency limit, which solves part of the problem (and I thought others would find it useful).
The last hurdle is generating the callables to send to `run`. On larger sets of tasks, it isn't really feasible to load all the data needed into memory and generate the closures up front.

My thought is that if we could pass an iterable to `run` (or to a new method), we could use generators to produce the callables to be run. Combined with my PR mentioned above, this would theoretically allow arbitrarily many tasks to be run (until the generator is exhausted). I wanted to put the idea out there for discussion before I send a PR. Thoughts?
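The proposed pattern can be sketched as follows (in Python rather than PHP, purely for illustration; `run_bounded` and `tasks` are hypothetical names, not the library's API). A generator yields callables lazily, and the runner keeps at most `limit` of them in flight, so only `limit` closures are ever materialised in memory regardless of how many tasks the generator can produce:

```python
import itertools
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from functools import partial

def run_bounded(callables, limit):
    """Drain an iterable of callables with at most `limit` in flight.
    New callables are pulled from the iterable only as slots free up."""
    it = iter(callables)
    results = []
    with ThreadPoolExecutor(max_workers=limit) as pool:
        # Prime the pool with the first `limit` callables.
        pending = {pool.submit(fn) for fn in itertools.islice(it, limit)}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            results.extend(f.result() for f in done)
            # Refill: pull exactly as many new callables as just finished.
            pending.update(pool.submit(fn) for fn in itertools.islice(it, len(done)))
    return results

def tasks(n):
    """Generator of callables: n can be huge, since nothing is built eagerly."""
    for i in range(n):
        yield partial(lambda x: x * x, i)

print(sorted(run_bounded(tasks(10), limit=3)))  # [0, 1, 4, 9, ..., 81]
```

The sketch uses threads so it runs anywhere; in the PHP library the workers would be forked processes instead, but the flow control (prime, wait for one, refill from the generator) is the same idea.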