Replies: 1 comment 2 replies
-
I think it's a good idea, but it requires a more complex solution, which is why I haven't added it yet. Your implementation would chunk the processes into equal parts and run them part by part. Say you chunk by 10: you have to wait until all 10 processes finish before the next 10 are scheduled. So instead of chunking, we'd need to monitor how many active processes there are, and create new ones based on that number. In that case, though, why not use https://github.com/spatie/async, which has all of this functionality built in?
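To make the scheduling difference concrete, here is a small simulation (a language-agnostic sketch in Python; the function names are hypothetical, not part of any library discussed here). It compares the total wall time of chunked scheduling, where each chunk waits for its slowest task, against a pool that starts a new task the moment any running one finishes:

```python
import heapq

def chunked_makespan(durations, limit):
    """Chunked scheduling: run `limit` tasks, wait for ALL of them to
    finish, then start the next chunk. Each chunk costs its slowest task."""
    total = 0
    for i in range(0, len(durations), limit):
        total += max(durations[i:i + limit])
    return total

def pool_makespan(durations, limit):
    """Pool scheduling: keep `limit` tasks active; whenever one finishes,
    immediately start the next. Simulated with a min-heap of finish times."""
    active = []  # finish times of the currently running tasks
    for d in durations:
        if len(active) == limit:
            freed_at = heapq.heappop(active)  # earliest-finishing task frees a slot
            d += freed_at                     # new task starts at that moment
        heapq.heappush(active, d)
    return max(active)

# One slow task per chunk of four: chunking pays for the slow task twice in
# a row, while the pool keeps the other slots busy alongside it.
durations = [5, 1, 1, 1, 5, 1, 1, 1]
print(chunked_makespan(durations, 4))  # 10
print(pool_makespan(durations, 4))     # 6
```

The gap grows with the variance of task durations: in the chunked scheme every chunk is only as fast as its slowest member.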
-
I am experimenting with implementing this in a few of our data processing pipelines. This occasionally means processing 12+ million records. Obviously I can't just fork 12 million processes. This led me to send a PR (#5) adding support for a concurrency limit, which solves part of the problem (and I thought others would find it useful).
The last hurdle is generating the callables to send to `run`. On larger sets of tasks, it isn't really feasible to load all the data needed into memory and generate the closures up front.

My thought is that if we could pass an iterable to `run` (or to a new method), we could use generators to produce the callables to be run. Combined with my PR mentioned above, this would theoretically allow arbitrarily many tasks to be run (until the generator is exhausted). I wanted to put the idea out there for discussion before I send a PR. Thoughts?
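The proposed pattern can be sketched as follows (in Python rather than PHP, purely for illustration; `run_bounded` and `tasks` are hypothetical names, not the library's API). A generator yields callables lazily, and the runner keeps at most `limit` of them in flight, so only `limit` closures are ever materialised in memory regardless of how many tasks the generator can produce:

```python
import itertools
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait
from functools import partial

def run_bounded(callables, limit):
    """Drain an iterable of callables with at most `limit` in flight.
    New callables are pulled from the iterable only as slots free up."""
    it = iter(callables)
    results = []
    with ThreadPoolExecutor(max_workers=limit) as pool:
        # Prime the pool with the first `limit` callables.
        pending = {pool.submit(fn) for fn in itertools.islice(it, limit)}
        while pending:
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            results.extend(f.result() for f in done)
            # Refill: pull exactly as many new callables as just finished.
            pending.update(pool.submit(fn) for fn in itertools.islice(it, len(done)))
    return results

def tasks(n):
    """Generator of callables: n can be huge, since nothing is built eagerly."""
    for i in range(n):
        yield partial(lambda x: x * x, i)

print(sorted(run_bounded(tasks(10), limit=3)))  # [0, 1, 4, 9, ..., 81]
```

The sketch uses threads so it runs anywhere; in the PHP library the workers would be forked processes instead, but the flow control (prime, wait for one, refill from the generator) is the same idea.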