Cluster small archetypes in parallel iteration #7303

Closed
james7132 opened this issue Jan 20, 2023 · 2 comments · Fixed by #12846
Labels
A-ECS: Entities, components, systems, and events
C-Performance: A change motivated by improving speed, memory usage or compile times

Comments

@james7132
Member

What problem does this solve or what need does it fill?

QueryParIter establishes a batch size to chunk matching archetypes and tables into tasks to run on. Any archetype or table smaller than the batch size ends up in its own task. This might help with work stealing, but it adds scheduling overhead.

What solution would you like?

Collect batches smaller than the provided batch size into an ArrayVec and dispatch them as a single task when either the combined count exceeds the batch size or the buffer is full.
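
A minimal sketch of that clustering, with hypothetical names standing in for Bevy's `QueryParIter` internals (`Batch`, `dispatch_batches`, and `MAX_CLUSTERED` are illustrative, not real Bevy APIs; a plain `Vec` stands in for the ArrayVec):

```rust
/// A contiguous run of rows from one matched table/archetype.
struct Batch {
    table_id: usize,
    start: usize,
    len: usize,
}

/// `storages` holds the row count of each matched table/archetype.
fn dispatch_batches(
    storages: &[usize],
    batch_size: usize,
    spawn_task: &mut impl FnMut(Vec<Batch>),
) {
    const MAX_CLUSTERED: usize = 32; // stand-in for the ArrayVec capacity
    let mut pending: Vec<Batch> = Vec::with_capacity(MAX_CLUSTERED);
    let mut pending_len = 0;

    for (table_id, &len) in storages.iter().enumerate() {
        if len >= batch_size {
            // Large storages are chunked into their own tasks, as today.
            let mut start = 0;
            while start < len {
                let chunk = batch_size.min(len - start);
                spawn_task(vec![Batch { table_id, start, len: chunk }]);
                start += chunk;
            }
        } else {
            // Small storages accumulate until the combined count reaches
            // the batch size or the buffer is full, then flush as one task.
            pending.push(Batch { table_id, start: 0, len });
            pending_len += len;
            if pending_len >= batch_size || pending.len() == MAX_CLUSTERED {
                spawn_task(std::mem::take(&mut pending));
                pending_len = 0;
            }
        }
    }
    // Flush whatever small storages remain.
    if !pending.is_empty() {
        spawn_task(pending);
    }
}
```

The point of the buffering is that many small tables cost one task spawn instead of one each, so the per-task scheduling overhead is amortized across the cluster.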

What alternative(s) have you considered?

Eating the performance cost and leaving it as is.

james7132 added the A-ECS and C-Performance labels and removed the C-Feature and S-Needs-Triage labels on Jan 20, 2023
@hymm
Contributor

hymm commented Jan 20, 2023

I'm currently working on a change that does something similar.

Rayon uses a significantly different algorithm for generating batches than we do. It splits the data in half and sends one half to a new task, then checks whether the remaining half is now smaller than the batch size: it iterates over it if so, and recurses if not.
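
A sketch of that split-in-half scheme, using `std::thread::scope` as a stand-in for Bevy's task pool (this is an illustration of the recursion, not Rayon's or Bevy's actual implementation):

```rust
use std::thread;

fn par_for_each<T, F>(items: &[T], batch_size: usize, f: &F)
where
    T: Sync,
    F: Fn(&T) + Sync,
{
    // Remaining half is small enough: iterate on it in place.
    if items.len() <= batch_size {
        for item in items {
            f(item);
        }
        return;
    }
    // Otherwise split in half, send one half to a new task,
    // and recurse on the other half on the current thread.
    let (left, right) = items.split_at(items.len() / 2);
    thread::scope(|scope| {
        scope.spawn(|| par_for_each(left, batch_size, f));
        par_for_each(right, batch_size, f);
    });
}
```

Because every spawned task keeps splitting its own half, the scheduling work is spread across all worker threads instead of being done up front by one thread.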

I'm pretty sure we can do something similar, and it'll be faster when we're bottlenecked on spawning tasks. Unfortunately my current work borked iteration speed, so I need to figure that out.

@james7132
Member Author

> Rayon uses a significantly different algorithm for generating batches than we do. It splits the data in half and sends one half to a new task, then checks whether the remaining half is now smaller than the batch size: it iterates over it if so, and recurses if not.

This is a double-edged sword: it alleviates the burden of scheduling all of the tasks from one thread and shards the workload more fluidly, in exchange for either higher contention or higher startup latency. Given that Rayon seems to do better on compute-bound workloads than what we currently have, it might be worth trying out.

github-merge-queue bot pushed a commit that referenced this issue Apr 4, 2024
…12846)

# Objective

- Fix #7303
- Bevy would spawn a lot of tasks in parallel iteration when a query matches one large storage and many small storages, which significantly increases scheduling overhead.

## Solution

- Collect small storages into a single task.