Cluster small archetypes in parallel iteration #7303

Closed
james7132 opened this issue Jan 20, 2023 · 2 comments · Fixed by #12846
Labels
A-ECS: Entities, components, systems, and events
C-Performance: A change motivated by improving speed, memory usage or compile times

Comments

@james7132
Member

What problem does this solve or what need does it fill?

QueryParIter establishes a batch size to chunk matching archetypes and tables into tasks to run on. Any archetype or table smaller than the batch size ends up in its own task. This might help with work stealing, but it adds scheduling overhead.

What solution would you like?

Collect batches smaller than the provided batch size into an ArrayVec and dispatch them as a single task when either the combined count exceeds the batch size or the buffer is full.
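
A minimal sketch of that clustering, with hypothetical names standing in for Bevy's `QueryParIter` internals (`Batch`, `dispatch_batches`, and `MAX_CLUSTERED` are illustrative, not real Bevy APIs; a plain `Vec` stands in for the ArrayVec):

```rust
/// A contiguous run of rows from one matched table/archetype.
struct Batch {
    table_id: usize,
    start: usize,
    len: usize,
}

/// `storages` holds the row count of each matched table/archetype.
fn dispatch_batches(
    storages: &[usize],
    batch_size: usize,
    spawn_task: &mut impl FnMut(Vec<Batch>),
) {
    const MAX_CLUSTERED: usize = 32; // stand-in for the ArrayVec capacity
    let mut pending: Vec<Batch> = Vec::with_capacity(MAX_CLUSTERED);
    let mut pending_len = 0;

    for (table_id, &len) in storages.iter().enumerate() {
        if len >= batch_size {
            // Large storages are chunked into their own tasks, as today.
            let mut start = 0;
            while start < len {
                let chunk = batch_size.min(len - start);
                spawn_task(vec![Batch { table_id, start, len: chunk }]);
                start += chunk;
            }
        } else {
            // Small storages accumulate until the combined count reaches
            // the batch size or the buffer is full, then flush as one task.
            pending.push(Batch { table_id, start: 0, len });
            pending_len += len;
            if pending_len >= batch_size || pending.len() == MAX_CLUSTERED {
                spawn_task(std::mem::take(&mut pending));
                pending_len = 0;
            }
        }
    }
    // Flush whatever small storages remain.
    if !pending.is_empty() {
        spawn_task(pending);
    }
}
```

The point of the buffering is that many small tables cost one task spawn instead of one each, so the per-task scheduling overhead is amortized across the cluster.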

What alternative(s) have you considered?

Eating the performance cost and leaving it as is.

james7132 added the A-ECS and C-Performance labels and removed the C-Feature and S-Needs-Triage labels on Jan 20, 2023
@hymm
Contributor

hymm commented Jan 20, 2023

I'm currently working on a change that does something similar.

Rayon uses a significantly different algorithm for generating batches than we do. It splits the data in half and sends one half to a new task, then checks whether the remaining half is now smaller than the batch size: it iterates over it if so, and recurses if not.
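
A sketch of that split-in-half scheme, using `std::thread::scope` as a stand-in for Bevy's task pool (this is an illustration of the recursion, not Rayon's or Bevy's actual implementation):

```rust
use std::thread;

fn par_for_each<T, F>(items: &[T], batch_size: usize, f: &F)
where
    T: Sync,
    F: Fn(&T) + Sync,
{
    // Remaining half is small enough: iterate on it in place.
    if items.len() <= batch_size {
        for item in items {
            f(item);
        }
        return;
    }
    // Otherwise split in half, send one half to a new task,
    // and recurse on the other half on the current thread.
    let (left, right) = items.split_at(items.len() / 2);
    thread::scope(|scope| {
        scope.spawn(|| par_for_each(left, batch_size, f));
        par_for_each(right, batch_size, f);
    });
}
```

Because every spawned task keeps splitting its own half, the scheduling work is spread across all worker threads instead of being done up front by one thread.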

I'm pretty sure we can do something similar, and it'll be faster when we're bottlenecked on spawning tasks. Unfortunately my current work borked iteration speed, so I need to figure that out.

@james7132
Member Author

> Rayon uses a significantly different algorithm for generating batches than we do. It splits the data in half and sends one half to a new task, then checks whether the remaining half is now smaller than the batch size: it iterates over it if so, and recurses if not.

This is a double-edged sword: it alleviates the burden of scheduling all of the tasks from one thread and shards the workload more fluidly, in exchange for either higher contention or higher startup latency. Given that Rayon seems to do better on compute-bound workloads than what we currently have, it might be worth trying out.

github-merge-queue bot pushed a commit that referenced this issue Apr 4, 2024
…12846)

# Objective

- Fix #7303
- Bevy would spawn a lot of tasks in parallel iteration when a query matches one large storage and many small storages, which significantly increases scheduling overhead.

## Solution

- Collect small storages into a single task.