-
-
Notifications
You must be signed in to change notification settings - Fork 3.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Merged by Bors] - Skip empty archetypes and tables when iterating over queries #4724
Conversation
Code changes look fine to me. The proof, like always, is in the benchmark :) I'd like to see a new benchmark that looks at iteration performance vs the number of empty archetypes (0, 1k, 5k, 10k maybe?), to see where the break-even point is for this change. |
|
Done. |
Actual benchmark results here, and the differences here make plenty of sense. Tl;dr: it's really only beneficial to
|
Solid work benchmarking. Those numbers make sense to me. Can we do a more targeted change here then? |
There were a few errors with how the prior benchmarks were run. There were 10, 100, 1000, 2000, 5000, and 10000 archetypes made, but not all of them matched the query used in the system. Once all of the created archetypes matched the given query, it seems to show that all of the iteration methods benefit from this in a tangible way. The exception here seems to be with
|
I dont think its right to conclude this has no effect on normal |
Tried this and the results are much more pronounced.
|
This reverts commit 72884a9.
Is it feasible to cull empty archetypes, or re-order archetypes so that the full ones are always at the end (i.e. so we can make the check for emptyness just be 'reached the end of the archetypes list'? I imagine the challenge there is in ensuring that all the IDs are up-to-date, since they 'need' to be used for indexing - we'd probably need a re-mapping phase? I'm not sure how well it would even out. |
This is what the "valid_archetype_ids" naively attempted to do. We'd need to find an opportune time to inject this, since it seems like only |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The code itself seems to be correct. As for whether we should do this I will leave that to others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
code seems correct to me.
I don't really think the benchmarks shown give good reason to make the change though. the benches are completely artificial and skip every single archetype never actually doing anything and even then the perf difference is a regression in a fair amount of cases!
Revisiting this, it may make sense to add this only for the parallel case, where the work per non-empty archetype is definitively non-trivial since it needs to allocate for each task that is spawned and the overhead is significantly heavier when the task pools are under contention. |
I think this should be done before merging, which would make this non-controversial IMO. |
This was done before this comment. Should be ready to go. |
Description could use updating now, but I'm not going to block on that. bors r+ |
# Objective Speed up queries that are fragmented over many empty archetypes and tables. ## Solution Add a early-out to check if the table or archetype is empty before iterating over it. This adds an extra branch for every archetype matched, but skips setting the archetype/table to the underlying state and any iteration over it. This may not be worth it for the default `Query::iter` and maybe even the `Query::for_each` implementations, but this definitely avoids scheduling unnecessary tasks in the `Query::par_for_each` case. Ideally, `matched_archetypes` should only contain archetypes where there's actually work to do, but this would add a `O(n)` flat cost to every call to `update_archetypes` that scales with the number of matched archetypes. TODO: Benchmark
Pull request successfully merged into main. Build succeeded: |
…ine#4724) # Objective Speed up queries that are fragmented over many empty archetypes and tables. ## Solution Add a early-out to check if the table or archetype is empty before iterating over it. This adds an extra branch for every archetype matched, but skips setting the archetype/table to the underlying state and any iteration over it. This may not be worth it for the default `Query::iter` and maybe even the `Query::for_each` implementations, but this definitely avoids scheduling unnecessary tasks in the `Query::par_for_each` case. Ideally, `matched_archetypes` should only contain archetypes where there's actually work to do, but this would add a `O(n)` flat cost to every call to `update_archetypes` that scales with the number of matched archetypes. TODO: Benchmark
…ine#4724) # Objective Speed up queries that are fragmented over many empty archetypes and tables. ## Solution Add a early-out to check if the table or archetype is empty before iterating over it. This adds an extra branch for every archetype matched, but skips setting the archetype/table to the underlying state and any iteration over it. This may not be worth it for the default `Query::iter` and maybe even the `Query::for_each` implementations, but this definitely avoids scheduling unnecessary tasks in the `Query::par_for_each` case. Ideally, `matched_archetypes` should only contain archetypes where there's actually work to do, but this would add a `O(n)` flat cost to every call to `update_archetypes` that scales with the number of matched archetypes. TODO: Benchmark
…ine#4724) # Objective Speed up queries that are fragmented over many empty archetypes and tables. ## Solution Add a early-out to check if the table or archetype is empty before iterating over it. This adds an extra branch for every archetype matched, but skips setting the archetype/table to the underlying state and any iteration over it. This may not be worth it for the default `Query::iter` and maybe even the `Query::for_each` implementations, but this definitely avoids scheduling unnecessary tasks in the `Query::par_for_each` case. Ideally, `matched_archetypes` should only contain archetypes where there's actually work to do, but this would add a `O(n)` flat cost to every call to `update_archetypes` that scales with the number of matched archetypes. TODO: Benchmark
Objective
Speed up queries that are fragmented over many empty archetypes and tables.
Solution
Add a early-out to check if the table or archetype is empty before iterating over it. This adds an extra branch for every archetype matched, but skips setting the archetype/table to the underlying state and any iteration over it.
This may not be worth it for the default
Query::iter
and maybe even theQuery::for_each
implementations, but this definitely avoids scheduling unnecessary tasks in theQuery::par_for_each
case.Ideally,
matched_archetypes
should only contain archetypes where there's actually work to do, but this would add aO(n)
flat cost to every call toupdate_archetypes
that scales with the number of matched archetypes.TODO: Benchmark