See if removing a vec allocation in animation code improves performance #11328
To quote myself
I think I might have been misunderstood. The code in question is: `animation.path_cache = vec![Vec::new(); animation_clip.paths.len()];` In this, we have to (2) doesn't allocate, while (1) will probably. Replacing the code by:
`let new_len = animation_clip.paths.len();`
`animation.path_cache.clear();`
`animation.path_cache.extend((0..new_len).map(|_| Vec::new()));`
will likely result in better code generation. A more "rusty" approach would use
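For comparison, here is a minimal sketch of the two variants discussed in this comment. `Entity` is a hypothetical `u32` stand-in and the function names are made up for illustration; this is not Bevy's actual code.

```rust
// Hypothetical stand-in for Bevy's `Entity`; the real type is not needed here.
type Entity = u32;

/// Variant from the original code: builds a brand-new outer `Vec`.
/// The old outer buffer and every old row are dropped.
fn reset_with_vec_macro(path_cache: &mut Vec<Vec<Option<Entity>>>, new_len: usize) {
    *path_cache = vec![Vec::new(); new_len];
}

/// Variant suggested in the comment: `clear` drops the old rows but keeps
/// the outer buffer, so `extend` can reuse it when `new_len` fits.
fn reset_with_extend(path_cache: &mut Vec<Vec<Option<Entity>>>, new_len: usize) {
    path_cache.clear();
    // `Vec::new()` itself does not touch the heap.
    path_cache.extend((0..new_len).map(|_| Vec::new()));
}
```

Both variants still drop the inner rows, so any allocation a row held is lost either way. Only the outer buffer is reused in the second one.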
I'm an idiot, I misunderstood what the comment referred to. I've a PR coming where allocation is minimized as much as possible. What I suggest isn't the optimal solution.
Not always, but skip it if the new length is smaller. For context, `path_cache` is a `Vec<Vec<Option<Entity>>>`.

# Objective

Previously, when setting a new length to the `path_cache`, we would:

1. Deallocate all existing `Vec<Option<Entity>>`
2. Deallocate the `path_cache`
3. Allocate a new `Vec<Vec<Option<Entity>>>`, where each item is an empty `Vec`, and would have to be allocated when pushed to.

This is a lot of allocations!

## Solution

Use [`Vec::resize_with`](https://doc.rust-lang.org/stable/std/vec/struct.Vec.html#method.resize_with). With this change, what occurs is:

1. We `clear` each `Vec<Option<Entity>>`, keeping the allocation, but making the memory of each `Vec` re-usable
2. We only append new `Vec` to `path_cache` when it is too small.

* Fixes #11328

### Note on performance

I didn't benchmark it, I just ran a diff on the generated assembly (ran with `--profile stress-test` and `--native`). I found this PR has 20 fewer instructions in `apply_animation` (out of 2504). Though on a purely abstract level, I can deduce this leads to fewer allocations.

More information on profiling allocations in rust: https://nnethercote.github.io/perf-book/heap-allocations.html

## Future work

I think a [jagged vec](https://en.wikipedia.org/wiki/Jagged_array) would be much more pertinent, because it allocates everything in a single contiguous buffer. This would avoid dancing around allocations, reduce the overhead of one `*mut T` and two `usize` per row, and remove indirection, improving cache efficiency. I think it would both improve code quality and performance.
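The two ideas above can be sketched roughly as follows. This is an illustration under assumptions, not the PR's actual diff: `Entity` is a hypothetical `u32` stand-in, and `reset_path_cache` and `JaggedVec` are made-up names. The first function follows the clear-then-`resize_with` steps from the Solution section. The struct shows one possible layout for the jagged vec mentioned under Future work.

```rust
// Hypothetical stand-in for Bevy's `Entity`.
type Entity = u32;

/// Resizes `path_cache` to `new_len` rows, keeping existing allocations where possible.
fn reset_path_cache(path_cache: &mut Vec<Vec<Option<Entity>>>, new_len: usize) {
    // Empty every existing row; `clear` keeps each row's capacity.
    for row in path_cache.iter_mut() {
        row.clear();
    }
    // Append empty rows if too short (they allocate only once pushed to),
    // or truncate if too long, which drops the surplus rows.
    path_cache.resize_with(new_len, Vec::new);
}

/// Minimal jagged-array sketch: all rows live in one contiguous `data` buffer.
/// `ends[i]` is the exclusive end offset of row `i` within `data`.
struct JaggedVec<T> {
    data: Vec<T>,
    ends: Vec<usize>,
}

impl<T> JaggedVec<T> {
    fn new() -> Self {
        Self { data: Vec::new(), ends: Vec::new() }
    }

    /// Appends one row at the end of the shared buffer.
    fn push_row(&mut self, row: impl IntoIterator<Item = T>) {
        self.data.extend(row);
        self.ends.push(self.data.len());
    }

    /// Returns row `i` as a slice, or `None` if `i` is out of range.
    fn row(&self, i: usize) -> Option<&[T]> {
        let end = *self.ends.get(i)?;
        let start = if i == 0 { 0 } else { self.ends[i - 1] };
        Some(&self.data[start..end])
    }

    /// Empties the structure while keeping both buffers' capacity.
    fn clear(&mut self) {
        self.data.clear();
        self.ends.clear();
    }
}
```

In this layout each row costs one `usize` in `ends` instead of a pointer, a length, and a capacity, and resetting the whole cache is two `clear` calls. The trade-off is that rows can only be appended in order, which may or may not suit how the animation code fills the cache.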
Originally posted by @rodolphito in #11306 (comment)
This wasn't part of my PR, so I didn't change it there :)