Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intern mesh vertex buffer layouts so that we don't have to compare them over and over. #12216

Merged

Conversation

pcwalton
Copy link
Contributor

@pcwalton pcwalton commented Feb 29, 2024

Although we cached hashes of MeshVertexBufferLayout, we were paying the cost of PartialEq on InnerMeshVertexBufferLayout for every entity, every frame. This patch changes that logic to place MeshVertexBufferLayouts in Arcs so that they can be compared and hashed by pointer. This results in a 28% speedup in the queue_material_meshes phase of many_cubes, with frustum culling disabled.

Additionally, this patch contains two minor changes:

  1. This commit flattens the specialized mesh pipeline cache to one level of hash tables instead of two. This saves a hash lookup.

  2. The example many_cubes has been given a --no-frustum-culling flag, to aid in benchmarking.

See the Tracy profile:

Screenshot 2024-02-29 144406

Migration guide

  • Duplicate MeshVertexBufferLayouts are now combined into a single object, MeshVertexBufferLayoutRef, which contains an atomically-reference-counted pointer to the layout. Code that was using MeshVertexBufferLayout may need to be updated to use MeshVertexBufferLayoutRef instead.

over and over.

Although we cached hashes of `MeshVertexBufferLayout`, we were paying
the cost of `PartialEq` on `InnerMeshVertexBufferLayout` for every
entity, every frame. This patch changes that logic to place
`MeshVertexBufferLayout`s in `Arc`s so that they can be compared and
hashed by pointer. This results in a 28% speedup in the
`queue_material_meshes` phase of `many_cubes`, with frustum culling
disabled.

Additionally, this patch contains two minor changes:

1. This commit flattens the specialized mesh pipeline cache to one level
   of hash tables instead of two. This saves a hash lookup.

2. The example `many_cubes` has been given a `--no-frustum-culling`
   flag, to aid in benchmarking.
@alice-i-cecile alice-i-cecile added A-Rendering Drawing game state to the screen C-Performance A change motivated by improving speed, memory usage or compile times labels Feb 29, 2024
@james7132 james7132 added the C-Breaking-Change A breaking change to Bevy's public API that needs to be noted in a migration guide label Mar 1, 2024
Copy link
Contributor

github-actions bot commented Mar 1, 2024

It looks like your PR is a breaking change, but you didn't provide a migration guide.

Could you add some context on what users should update when this change get released in a new version of Bevy?
It will be used to help writing the migration guide for the version. Putting it after a ## Migration Guide will help it get automatically picked up by our tooling.

Copy link
Member

@james7132 james7132 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm seeing a 17.3% improvement with queue_material_meshes in many_cubes with frustrum culling turned on. This saves a bit of memory on every mesh with a duplicate layout, and the best thing is that this is largely transparent to the user. This looks good to me.

crates/bevy_render/src/mesh/mod.rs Show resolved Hide resolved
pub struct MeshVertexBufferLayoutRef(pub Arc<MeshVertexBufferLayout>);

#[derive(Clone, Default, Resource)]
pub struct MeshVertexBufferLayouts(HashSet<MeshVertexBufferLayoutRef>);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this need to be pub? For all intents and purposes, this is an implementation detail.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does because otherwise rustc complains that it's being used in a public interface (specifically, <Mesh as RenderAsset>::Param).

@james7132 james7132 added the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Mar 1, 2024
// the `MeshVertexBufferLayoutRef`, this compares the mesh vertex buffer
// layout structurally, not by pointer.
self.0
.get_or_insert_with(&layout, |layout| {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was momentarily concerned that get_or_insert_with was nightly-only but we are using hashbrown, where it isn't. Still, something to note for possible migration to core Rust HashSet/HashMap at some point.

Copy link
Contributor

@superdump superdump left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small request to add a comment about not adding Deref and why.

@superdump
Copy link
Contributor

Also CI failed:

error: unused import: `MeshVertexBufferLayout`
  --> examples/shader/shader_instancing.rs:15:31
   |
15 |         mesh::{GpuBufferInfo, MeshVertexBufferLayout, MeshVertexBufferLayoutRef},
   |                               ^^^^^^^^^^^^^^^^^^^^^^
   |
   = note: `-D unused-imports` implied by `-D warnings`
   = help: to override `-D warnings` add `#[allow(unused_imports)]`

error: could not compile `bevy` (example "shader_instancing") due to 1 previous error

You can run cargo run -p ci locally to run those checks. Just in case iteration is needed.

@pcwalton pcwalton requested a review from superdump March 1, 2024 15:02
@superdump
Copy link
Contributor

error: unused import: `MeshVertexBufferLayout`
 --> examples/3d/lines.rs:8:16
  |
8 |         mesh::{MeshVertexBufferLayout, MeshVertexBufferLayoutRef, PrimitiveTopology},
  |                ^^^^^^^^^^^^^^^^^^^^^^
  |
  = note: `-D unused-imports` implied by `-D warnings`
  = help: to override `-D warnings` add `#[allow(unused_imports)]`

@alice-i-cecile alice-i-cecile added this pull request to the merge queue Mar 1, 2024
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Mar 1, 2024
@alice-i-cecile alice-i-cecile removed the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Mar 1, 2024
@alice-i-cecile
Copy link
Member

CI failures look real, could you check them out and then ping us once you have a fix?

table.

The trick with `Borrow` was too clever and ended up actually hashing the
pointer, causing lookups to sometimes spuriously fail. This manifested
itself as a crash in `alien_cake_addict`. Removing the
`MeshVertexBufferLayoutRef` wrapper from the entries in the store
prevents its implementation of `Hash` from being called, solving the
problem.
@pcwalton
Copy link
Contributor Author

pcwalton commented Mar 1, 2024

Should be fixed now.

@alice-i-cecile alice-i-cecile added this pull request to the merge queue Mar 1, 2024
@alice-i-cecile alice-i-cecile added the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Mar 1, 2024
Merged via the queue into bevyengine:main with commit f9cc91d Mar 1, 2024
26 checks passed
spectria-limina pushed a commit to spectria-limina/bevy that referenced this pull request Mar 9, 2024
…em over and over. (bevyengine#12216)

Although we cached hashes of `MeshVertexBufferLayout`, we were paying
the cost of `PartialEq` on `InnerMeshVertexBufferLayout` for every
entity, every frame. This patch changes that logic to place
`MeshVertexBufferLayout`s in `Arc`s so that they can be compared and
hashed by pointer. This results in a 28% speedup in the
`queue_material_meshes` phase of `many_cubes`, with frustum culling
disabled.

Additionally, this patch contains two minor changes:

1. This commit flattens the specialized mesh pipeline cache to one level
of hash tables instead of two. This saves a hash lookup.

2. The example `many_cubes` has been given a `--no-frustum-culling`
flag, to aid in benchmarking.

See the Tracy profile:

<img width="1064" alt="Screenshot 2024-02-29 144406"
src="https://github.com/bevyengine/bevy/assets/157897/18632f1d-1fdd-4ac7-90ed-2d10306b2a1e">

## Migration guide

* Duplicate `MeshVertexBufferLayout`s are now combined into a single
object, `MeshVertexBufferLayoutRef`, which contains an
atomically-reference-counted pointer to the layout. Code that was using
`MeshVertexBufferLayout` may need to be updated to use
`MeshVertexBufferLayoutRef` instead.
ChristopherBiscardi added a commit to ChristopherBiscardi/bevy_ecs_tilemap that referenced this pull request Jun 7, 2024
[12216](bevyengine/bevy#12216) introduced an argument `&mut MeshVertexBufferLayouts` to `get_mesh_vertex_buffer_layout`, which bevy_ecs_tilemap calls in `RenderChunk2d::prepare`
rparrett added a commit to StarArawn/bevy_ecs_tilemap that referenced this pull request Jul 5, 2024
* Update to 0.14.0-rc.2

* [12997](bevyengine/bevy#12997): rename `multi-threaded` to `multi_threaded`

* RenderAssets<Image> is now RenderAssets<GpuImage>

Implemented in [12827](bevyengine/bevy#12827)

* FloatOrd is now in bevy_math

implemented in [12732](bevyengine/bevy#12732)

* convert Transparent2d::dynamic_offset to extra_index

[12889](bevyengine/bevy#12889) Gpu Frustum Culling removed the dynamic_offset of Transparent2d and it became `extra_index` with the special value `PhaseItemExtraIndex::NONE`, which indicates the `None` that was here previously

* RenderPhase<Transparent2d> -> ViewSortedRenderPhases<Transparent2d>

[12453](https://github.com/StarArawn/bevy_ecs_tilemap/pull/bevyengine/bevy#12453): Render phases are now binned or sorted.

Following the changes in the `mesh2d_manual` [example](https://github.com/bevyengine/bevy/blob/ecdd1624f302c5f71aaed95b0984cbbecf8880b7/examples/2d/mesh2d_manual.rs#L357-L358): use the `ViewSortedRenderPhases` resource.

* get_sub_app_mut is now an Option

in [9202](https://github.com/StarArawn/bevy_ecs_tilemap/pull/bevyengine/bevy/pull/9202) SubApp access has changed

* GpuImage::size f32 -> u32 via UVec2

[11698](bevyengine/bevy#11698) changed `GpuImage::size` to `UVec2`.

Right above this, `Extent3d` does the same thing, so I'm taking a small leap and assuming can `as`.

* GpuMesh::primitive_topology -> key_bits/BaseMeshPipeline

[12791](bevyengine/bevy#12791) the `primitive_topology` field on `GpuMesh` was removed in favor of `key_bits` which can be constructed using `BaseMeshPipeline::from_primitive_topology`

* RenderChunk2d::prepare requires &mut MeshVertexBufferLayouts now

[12216](bevyengine/bevy#12216) introduced an argument `&mut MeshVertexBufferLayouts` to `get_mesh_vertex_buffer_layout`, which bevy_ecs_tilemap calls in `RenderChunk2d::prepare`

* into_linear_f32 -> color.0.linear().to_f32_array(),

[12163](bevyengine/bevy#12163) bevy_color was created and Color handling has changed. Specifically Color::as_linear_rgba_f32 has been removed.

LinearRgba is now its own type that can be accessed via [`linear()`](https://docs.rs/bevy/0.14.0-rc.2/bevy/color/enum.Color.html#method.linear) and then converted.

* Must specify type of VisibleEntities when accessing

[12582](bevyengine/bevy#12582) divided `VisibleEntities` into separate lists. So now we have to specify which kind of entity we want. I think we want the Mesh here, and I think we can get rid of the `.index` calls on Entity since Entity [already compares bits](https://docs.rs/bevy_ecs/0.14.0-rc.2/src/bevy_ecs/entity/mod.rs.html#173) for optimized codegen purposes. Waiting to do that until the other changes are in though so as to not change functionality until post-upgrade.

* app.world access is functions now

- [9202](bevyengine/bevy#9202) changed world access to functions. [relevent line](https://github.com/bevyengine/bevy/pull/9202/files#diff-b2fba3a0c86e496085ce7f0e3f1de5960cb754c7d215ed0f087aa556e529f97fR640)
- This also surfaced [12655](bevyengine/bevy#12655) which removed `Into<AssetId<T>>` for `Handle<T>`. using a reference or .id() is the solution here.

* We don't need `World::cell`, and it doesn't exist anymore

In [12551](bevyengine/bevy#12551) `WorldCell` was removed.

...but it turns out we don't need it or its replacement anyway.

* examples error out unless this bevy bug is addressed with these features being added

bevyengine/bevy#13728

* check_visibility is required for the entity that is renderable

As a result of [12582](bevyengine/bevy#12582) `check_visibility` must be implemented for the "renderable" tilemap entities. Doing this is trivial by taking advantage of the
existing `check_visibility` type arguments, which accept a [`QF: QueryFilter + 'static`](https://docs.rs/bevy/0.14.0-rc.2/bevy/render/view/fn.check_visibility.html).

The same `QueryFilter`` is used when checking `VisibleEntities`. I've chosen `With<TilemapRenderSettings` because presumably if the entity doesn't have a `TilemapRenderSettings` then it will not be rendering, but this could be as sophisticated or simple as we want.

For example `WithLight` is currently implemented as

```rust
pub type WithLight = Or<(With<PointLight>, With<SpotLight>, With<DirectionalLight>)>;
```

* view.view_proj -> view.clip_from_world

[13289](bevyengine/bevy#13489) introduced matrix naming changes, including `view_proj` which becomes `clip_from_world`

* color changes to make tests runnable

* clippy fix

* Update Cargo.toml

Co-authored-by: Rob Parrett <robparrett@gmail.com>

* Update Cargo.toml

Co-authored-by: Rob Parrett <robparrett@gmail.com>

* final clippy fixes

* Update Cargo.toml

Co-authored-by: Rob Parrett <robparrett@gmail.com>

* Simplify async loading in ldtk/tiled helpers

See Bevy #12550

* remove second allow lint

* rc.3 bump

* bump version for major release

* remove unused features

---------

Co-authored-by: Rob Parrett <robparrett@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Rendering Drawing game state to the screen C-Breaking-Change A breaking change to Bevy's public API that needs to be noted in a migration guide C-Performance A change motivated by improving speed, memory usage or compile times S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants