Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async pipeline compilation #10812

Merged
merged 26 commits into from
Feb 5, 2024

Conversation

JMS55
Copy link
Contributor

@JMS55 JMS55 commented Nov 30, 2023

Objective

Solution

  • Compile pipelines in a Task on the AsyncComputeTaskPool

Changelog

  • Render/compute pipeline compilation is now done asynchronously over multiple frames when the multi-threaded feature is enabled and on non-wasm and non-macOS platforms
  • Added CachedPipelineState::Creating
  • Added PipelineCache::block_on_render_pipeline()
  • Added bevy_utils::futures::check_ready
  • Added bevy_render/multi-threaded cargo feature

Migration Guide

  • Match on the new Creating variant for exhaustive matches of CachedPipelineState

@JMS55 JMS55 added this to the 0.13 milestone Nov 30, 2023
@JMS55 JMS55 added A-Rendering Drawing game state to the screen C-Performance A change motivated by improving speed, memory usage or compile times M-Needs-Migration-Guide A breaking change to Bevy's public API that needs to be noted in a migration guide S-Blocked This cannot move forward until something else changes labels Nov 30, 2023
@alice-i-cecile
Copy link
Member

Shame we couldn't get this one out in time for the jam. Darn!

@JMS55
Copy link
Contributor Author

JMS55 commented Nov 30, 2023

Shame we couldn't get this one out in time for the jam. Darn!

Wouldn't be possible, as we need wgpu 0.19 first for this to actually work.

I'll also probably have to go back and redo/change this once gfx-rs/wgpu#3794 gets implemented, as this PR will only help on native (we don't have threads on WASM/WebGPU).

@JMS55 JMS55 marked this pull request as ready for review December 1, 2023 19:50
@JMS55
Copy link
Contributor Author

JMS55 commented Dec 1, 2023

Sadly causing an error log message (but no crash) when the app first loads, as it renders nothing while waiting for pipelines to compile 😬

We might need some logic to block that specific log message for the first few frames of the app, or something else hacky.

2023-12-01T19:50:50.742368Z ERROR present_frames: wgpu_core::present: No work has been submitted for this frame

@JMS55 JMS55 marked this pull request as draft December 31, 2023 06:46
@alice-i-cecile alice-i-cecile removed this from the 0.13 milestone Jan 24, 2024
@JMS55 JMS55 added this to the 0.13 milestone Jan 24, 2024
@JMS55
Copy link
Contributor Author

JMS55 commented Jan 24, 2024

I'd like to try and land this for 0.13. We can cut it if it's not ready, it's semi-low priority, but it's a nice improvement if we can land it.

github-merge-queue bot pushed a commit that referenced this pull request Jan 26, 2024
# Objective

Keep core dependencies up to date.

## Solution

Update the dependencies.

wgpu 0.19 only supports raw-window-handle (rwh) 0.6, so bumping that was
included in this.

The rwh 0.6 version bump is just the simplest way of doing it. There
might be a way we can take advantage of wgpu's new safe surface creation
api, but I'm not familiar enough with bevy's window management to
untangle it and my attempt ended up being a mess of lifetimes and rustc
complaining about missing trait impls (that were implemented). Thanks to
@MiniaczQ for the (much simpler) rwh 0.6 version bump code.

Unblocks #9172 and
#10812

~~This might be blocked on cpal and oboe updating their ndk versions to
0.8, as they both currently target ndk 0.7 which uses rwh 0.5.2~~ Tested
on android, and everything seems to work correctly (audio properly stops
when minimized, and plays when re-focusing the app).

---

## Changelog

- `wgpu` has been updated to 0.19! The long awaited arcanization has
been merged (for more info, see
https://gfx-rs.github.io/2023/11/24/arcanization.html), and Vulkan
should now be working again on Intel GPUs.
- Targeting WebGPU now requires that you add the new `webgpu` feature
(setting the `RUSTFLAGS` environment variable to
`--cfg=web_sys_unstable_apis` is still required). This feature currently
overrides the `webgl2` feature if you have both enabled (the `webgl2`
feature is enabled by default), so it is not recommended to add it as a
default feature to libraries without putting it behind a flag that
allows library users to opt out of it! In the future we plan on
supporting wasm binaries that can target both webgl2 and webgpu now that
wgpu added support for doing so (see
#11505).
- `raw-window-handle` has been updated to version 0.6.

## Migration Guide

- `bevy_render::instance_index::get_instance_index()` has been removed
as the webgl2 workaround is no longer required as it was fixed upstream
in wgpu. The `BASE_INSTANCE_WORKAROUND` shaderdef has also been removed.
- WebGPU now requires the new `webgpu` feature to be enabled. The
`webgpu` feature currently overrides the `webgl2` feature so you no
longer need to disable all default features and re-add them all when
targeting `webgpu`, but binaries built with both the `webgpu` and
`webgl2` features will only target the webgpu backend, and will only
work on browsers that support WebGPU.
- Places where you conditionally compiled things for webgl2 need to be
updated because of this change, eg:
- `#[cfg(any(not(feature = "webgl"), not(target_arch = "wasm32")))]`
becomes `#[cfg(any(not(feature = "webgl") ,not(target_arch = "wasm32"),
feature = "webgpu"))]`
- `#[cfg(all(feature = "webgl", target_arch = "wasm32"))]` becomes
`#[cfg(all(feature = "webgl", target_arch = "wasm32", not(feature =
"webgpu")))]`
- `if cfg!(all(feature = "webgl", target_arch = "wasm32"))` becomes `if
cfg!(all(feature = "webgl", target_arch = "wasm32", not(feature =
"webgpu")))`
- `create_texture_with_data` now also takes a `TextureDataOrder`. You
can probably just set this to `TextureDataOrder::default()`
- `TextureFormat`'s `block_size` has been renamed to `block_copy_size`
- See the `wgpu` changelog for anything I might've missed:
https://github.com/gfx-rs/wgpu/blob/trunk/CHANGELOG.md

---------

Co-authored-by: François <mockersf@gmail.com>
@JMS55 JMS55 removed the S-Blocked This cannot move forward until something else changes label Jan 27, 2024
@JMS55 JMS55 marked this pull request as ready for review January 27, 2024 02:51
@JMS55
Copy link
Contributor Author

JMS55 commented Jan 27, 2024

The last thing remaining for this PR is fixing compiling without multi-threaded or on wasm.

crates/bevy_utils/src/futures.rs Outdated Show resolved Hide resolved
@JMS55
Copy link
Contributor Author

JMS55 commented Feb 2, 2024

Is there any way for an app to request that pipelines be synchronously built: i.e. to turn this behavior off? I worry that just having objects silently not appear for a few frames is not the behavior that every app wants.

I'd like to leave that to a followup. We can put it in the RenderPlugin settings, but it's a bit of a pain.

It'd also be nice if apps could preload pipelines, but that can be done as a follow-up, as I'm not sure what the API for that would look like.

#10871

@JMS55 JMS55 added the S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it label Feb 2, 2024
@alice-i-cecile
Copy link
Member

Works for me locally. I'm content with the level of review and testing for this, and no crimes have been committed in the code base. Merging now: I'd much rather have to revert just before release than find out it's broken for users after launch.

@alice-i-cecile alice-i-cecile added this pull request to the merge queue Feb 5, 2024
Merged via the queue into bevyengine:main with commit 9f7e61b Feb 5, 2024
27 of 28 checks passed
lynn-lumen pushed a commit to lynn-lumen/bevy that referenced this pull request Feb 5, 2024
# Objective

- Pipeline compilation is slow and blocks the frame
- Closes bevyengine#8224

## Solution

- Compile pipelines in a Task on the AsyncComputeTaskPool

---

## Changelog

- Render/compute pipeline compilation is now done asynchronously over
multiple frames when the multi-threaded feature is enabled and on
non-wasm and non-macOS platforms
- Added `CachedPipelineState::Creating` 
- Added `PipelineCache::block_on_render_pipeline()`
- Added `bevy_utils::futures::check_ready`
- Added `bevy_render/multi-threaded` cargo feature

## Migration Guide

- Match on the new `Creating` variant for exhaustive matches of
`CachedPipelineState`
tjamaan pushed a commit to tjamaan/bevy that referenced this pull request Feb 6, 2024
# Objective

Keep core dependencies up to date.

## Solution

Update the dependencies.

wgpu 0.19 only supports raw-window-handle (rwh) 0.6, so bumping that was
included in this.

The rwh 0.6 version bump is just the simplest way of doing it. There
might be a way we can take advantage of wgpu's new safe surface creation
api, but I'm not familiar enough with bevy's window management to
untangle it and my attempt ended up being a mess of lifetimes and rustc
complaining about missing trait impls (that were implemented). Thanks to
@MiniaczQ for the (much simpler) rwh 0.6 version bump code.

Unblocks bevyengine#9172 and
bevyengine#10812

~~This might be blocked on cpal and oboe updating their ndk versions to
0.8, as they both currently target ndk 0.7 which uses rwh 0.5.2~~ Tested
on android, and everything seems to work correctly (audio properly stops
when minimized, and plays when re-focusing the app).

---

## Changelog

- `wgpu` has been updated to 0.19! The long awaited arcanization has
been merged (for more info, see
https://gfx-rs.github.io/2023/11/24/arcanization.html), and Vulkan
should now be working again on Intel GPUs.
- Targeting WebGPU now requires that you add the new `webgpu` feature
(setting the `RUSTFLAGS` environment variable to
`--cfg=web_sys_unstable_apis` is still required). This feature currently
overrides the `webgl2` feature if you have both enabled (the `webgl2`
feature is enabled by default), so it is not recommended to add it as a
default feature to libraries without putting it behind a flag that
allows library users to opt out of it! In the future we plan on
supporting wasm binaries that can target both webgl2 and webgpu now that
wgpu added support for doing so (see
bevyengine#11505).
- `raw-window-handle` has been updated to version 0.6.

## Migration Guide

- `bevy_render::instance_index::get_instance_index()` has been removed
as the webgl2 workaround is no longer required as it was fixed upstream
in wgpu. The `BASE_INSTANCE_WORKAROUND` shaderdef has also been removed.
- WebGPU now requires the new `webgpu` feature to be enabled. The
`webgpu` feature currently overrides the `webgl2` feature so you no
longer need to disable all default features and re-add them all when
targeting `webgpu`, but binaries built with both the `webgpu` and
`webgl2` features will only target the webgpu backend, and will only
work on browsers that support WebGPU.
- Places where you conditionally compiled things for webgl2 need to be
updated because of this change, eg:
- `#[cfg(any(not(feature = "webgl"), not(target_arch = "wasm32")))]`
becomes `#[cfg(any(not(feature = "webgl") ,not(target_arch = "wasm32"),
feature = "webgpu"))]`
- `#[cfg(all(feature = "webgl", target_arch = "wasm32"))]` becomes
`#[cfg(all(feature = "webgl", target_arch = "wasm32", not(feature =
"webgpu")))]`
- `if cfg!(all(feature = "webgl", target_arch = "wasm32"))` becomes `if
cfg!(all(feature = "webgl", target_arch = "wasm32", not(feature =
"webgpu")))`
- `create_texture_with_data` now also takes a `TextureDataOrder`. You
can probably just set this to `TextureDataOrder::default()`
- `TextureFormat`'s `block_size` has been renamed to `block_copy_size`
- See the `wgpu` changelog for anything I might've missed:
https://github.com/gfx-rs/wgpu/blob/trunk/CHANGELOG.md

---------

Co-authored-by: François <mockersf@gmail.com>
tjamaan pushed a commit to tjamaan/bevy that referenced this pull request Feb 6, 2024
# Objective

- Pipeline compilation is slow and blocks the frame
- Closes bevyengine#8224

## Solution

- Compile pipelines in a Task on the AsyncComputeTaskPool

---

## Changelog

- Render/compute pipeline compilation is now done asynchronously over
multiple frames when the multi-threaded feature is enabled and on
non-wasm and non-macOS platforms
- Added `CachedPipelineState::Creating` 
- Added `PipelineCache::block_on_render_pipeline()`
- Added `bevy_utils::futures::check_ready`
- Added `bevy_render/multi-threaded` cargo feature

## Migration Guide

- Match on the new `Creating` variant for exhaustive matches of
`CachedPipelineState`
github-merge-queue bot pushed a commit that referenced this pull request Feb 7, 2024
# Objective

- This aims to fix #11755
- After #10812 some pipeline compilation can take more time than before
and all call to `get_render_pipeline` should check the result.

## Solution

- Check `get_render_pipeline` call result for msaa_writeback
- I checked that no other call to `get_render_pipeline` in bevy code
base is missng the checking on the result.
Subserial pushed a commit to Subserial/bevy_winit_hook that referenced this pull request Feb 20, 2024
# Objective

Keep core dependencies up to date.

## Solution

Update the dependencies.

wgpu 0.19 only supports raw-window-handle (rwh) 0.6, so bumping that was
included in this.

The rwh 0.6 version bump is just the simplest way of doing it. There
might be a way we can take advantage of wgpu's new safe surface creation
api, but I'm not familiar enough with bevy's window management to
untangle it and my attempt ended up being a mess of lifetimes and rustc
complaining about missing trait impls (that were implemented). Thanks to
@MiniaczQ for the (much simpler) rwh 0.6 version bump code.

Unblocks bevyengine/bevy#9172 and
bevyengine/bevy#10812

~~This might be blocked on cpal and oboe updating their ndk versions to
0.8, as they both currently target ndk 0.7 which uses rwh 0.5.2~~ Tested
on android, and everything seems to work correctly (audio properly stops
when minimized, and plays when re-focusing the app).

---

## Changelog

- `wgpu` has been updated to 0.19! The long awaited arcanization has
been merged (for more info, see
https://gfx-rs.github.io/2023/11/24/arcanization.html), and Vulkan
should now be working again on Intel GPUs.
- Targeting WebGPU now requires that you add the new `webgpu` feature
(setting the `RUSTFLAGS` environment variable to
`--cfg=web_sys_unstable_apis` is still required). This feature currently
overrides the `webgl2` feature if you have both enabled (the `webgl2`
feature is enabled by default), so it is not recommended to add it as a
default feature to libraries without putting it behind a flag that
allows library users to opt out of it! In the future we plan on
supporting wasm binaries that can target both webgl2 and webgpu now that
wgpu added support for doing so (see
bevyengine/bevy#11505).
- `raw-window-handle` has been updated to version 0.6.

## Migration Guide

- `bevy_render::instance_index::get_instance_index()` has been removed
as the webgl2 workaround is no longer required as it was fixed upstream
in wgpu. The `BASE_INSTANCE_WORKAROUND` shaderdef has also been removed.
- WebGPU now requires the new `webgpu` feature to be enabled. The
`webgpu` feature currently overrides the `webgl2` feature so you no
longer need to disable all default features and re-add them all when
targeting `webgpu`, but binaries built with both the `webgpu` and
`webgl2` features will only target the webgpu backend, and will only
work on browsers that support WebGPU.
- Places where you conditionally compiled things for webgl2 need to be
updated because of this change, eg:
- `#[cfg(any(not(feature = "webgl"), not(target_arch = "wasm32")))]`
becomes `#[cfg(any(not(feature = "webgl") ,not(target_arch = "wasm32"),
feature = "webgpu"))]`
- `#[cfg(all(feature = "webgl", target_arch = "wasm32"))]` becomes
`#[cfg(all(feature = "webgl", target_arch = "wasm32", not(feature =
"webgpu")))]`
- `if cfg!(all(feature = "webgl", target_arch = "wasm32"))` becomes `if
cfg!(all(feature = "webgl", target_arch = "wasm32", not(feature =
"webgpu")))`
- `create_texture_with_data` now also takes a `TextureDataOrder`. You
can probably just set this to `TextureDataOrder::default()`
- `TextureFormat`'s `block_size` has been renamed to `block_copy_size`
- See the `wgpu` changelog for anything I might've missed:
https://github.com/gfx-rs/wgpu/blob/trunk/CHANGELOG.md

---------

Co-authored-by: François <mockersf@gmail.com>
) -> CachedPipelineState {
#[cfg(all(
not(target_arch = "wasm32"),
not(target_os = "macos"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any documentation on why this isn't available on macOS - https://docs.rs/wgpu/latest/wgpu/struct.Device.html#method.create_compute_pipeline doesn't document why this would be avoided

(I'm looking at this for an external project, not Bevy, but I'm trying to understand prior art)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't work on metal, and iirc it ended up making shader compilation slower on macOS.

To get working async shader compilation with metal you have to pass in a callback at shader creation time (https://developer.apple.com/documentation/metal/mtldevice/1433363-newrenderpipelinestatewithdescri). Webgpu spec has createRenderPipelineAsync, but wgpu hasn't implemented it (and I don't think wgpu wants to add an executor, as no matter what they choose people won't be happy).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-Rendering Drawing game state to the screen C-Performance A change motivated by improving speed, memory usage or compile times M-Needs-Migration-Guide A breaking change to Bevy's public API that needs to be noted in a migration guide S-Ready-For-Final-Review This PR has been approved by the community. It's ready for a maintainer to consider merging it
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Parallel pipeline compliation
10 participants