CpuWriteGpuRead belt should use less memory #1962

Open
5 tasks
Wumpf opened this issue Apr 25, 2023 · 1 comment
Labels
📉 performance Optimization, memory use, etc 🔺 re_renderer affects re_renderer itself

Comments


Wumpf commented Apr 25, 2023

We're seeing high memory usage, especially on platforms where reclaiming buffers for re-use can take a considerable amount of time. So far this is most visible in WebGPU builds on Chrome.
(Ignore WebGL builds here! We have a hack in place that allows very quick reclamation of the buffers, which are CPU-only under the hood.)

A bunch of things we need to go through:

  • improve traceability in the viewer: we could extend the memory profiler with a table of all buffers, or at least show which portion of them is on the belts
  • check the buffer reclaim logic: can scheduling be optimized?
  • more clever buffer sizing that adapts to needs
    • free unused buffers after usage bursts
  • consider using write_buffer on the web, see "How to properly write to a buffer every frame?" (gfx-rs/wgpu#1438, reply in thread)
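
To make the sizing and "free after bursts" items more concrete, here is a minimal sketch of a chunk pool with an idle-based trimming policy. Everything in it is hypothetical and not taken from re_renderer: `Chunk` is a plain stand-in for a staging buffer, and `ChunkPool`, `min_chunk_size`, and `max_idle_frames` are made-up names for illustration.

```rust
/// Stand-in for a mappable staging buffer. A real belt would own a GPU buffer here.
#[derive(Debug)]
struct Chunk {
    size: u64,
    /// Frame index at which this chunk was last handed out.
    last_used_frame: u64,
}

/// Hypothetical pool of free staging chunks (illustration only).
struct ChunkPool {
    free: Vec<Chunk>,
    /// Lower bound for newly created chunks (assumed tuning knob).
    min_chunk_size: u64,
    /// Free chunks that stay idle for more frames than this are dropped by `trim`.
    max_idle_frames: u64,
}

impl ChunkPool {
    fn new(min_chunk_size: u64, max_idle_frames: u64) -> Self {
        Self { free: Vec::new(), min_chunk_size, max_idle_frames }
    }

    /// Hands out the smallest free chunk that fits `size`.
    /// If none fits, creates a new one sized to the request (rounded up to a power of two).
    fn acquire(&mut self, size: u64, frame: u64) -> Chunk {
        let best_fit = self
            .free
            .iter()
            .enumerate()
            .filter(|(_, chunk)| chunk.size >= size)
            .min_by_key(|(_, chunk)| chunk.size)
            .map(|(index, _)| index);

        match best_fit {
            Some(index) => {
                let mut chunk = self.free.swap_remove(index);
                chunk.last_used_frame = frame;
                chunk
            }
            None => Chunk {
                size: size.max(self.min_chunk_size).next_power_of_two(),
                last_used_frame: frame,
            },
        }
    }

    /// Returns a chunk to the pool once the GPU is done reading from it.
    fn release(&mut self, chunk: Chunk) {
        self.free.push(chunk);
    }

    /// Drops free chunks that have been idle for too long and returns the number of bytes freed.
    fn trim(&mut self, frame: u64) -> u64 {
        let max_idle_frames = self.max_idle_frames;
        let before = self.free_bytes();
        self.free
            .retain(|chunk| frame.saturating_sub(chunk.last_used_frame) <= max_idle_frames);
        before - self.free_bytes()
    }

    /// Total size of all chunks currently sitting unused in the pool.
    fn free_bytes(&self) -> u64 {
        self.free.iter().map(|chunk| chunk.size).sum()
    }
}
```

Calling `trim` once per frame would let the pool shrink again after a burst, and `free_bytes` is the kind of number the memory profiler could display per belt.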
@Wumpf Wumpf added 🔺 re_renderer affects re_renderer itself 📉 performance Optimization, memory use, etc labels Apr 25, 2023

Wumpf commented Apr 25, 2023

More reading material on the last point, i.e. how we should do uploads in the first place, particularly on the web: gpuweb/gpuweb#2388

On one hand, the immediate move to a staging belt may have been a hasty one, based on slightly outdated knowledge (wgpu's relatively new write_buffer_with removes some of the original pain points!). On the other hand, having our uploads centralized in something that is not just wgpu::Queue is very useful. In the future we probably want to employ different strategies depending on the upload workload, native vs. web, the native backend, and even driver/hardware (on native we might even use transient and permanently mapped buffers!).

The topic is important to us since we want highly dynamic scenes, but as we don't have a lot of issues in that area yet, it's too early to dive deep on this.
For now we should continue centralizing all upload operations (i.e. not use queue-based uploads directly) and do a brief check of the belts' memory consumption as outlined. It seems likely that whatever we do going forward, belt-like mechanisms will still be around.
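
As a rough illustration of why a central upload entry point helps, a per-upload strategy choice could be hidden behind a single function like the one below. This is a sketch under assumptions only: `Platform`, `UploadStrategy`, `UploadRequest`, `choose_strategy`, and the 64 KiB threshold are all hypothetical, and the policy itself is a placeholder rather than a measured recommendation.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Platform {
    Native,
    Web,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum UploadStrategy {
    /// Go through a belt of staging buffers followed by a copy.
    StagingBelt,
    /// Let the queue do the staging (in wgpu terms: write_buffer / write_buffer_with).
    QueueWrite,
    /// Transient or permanently mapped buffers, only considered on native here.
    PersistentlyMapped,
}

/// Hypothetical description of one upload, as the central upload code might see it.
struct UploadRequest {
    size_bytes: u64,
    /// True if the same kind of data is re-uploaded every frame.
    every_frame: bool,
}

/// Placeholder policy. The threshold and the branches are assumptions for illustration.
fn choose_strategy(platform: Platform, request: &UploadRequest) -> UploadStrategy {
    const SMALL_UPLOAD_BYTES: u64 = 64 * 1024; // made-up cutoff

    match platform {
        // Assumption: on the web, queue writes avoid slow buffer reclamation.
        Platform::Web => UploadStrategy::QueueWrite,
        // Assumption: large per-frame streams benefit from staying mapped.
        Platform::Native if request.every_frame && request.size_bytes > SMALL_UPLOAD_BYTES => {
            UploadStrategy::PersistentlyMapped
        }
        // Assumption: small uploads are not worth a dedicated staging chunk.
        Platform::Native if request.size_bytes <= SMALL_UPLOAD_BYTES => UploadStrategy::QueueWrite,
        Platform::Native => UploadStrategy::StagingBelt,
    }
}
```

Because callers would only ever talk to the central upload code, a policy like this could be swapped or tuned per backend later without touching the call sites.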
