
Support for growing the memory block size #235

Open
nical opened this issue Jun 24, 2024 · 4 comments · May be fixed by #254

Comments

@nical
Contributor

nical commented Jun 24, 2024

Firefox potentially creates a lot of devices, and we can't know in advance the type and scale of content that a page will want to run. So it is difficult to come up with a good default memory block size that works well for very complex web apps without making the cost of very simple ones prohibitive. One way to address that would be to start with a small block size and double it every time a new memory block (for a particular memory type) is needed.

Instead of a single device_memblock_size, the allocator's configuration would let the user specify device_minimum_memblock_size and device_maximum_memblock_size, starting allocations with the former and doubling until the latter is reached. Setting both to the same value would preserve the current behavior.
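A minimal sketch of the growth policy described above, assuming hypothetical names (`BlockSizePolicy`, `block_size`) rather than gpu-allocator's actual configuration API: each successive block allocated for a memory type doubles in size until the maximum is reached, and setting both limits to the same value reproduces today's fixed-size behavior.

```rust
/// Hypothetical growth policy: not gpu-allocator's real config struct,
/// just an illustration of the proposed min/max doubling scheme.
#[derive(Clone, Copy)]
pub struct BlockSizePolicy {
    pub minimum_memblock_size: u64,
    pub maximum_memblock_size: u64,
}

impl BlockSizePolicy {
    /// Size of the `nth_block`-th block allocated for a given memory
    /// type (0-based). Doubles from the minimum, clamped to the maximum.
    pub fn block_size(&self, nth_block: u32) -> u64 {
        let mut size = self.minimum_memblock_size;
        for _ in 0..nth_block {
            if size >= self.maximum_memblock_size {
                break;
            }
            // saturating_mul guards against overflow for absurd inputs.
            size = size.saturating_mul(2);
        }
        size.min(self.maximum_memblock_size)
    }
}
```

With `minimum_memblock_size == maximum_memblock_size`, every block comes out the same size, which is the backward-compatibility property mentioned above.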

It would also help with switching wgpu's vulkan backend from gpu-alloc to gpu-allocator since gpu-alloc implements the described behavior.

If this is something you would be OK with supporting, I'd like to find some time to implement it at some point in the next few months.

@Jasper-Bekkers
Member

I think this could be a good addition, though one needs to be quite careful not to go too small or too large with these. WDDM (the Windows memory manager), for example, migrates memory between host and device based on these blocks, so for real-time applications it makes more sense to pick a reasonable default (also to keep the architecture a bit simpler).

I would suggest making the min / max block size configurable, since sensible defaults are anywhere between 64MB and 256MB (256MB is what we landed on because it's the same size as the PCIe aperture, a constraint that becomes irrelevant when ReBAR is enabled, which most devices don't have).

I think the block size choice has to respect a few constraints, and for this library to work well we need to make sure that the copies (when blocks are resized) are done outside of this library, so that they can be scheduled at an appropriate point in the frame.

So to summarize, this change would require:

  • User specified min/max block sizes (or a user specified list of block sizes)
  • A callback interface for copies (that we can potentially use for defrag later on as well)
  • Support for all backends (Metal, Vulkan and Dx12)
  • Also, even though allocations are fallible in gpu-allocator, I don't think it would be wise to use that mechanism to indicate block resizes.

The sum of these requirements may mean that this becomes tricky to design, so it would be nice to see a proposal for this work before committing to it.
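The callback interface in the second bullet could look something like the sketch below. All names here are hypothetical; nothing like this exists in gpu-allocator today, and as nical clarifies later in the thread, his narrower proposal doesn't actually require moving data at all. The point is only that the library would report which copies are needed and the application would schedule them.

```rust
/// Hypothetical description of one required copy when a block is
/// resized or defragmented (offsets/size in bytes).
pub struct PendingCopy {
    pub src_offset: u64,
    pub dst_offset: u64,
    pub size: u64,
}

/// Hypothetical callback trait: the allocator never performs copies
/// itself; it hands them to the application, which records them on its
/// own command stream at an appropriate point in the frame.
pub trait CopyScheduler {
    fn schedule_copies(&mut self, copies: &[PendingCopy]);
}
```

This keeps GPU scheduling entirely in the application's hands, which matches the requirement that copies happen outside the library, and the same hook could later serve defragmentation.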

@nical
Contributor Author

nical commented Jun 24, 2024

Sorry, I realize it was unclear in my initial explanation: I am proposing to grow the block size (as in start with small blocks but the next ones we allocate get larger and larger up to a user-specified limit), without actually growing existing allocations in the "std::Vec" sense.

Things should remain fairly simple that way.

@Jasper-Bekkers
Member

> Sorry, I realize it was unclear in my initial explanation: I am proposing to grow the block size (as in start with small blocks but the next ones we allocate get larger and larger up to a user-specified limit), without actually growing existing allocations in the "std::Vec" sense.
>
> Things should remain fairly simple that way.

I realized I never got back to you on this, but this sounds fine IMHO and I would be open to these changes.

Could you explain a bit why this particular pattern is useful to you? Mostly to satisfy my own curiosity, since it's not something you see often (at least I haven't).

@nical
Contributor Author

nical commented Jul 8, 2024

Sure! The idea is that the wgpu implementation powering Firefox should have a low-ish initial memory footprint per device and grow as needed. The browser is a bit particular in that there is an expectation that it shouldn't consume more resources than it needs, and that it should be fine to keep a browser running alongside whatever other apps are running on the system. The browser may also create a lot of devices that don't allocate much (for example, when browsing Shadertoy).

I suppose that the desire for low initial memory footprint also generalizes to non-game app/UI toolkits.

We can support that by simply using a small block size, but when running a serious workload (like a Unity game exported to the web), we'd still like to organically grow the block size to avoid the worst performance aspects of having only small memory blocks in a workload that uses a large amount of VRAM.

The way we want to expose this in wgpu is to have per-device "performance" and "memory usage" presets. The browser would always opt into favoring memory usage, but games would typically stay on the performance preset, which would be gpu-allocator's current default.
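The preset idea above could be sketched as follows. The enum, method, and concrete byte values are illustrative assumptions, not wgpu's actual API; the only property taken from the thread is that the performance preset keeps a fixed (min == max) block size while the memory-usage preset starts small and grows.

```rust
/// Hypothetical per-device preset, not a real wgpu type.
pub enum MemoryPreset {
    /// Fixed, large blocks: the current gpu-allocator default behavior.
    Performance,
    /// Start small and double toward the large size (browser use case).
    LowMemoryFootprint,
}

impl MemoryPreset {
    /// (minimum, maximum) device memblock sizes in bytes.
    /// The specific sizes here are illustrative placeholders.
    pub fn memblock_size_range(&self) -> (u64, u64) {
        match self {
            // Equal min and max preserve today's fixed-size behavior.
            MemoryPreset::Performance => (256 << 20, 256 << 20),
            MemoryPreset::LowMemoryFootprint => (4 << 20, 256 << 20),
        }
    }
}
```

Both presets share the same maximum, so a heavy workload eventually reaches the same block size either way; the presets differ only in how much memory an idle or lightweight device pays up front.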
