# Pipeline Caching (between program executions) #5293
## Comments
Note that pipeline caches have the potential for an even greater impact for those who are using the d3d12 backend and are stuck with FXC for one reason or another.

I don't think that a pipeline caching API should involve the file system. For an application that wants to tighten its process sandboxing, having low-level graphics middleware assume it has file system access is a bit of a nightmare. More generally, the app author should be in control of how the cache is stored. Instead, the API should take/produce a binary blob (which is what the file-system version would have to work with under the hood anyway).

If an application author is not consistently picking the same adapter and device, then caching won't work regardless of the backend. So loading a cache built from the wrong backend is equivalent to loading a cache built from the wrong device or driver version.
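As an illustration of this blob-first design, here is a minimal sketch of how an application could own the storage side itself, keyed by adapter identity so a blob from the wrong device is never reused. Only `wgpu::AdapterInfo` is existing API here; the pipeline-cache API it would feed is still hypothetical:

```rust
use std::path::PathBuf;

// Key the cache file by backend, vendor and device, so a blob built on one
// adapter/driver combination is never handed to another.
fn cache_path(info: &wgpu::AdapterInfo) -> PathBuf {
    PathBuf::from(format!(
        "pipeline_cache_{:?}_{:04x}_{:04x}.bin",
        info.backend, info.vendor, info.device
    ))
}

fn load_cache_blob(info: &wgpu::AdapterInfo) -> Option<Vec<u8>> {
    std::fs::read(cache_path(info)).ok()
}

fn store_cache_blob(info: &wgpu::AdapterInfo, blob: &[u8]) {
    // Ignore write failures: the cache is only a best-effort optimisation.
    let _ = std::fs::write(cache_path(info), blob);
}
```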
Interesting note on d3d12, although I'm not likely to implement the support on that backend myself. So perhaps the correct key for the cache selection isn't only the backend, but also the device and vendor.

As I mentioned on Matrix, I agree that having it assume a filesystem isn't good, which is why I mentioned variation 4 (which I added slightly later, hence not being the primary option). So to summarise, my proposed API would be something like:

```rust
use std::fmt::{self, Display};

pub enum PipelineCacheKey {
    Vulkan { device_id: u32, vendor_id: u32 },
    // ...
}

impl Display for PipelineCacheKey {
    fn fmt(&self, fmt: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            PipelineCacheKey::Vulkan { device_id, vendor_id } => {
                write!(fmt, "vulkan_{vendor_id}_{device_id}")
            }
        }
    }
}

impl Device {
    pub unsafe fn create_pipeline_cache<E>(
        &self,
        data: impl FnOnce(PipelineCacheKey) -> Result<Option<Vec<u8>>, E>,
    ) -> Result<PipelineCache, PipelineCacheCreationError<E>>;
}

impl PipelineCache {
    pub fn get_data(&self, device: &Device) -> (Vec<u8>, PipelineCacheKey);
}
```
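For illustration, driving that proposed API from an application might look like the following; everything named here comes from the sketch above, not from current wgpu:

```rust
// Hypothetical usage of the proposed API: the closure supplies the stored
// blob (if any) for the key wgpu chooses, and the app persists it later.
let cache = unsafe {
    device.create_pipeline_cache(|key: PipelineCacheKey| {
        // The Display impl yields a stable per-backend/per-device name.
        match std::fs::read(format!("{key}.cache")) {
            Ok(bytes) => Ok(Some(bytes)),
            Err(e) if e.kind() == std::io::ErrorKind::NotFound => Ok(None),
            Err(e) => Err(e),
        }
    })
}?;

// ... create pipelines against `cache` ...

// On shutdown (or periodically), write the possibly-grown cache back out.
let (bytes, key) = cache.get_data(&device);
std::fs::write(format!("{key}.cache"), bytes)?;
```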
**Is your feature request related to a problem? Please describe.**
In Vello on Android, the time taken to start the app up is unacceptably long (~2 seconds after linebender/vello#455). This is an order of magnitude longer than users would expect app startup to take. The vast majority of this startup time is spent in calls to `wgpu::Device::create_compute_pipeline`. This is because each shader is being compiled to device microcode from scratch on each run.

**Describe the solution you'd like**
We would like for `wgpu` to provide an unsafe[^1] API for `VkPipelineCache` objects, to allow reusing device microcode compilation between executions. My proposed API would be for the application to provide a path to a directory for this cache to be stored in/retrieved from.

When creating a pipeline cache object, I would expect wgpu to attempt to read from the file `wgpu_vulkan.cache` (or an alternative name) when initialised with a Vulkan backend, then create a pipeline cache from this value. This would also perform sanity checks on the device version, probably by including an additional custom header (as discussed in *Creating a Robust Pipeline Cache with Vulkan*). A method would then also be added to the cache to write its data back to disk.
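The linked article's approach boils down to prepending a small device-identity header and discarding the blob on any mismatch. A sketch of what such a header might contain; the field choices here are illustrative, not a committed wgpu format:

```rust
// Illustrative custom header, prepended to the raw VkPipelineCache data.
// On load, any mismatch against the current physical device means the
// blob is discarded and an empty cache is created instead.
#[repr(C)]
struct PipelineCacheHeader {
    magic: [u8; 4],       // e.g. b"WGPU", to catch unrelated files
    header_version: u32,  // bumped whenever this layout changes
    vendor_id: u32,       // VkPhysicalDeviceProperties::vendorID
    device_id: u32,       // VkPhysicalDeviceProperties::deviceID
    driver_version: u32,  // VkPhysicalDeviceProperties::driverVersion
    cache_uuid: [u8; 16], // VkPhysicalDeviceProperties::pipelineCacheUUID
    data_len: u64,        // length of the cache payload that follows
}
```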
**Describe alternatives you've considered**

Variations on the proposed solution:

- Store `PipelineCacheCreateInfoBuilder::initial_data` directly in the file. This would leave it up to applications to implement any sanity checking, beyond those provided by the drivers.
- Take a `&[u8]`, which will be passed as-is to `PipelineCacheCreateInfoBuilder::initial_data`.
- Take a callback, such as `fn(impl FnOnce(Backend) -> Result<Vec<u8>, E>) -> Result<Vec<u8>, E>` or `impl FnOnce(Backend, Vec<u8>) -> R`. This would alleviate most concerns, and is probably the right approach.

Alternative solutions:
- wgpu could automatically implement this pipeline caching, without requiring manual implementation from user apps. I believe this is likely to be untenable for several reasons.
- wgpu could allow us to pass an `ash::vk::PipelineCache` to `wgpu::Device::create_compute_pipeline` - either in the `ComputePipelineDescriptor` or through a new method. I suspect this is untenable, as it would specialise the API to the Vulkan backend.
- wgpu could allow us to pass an `ash::vk::PipelineCache` to `wgpu::hal::vulkan::Device::create_compute_pipeline`, and allow creating a `wgpu::ComputePipeline` from a `wgpu::hal::vulkan::ComputePipeline`. I don't know why this second aspect is currently not permitted.

**Additional context**
I have not researched other backends' caching APIs.
I have implemented an experiment to justify this requirement. The code of that experiment can be found in linebender/vello#459, which depends on #5292.
On my device (a Google Pixel 6), this reduces pipeline creation time (on non-first boots) from ~2s to ~30ms, and empirically makes the test app launch as quickly as I'd expect an app to launch.
I suggest that the full API for the pipeline cache could look something like:
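A minimal sketch of the shape I have in mind, consistent with the directory-based proposal above (all names here are illustrative):

```rust
use std::path::Path;

impl Device {
    /// Create a pipeline cache stored under `dir` (e.g. `dir/wgpu_vulkan.cache`
    /// on the Vulkan backend), validating any existing file against the
    /// current adapter before handing it to the driver.
    ///
    /// # Safety
    /// The driver may not validate the cache contents; see the footnote on
    /// unsoundness below.
    pub unsafe fn create_pipeline_cache(&self, dir: &Path) -> PipelineCache;
}

impl PipelineCache {
    /// Write the current cache contents back into the directory this cache
    /// was created from.
    pub fn store(&self) -> std::io::Result<()>;
}
```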
Future possibilities could allow using APIs such as `vkMergePipelineCaches`.

I am willing to implement this in wgpu, but need guidance around how adding new resources should look, as well as the expected API.
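For reference on that merge direction, Vulkan's merge entry point is exposed by ash as `Device::merge_pipeline_caches`; a sketch of how a future hal-level helper might call it (`raw_device`, `dst`, and `srcs` are placeholders, not existing wgpu items):

```rust
use ash::vk;

// Fold several per-thread caches into one before serialising it, which is
// the use case vkMergePipelineCaches exists for.
unsafe fn merge_caches(
    raw_device: &ash::Device,
    dst: vk::PipelineCache,
    srcs: &[vk::PipelineCache],
) -> Result<(), vk::Result> {
    raw_device.merge_pipeline_caches(dst, srcs)
}
```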
**Footnotes**

[^1]: This unsoundness is also unavoidable on the part of any programs using this feature of wgpu, but that's a tradeoff some users (including Vello) are able to justify.