Task graph [1/10]: resource synchronization state tracking #2540
This is the first patch on the road to a task graph, the replacement for the current synchronization. It includes the resource state tracker and the definition of a task. Some form of resource state tracker is at the heart of all task graph implementations I've seen (except Bevy's, as wgpu does synchronization internally) and is what everything else should be built upon. Similarly, a task graph can't function without inversion of control: a task must have a callback that records commands into a command buffer and/or does host accesses. This is what the Task trait defines.
Documentation and testing are sparse at the moment because it's possible that things will change drastically over the coming days.
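To make the two building blocks concrete, here is a minimal sketch of how a state tracker and a task callback could fit together. It is not the code in this patch: every name in it (ResourceId, Access, Barrier, StateTracker, CommandBuffer, run, and this simplified Task trait) is hypothetical, the command buffer is a stand-in that only records strings, and the hazard rule is reduced to "anything other than read-after-read needs a barrier".

```rust
use std::collections::HashMap;

/// Hypothetical resource handle (an assumption, not vulkano's ID type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ResourceId(pub u32);

/// How a task touches a resource, reduced to the bare minimum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum Access {
    Read,
    Write,
}

/// A barrier the tracker decided is needed before an access.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Barrier {
    pub resource: ResourceId,
    pub from: Access,
    pub to: Access,
}

/// Global state tracker: remembers the last access to every resource.
#[derive(Default)]
pub struct StateTracker {
    last_access: HashMap<ResourceId, Access>,
}

impl StateTracker {
    /// Records an access and returns the barrier it requires, if any.
    pub fn access(&mut self, resource: ResourceId, to: Access) -> Option<Barrier> {
        // The first access to a resource has nothing to synchronize against.
        let from = self.last_access.insert(resource, to)?;
        // Read-after-read is not a hazard; every other combination is.
        if from == Access::Read && to == Access::Read {
            return None;
        }
        Some(Barrier { resource, from, to })
    }
}

/// Stand-in for a command buffer that just records command names.
#[derive(Default)]
pub struct CommandBuffer {
    pub commands: Vec<String>,
}

/// Simplified, hypothetical task definition.
pub trait Task {
    /// Accesses are declared up front so the executor can synchronize them.
    fn accesses(&self) -> Vec<(ResourceId, Access)>;
    /// Callback invoked by the executor: this is the inversion of control.
    fn execute(&self, cbf: &mut CommandBuffer);
}

/// Runs tasks in order, inserting barriers before each task's callback.
pub fn run(tasks: &[&dyn Task], tracker: &mut StateTracker, cbf: &mut CommandBuffer) {
    for task in tasks {
        for (resource, access) in task.accesses() {
            if let Some(b) = tracker.access(resource, access) {
                let line = format!("barrier {:?}: {:?} -> {:?}", b.resource, b.from, b.to);
                cbf.commands.push(line);
            }
        }
        task.execute(cbf);
    }
}

struct Upload(ResourceId);

impl Task for Upload {
    fn accesses(&self) -> Vec<(ResourceId, Access)> {
        vec![(self.0, Access::Write)]
    }
    fn execute(&self, cbf: &mut CommandBuffer) {
        cbf.commands.push("copy_buffer".to_string());
    }
}

struct Draw(ResourceId);

impl Task for Draw {
    fn accesses(&self) -> Vec<(ResourceId, Access)> {
        vec![(self.0, Access::Read)]
    }
    fn execute(&self, cbf: &mut CommandBuffer) {
        cbf.commands.push("draw".to_string());
    }
}

/// Writes a buffer in one task and reads it in the next.
pub fn demo() -> Vec<String> {
    let buffer = ResourceId(0);
    let (upload, draw) = (Upload(buffer), Draw(buffer));
    let mut tracker = StateTracker::default();
    let mut cbf = CommandBuffer::default();
    run(&[&upload, &draw], &mut tracker, &mut cbf);
    cbf.commands
}
```

In this sketch the tasks never talk to the tracker or emit barriers themselves. They only declare what they access and record their own commands, and the executor places the write-to-read barrier between the upload and the draw.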
Design goals
Going forward I would like to establish some design goals for following work on the task graph. In order of importance:
The problems of the current synchronization
First of all, it should be noted that vulkano's synchronization dates back to when Vulkan was in its infancy and no one knew how best to abstract the enormous amount of detail that comes with an API as low-level as this, so it's only natural that the current system has its problems. It's also, I believe, the last remaining piece of tech debt, at least in a public API, which is why a rewrite is in order.
The current synchronization falls short on all of the above design goals. In my opinion, the single biggest factor in all its problems is that the synchronization is immediate-mode and just-in-time. To quote Hans-Kristian Arntzen in his "Render graphs and Vulkan — a deep dive" article:
That sums it up very nicely. The current system is very hard to use correctly, and many common use cases are not possible to express at all, because GpuFutures must be chained in just the right way (because everything is JIT) and your usage of the API is validated rather than incorrect usage being ruled out by design. The error messages are a constant source of frustration and very hard to debug. When there are no errors, there are many instances of the synchronization working incorrectly and causing validation errors instead, or plain data races, because GpuFuture was never safe either. And while it would be possible to fix some of these issues, it would still be a subpar system, as summed up in the above quote. There are also many glaring issues in terms of performance, both on the host and the device, for instance because of the resource tracking that each command buffer and descriptor set does on each recording, all the clones and allocations going along with that, all the locking of resources, etc.
Enter the task graph
As mentioned, the fundamental building blocks of any task graph are a global state tracker and tasks that have a callback to record commands. This means that:
GpuFuture and AutoCommandBufferBuilder are missing is much easier as well:
Prior art
Changelog: