Direct3D 12 Rendering Driver (SPIRV-Cross + DXC approach) #64304
Do you have a plan to try Godot using Hololens Emulator? |
It would be interesting, but I have other priorities on my horizon. |
@Lucrecious yeah |
I guess it's time to get proper For the reference, using https://github.com/mstorsjo/llvm-mingw tool chain, it's possible to build current |
If someone has an HDR display (preferably OLED or miniLED) and Windows 11, it might be worth trying to enable Auto HDR on Godot while using the Direct3D 12 renderer. This could address godotengine/godot-proposals#1004 on Windows until there is a native implementation 🙂 |
Are additional SDKs required for compilation? (Windows 11, VS 2019) Errors
|
Is there a particular feature set you need at which point you'd consider that translator mature? |
Hmm, UWP is deprecated, isn't it? |
@jenatali, on the one hand, it's stability; last time I checked (months ago) I got the impression that the functions I needed were still changing a lot in the recent history of the repo. On the other hand, it's really about having enough features to replace one or both of SPIRV-Cross and DXC-LLVM's IR handling. The latter (IR parse/serialize) may never be a thing, I guess. |
@RandomShaper Understood. If you do take a closer look, and find it lacking for whatever reason, please reach out and let me know what's missing or what you'd need to make it viable for this use case. |
@DarkMessiah, my first impression is that the |
@jenatali, thank you very much. I'll do so as soon as possible. |
For new ones joining this discussion to get other related perspectives. It is great to see @jenatali is here to offer his experience.
|
Wouldn't adding DX restrict the platforms that games using it can target? |
I thought Godot already supported UWP? Or is there something I've missed? |
@John-Gdi, I believe that's already fixed. But I have first to fix a compile error before you can try again. |
@jenatali, I've created a couple of issues (feature requests) in the Mesa repo so I can track the progress of the two remaining features without having to check periodically when they have been added. |
@lmurray, please test again. I've made a few changes to optimize the usage of the descriptor heap. I have to admit I wasn't entirely happy with the way it was being done before. Now there's a recycling mechanism, plus the determination of how much space is needed in the heaps has been made more accurate. |
@RandomShaper Thanks for addressing this. I'll test out the changes when I'm next available which is looking like sometime the week after next. |
Superseded by #70315. |
I've finally managed to test the new changes, sorry for the delay. I can confirm that the descriptor heap changes have fixed the issues that I was encountering. As I'm going to be keeping this DXC approach in my project (I trust it more) I won't be able to test the Mesa branch going forward, however I did backport some of the Mesa branch changes locally back into this branch. The SRV/UAV ambiguity feature and the general logic changes that were compatible with DXC seem to work fine in my project although I honestly don't know how to properly test them with the shader combinations that I have available (it's a 2D project). The new descriptor table strategy used in the Mesa branch doesn't seem to be compatible with DXC so I'll live with using potentially twice the amount of root signature space compared to the other branch. |
This is very interesting. Thanks for the update. In order to be able to use what the NIR approach does and so simplify the structure and population of the root signature, you would need to force your own binding assignments before the shader reaches DXC. Maybe the easiest would be to get the set and binding from SPIR-V and inject them as In fact, I was about to do something like that since I hadn't thought of such kind of simplification until @reduz hinted the possibility to me, but then we found that NIR was already viable, so I didn't make it here in the end. May I ask why you need a D3D12 renderer for your project? |
@RandomShaper Thanks for the pointers. If I ever need to optimize I'll consider that approach. In regards to your question, I am a commercial game dev targeting Windows and the big three consoles only, and as I have a personal preference for using "official" tools and libraries where possible to do so I intend to ship with D3D12 (maybe exclusively) on Windows instead of shipping Vulkan. This personal preference is also why I'm preferring DXC over Mesa. My project isn't that intense visually so I don't really need the utmost performance and can afford to leave some on the table, but I do need something robust so I don't end up wasting time debugging exotic micro-optimizations when a deadline looms near. Since I've already tested this branch thoroughly over the past four months and as it works fine there is no real need for me to replace it with something else entirely that is known to be more experimental and less battle tested. |
Superseded by #70315.
Direct3D 12 Rendering Driver (via SPIRV-Cross + DXC)
This is a feature-complete Direct3D 12 RenderingDevice implementation for Godot Engine. It works as a drop-in replacement for the Vulkan one. It is selectable in the project settings as an alternative to use on Windows.
By supporting Direct3D 12, Godot gains support for multiple new platforms, such as:
This PR includes some preparatory changes, to uncouple the RenderingDevice from Vulkan, that is, abstracting the modern Godot rendering architecture from whatever rendering API is used. Moreover, instead of a monolithic commit, the code of the driver itself is split into three, much more manageable commits.
Highlights
Performance
Depending on the complexity of the scene, effects used, etc., this first version of the renderer generally performs worse than the Vulkan one. In some tests, D3D12 has not been able to deliver more than 75% of the Vulkan frames per second. In others, D3D12 has been able to outperform Vulkan by a small margin. Performance issues will be ironed out over time.
Homogeneity
The D3D12 rendering driver has been written taking the Vulkan one as a basis and keeping as much as possible from the original. This effort gives two-fold benefits: on the one hand, the overall structure of the code files, including auxiliary structures and other elements, is very similar, which makes maintenance easier; on the other hand, both renderers are more similar at the functional level. An example of this is that the D3D12 renderer will be as picky as the Vulkan one when it comes to validation and error checking, even in areas where the Microsoft API wouldn't impose such strict constraints.
Specialization Constants
In Vulkan it is possible to create multiple variations of a pipeline with different values for certain parameters that end up as compile-time constants in the shader generated under the hood. Those parameters are called specialization constants.
In Direct3D there's no counterpart of that mechanism. However, Godot rendering relies on it for some of its shaders. A way to have specialization constants in the Direct3D/DXIL world had to be researched. It was finally found and is used in this code. The technique is explained in this Twitter thread: https://twitter.com/RandomPedroJ/status/1532725156623286272.
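Whatever mechanism bakes the values into the shader, the engine side ends up needing one compiled variant per distinct set of specialization constant values. The following is a minimal sketch of that bookkeeping, not the technique from the thread and not code from this PR. All names (SpecConstant, VariantCache, the specialize callback) are hypothetical, and the callback stands in for whatever step actually produces the specialized bytecode.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

// Hypothetical types for illustration; none of these names come from the PR.
struct SpecConstant {
    uint32_t id;    // Constant ID as declared in the shader.
    uint32_t value; // Value to bake in for this variant.
    bool operator<(const SpecConstant &o) const {
        return id != o.id ? id < o.id : value < o.value;
    }
};

using SpecKey = std::vector<SpecConstant>;
using Bytecode = std::vector<uint8_t>;

class VariantCache {
public:
    // `specialize` is an assumed callback that returns bytecode with the
    // given values baked in; how it does so is outside this sketch.
    explicit VariantCache(std::function<Bytecode(const SpecKey &)> specialize) :
            specialize_(std::move(specialize)) {}

    // Returns the variant for `key`, building it only on first request.
    const Bytecode &get(SpecKey key) {
        std::sort(key.begin(), key.end()); // Make the key order-independent.
        auto it = cache_.find(key);
        if (it == cache_.end()) {
            it = cache_.emplace(key, specialize_(key)).first;
            ++misses_;
        }
        return it->second;
    }

    std::size_t misses() const { return misses_; }

private:
    std::function<Bytecode(const SpecKey &)> specialize_;
    std::map<SpecKey, Bytecode> cache_;
    std::size_t misses_ = 0;
};
```

Sorting the key means that two requests listing the same constants in a different order share one variant, which keeps the number of specialized shaders down.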
Code Comments
To avoid making this PR description unnecessarily long, the reader is advised to find additional insight in the comments.
Assertions
Given that some data crosses many stages from its inception to where it's finally used, the code is full of dev-only checks that ensure the sanity of many different data structures at different points in time. The expectation is that this will make it easier to catch bugs, even subtle ones, in areas of high complexity.
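As a rough illustration of what a dev-only check looks like, here is a minimal sketch. DEV_CHECK, DescriptorRange and range_end are hypothetical names made up for this example (Godot has its own error macros, which are what the actual code uses), and DEV_ENABLED is assumed to be defined only in developer builds.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical macro for illustration only.
// Assumption: DEV_ENABLED is defined by the build system in dev builds.
#ifdef DEV_ENABLED
#define DEV_CHECK(cond)                                              \
    do {                                                             \
        if (!(cond)) {                                               \
            std::fprintf(stderr, "%s:%d: check failed: %s\n",        \
                    __FILE__, __LINE__, #cond);                      \
            std::abort();                                            \
        }                                                            \
    } while (0)
#else
#define DEV_CHECK(cond) ((void)0) // Compiled out: `cond` is not evaluated.
#endif

// Hypothetical structure: a run of descriptors inside a heap.
struct DescriptorRange {
    unsigned first;
    unsigned count;
};

// Returns the index one past the end of the range. In dev builds it also
// verifies that the range is non-empty and fits in the heap.
inline unsigned range_end(const DescriptorRange &r, unsigned heap_size) {
    (void)heap_size; // Only read by the checks below.
    DEV_CHECK(r.count > 0);
    DEV_CHECK(r.first + r.count <= heap_size);
    return r.first + r.count;
}
```

Because the checks vanish outside dev builds, they can be placed liberally along the path the data takes without costing anything in release binaries.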
Known Issues
Compilation & Distribution
d3d12=yes DXC_PATH=<...> plus PIX_PATH=<...> if you want PIX.
NOTE: The build process will copy dxcompiler.dll and dxil.dll from the bin/x64/ directory in the DXC zipfile to the Godot binary directory. D3D12-enabled Godot packages for distribution to end users must include those files, both for the editor and games.
Future Work
Besides fixing the known issues described in another section, there are many options for potential improvement, the most important of which are described below. The code also has a number of TODO items that refer to these and other, generally smaller, potential enhancements or nice-to-haves.
Render Pass API
The D3D12 renderer uses what in the Vulkan world is called dynamic rendering. In other words, it doesn't use render pass —and subpass— APIs. This was done to make things simpler, but came with a couple of downsides.
Actionable item: Re-work render pass management with the proper APIs, which may be needed to squeeze performance from certain kinds of devices.
Enhanced Barriers
Direct3D 12 was released with a GPU work synchronization mechanism based on resource barriers. In short, they are not nearly as fine-grained as Vulkan's memory and pipeline barriers, the biggest consequence being comparatively worse performance. Microsoft later added the so-called enhanced barriers to Direct3D, which are the same as what Vulkan has. Recent GPU drivers and Windows versions already support them.
Actionable item: Re-work synchronization based on enhanced barriers, which will give more performance and make the code more similar to the one in the Vulkan renderer.
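For context, the original resource barriers describe a transition of a resource from one state to another, so a renderer typically tracks the current state of each resource and emits a barrier only when the required state changes. The following is a simplified sketch of that bookkeeping under stated assumptions: ResourceState, Transition and StateTracker are hypothetical stand-ins, not D3D12 types and not the code in this PR.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-ins; these are not D3D12 types.
enum class ResourceState : uint8_t { Common, RenderTarget, ShaderResource, CopyDest };

struct Transition {
    uint32_t resource; // Opaque resource ID.
    ResourceState before;
    ResourceState after;
};

class StateTracker {
public:
    // Records a transition only if the resource is not already in `wanted`.
    // Resources seen for the first time are assumed to start in Common.
    void require(uint32_t resource, ResourceState wanted) {
        auto it = states_.try_emplace(resource, ResourceState::Common).first;
        if (it->second != wanted) {
            pending_.push_back({ resource, it->second, wanted });
            it->second = wanted;
        }
    }

    // Returns the batched transitions and clears the pending list.
    std::vector<Transition> flush() {
        std::vector<Transition> out;
        out.swap(pending_);
        return out;
    }

private:
    std::unordered_map<uint32_t, ResourceState> states_;
    std::vector<Transition> pending_;
};
```

The whole resource moves from one state to the next here, with no separate notion of pipeline stage or access scope, which is the coarseness that enhanced barriers are meant to address.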
More Reasonable Dependencies
Currently, this is using SPIRV-Cross for shader translation to HLSL and an important chunk of DXC for the specialization constants hack. When the Microsoft-provided support for DXIL in Mesa is mature (when checked for the purpose of this work, it wasn't yet), we may be able to use it, via NIR, instead of those two dependencies. Microsoft is donating engineering time to Mesa for this effort, so we hope it will soon be in a usable state for us.
Actionable item: Watch the status of DXIL in Mesa and replace SPIRV-Cross and the DXC source code as soon as feasible.
Deprecate Texture Aliasing
In Vulkan it is possible to tell upfront which formats a texture will be interpreted as, and it'll just work. In Direct3D 12 there was traditionally no way to do the same. Therefore, there are limitations on which reinterpretations one can do.
Godot needs to do two of them that are illegal in D3D12: write as R32 and read as R9G9B9E5, and write as R16 and read as R4B4G4A4. The Direct3D 12 renderer code works around that limitation by abusing texture aliases, which, according to some tests across different GPUs, seems to work fine in practice.
The legal approach would be to make copies of the textures when the time to read comes. However, that still won't work for the R4B4G4A4 case. Therefore, the aliasing workaround is used in every case for now.
Luckily, Direct3D has recently added a new API, CreateCommittedResource3(), that provides the same nicety as Vulkan, but it's still not widely available and, at the time of this writing, the D3D12 Memory Allocator library still doesn't support it (there's a PR, though: GPUOpen-LibrariesAndSDKs/D3D12MemoryAllocator#44).
Thanks go to Matías N. Goldberg, who was of great help in this investigation.
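To make the first reinterpretation concrete: the same 32 bits that are written as a single R32 value are later read as three colour channels with a shared exponent. The sketch below decodes such a value on the CPU. It assumes the usual layout of the shared-exponent format (three 9-bit mantissas with red in the lowest bits, a 5-bit exponent in the top bits, exponent bias 15); decode_rgb9e5 and Rgb are hypothetical names, not part of this PR.

```cpp
#include <cmath>
#include <cstdint>

struct Rgb {
    float r, g, b;
};

// Decodes a 32-bit shared-exponent colour.
// Assumed layout: bits 0-8 red mantissa, 9-17 green, 18-26 blue,
// bits 27-31 shared exponent with a bias of 15.
inline Rgb decode_rgb9e5(uint32_t packed) {
    const uint32_t r = packed & 0x1FFu;
    const uint32_t g = (packed >> 9) & 0x1FFu;
    const uint32_t b = (packed >> 18) & 0x1FFu;
    const int e = static_cast<int>(packed >> 27);
    // Each channel is mantissa * 2^(e - bias - mantissa_bits).
    const float scale = std::ldexp(1.0f, e - 15 - 9);
    return { r * scale, g * scale, b * scale };
}
```

For example, an exponent field of 16 gives a scale of 2^-8, so a red mantissa of 256 decodes to 1.0 and a green mantissa of 128 to 0.5.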
Actionable item: Add a support check and prefer CreateCommittedResource3() to the aliasing hack where possible.
Further Homogeneity
Actionable item: Merge as many as possible of the elements that Vulkan and D3D12 have in common (staging buffer, static arrays of data format names, etc.). This should reduce the codebase size and make it easier to maintain (and eventually add more platforms).
More
Just to make it complete, there are a few more potential improvements that may or may not be already in a TODO in the comments:
More sensible use of the shared heap (i.e., track which resources/samplers are already bound and reuse somehow). Update: Done.
[...] p_post_barrier parameters as a hint somehow?)
[...] fsr_upscale.h directly, given the appropriate defines.
[...] material_samplers) to static samplers and/or descriptors to root descriptors when possible.
[...] D3D12_FEATURE_DATA_ARCHITECTURE being UMA, use WriteToSubresource() instead of memcpy().
[...] D3D12_BUFFER_SRV_FLAG_RAW for CBV, or another usage.
🍀 This work has been financed and kindly donated to the Godot Engine project by W4 Games. 🍀