-
-
Notifications
You must be signed in to change notification settings - Fork 10.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Backends: DX9: programmable rendering pipeline version implement (Dear @Demonese) #3844
Conversation
It's the same problem as BGRA color packing, here is D3DFMT_A8R8G8B8, so you need the IMGUI_USE_BGRA_PACKED_COLOR. To create a staging texture to upload texture is necessary, it can change the linear texture to optimal texture (tiling texture). |
|
You found a bug here since imgui support Color Emoji. |
5. "code clean"Remove unused DirectInput header 6. Texture pixel formatif defined IMGUI_USE_BGRA_PACKED_COLOR, using BGRA32 pixel format to create font altas texture. #ifndef IMGUI_USE_BGRA_PACKED_COLOR
D3DFORMAT texture_format = D3DFMT_A8B8G8R8; // RGBA32, be aware that this format is not widely supported on Windows platfrom
#else
D3DFORMAT texture_format = D3DFMT_A8R8G8B8; // BGRA32
#endif 7. New Direct3D9 backed implement: programmable rendering pipelineWhy? |
backends/imgui_impl_d3d9.cpp
Outdated
if (D3D_OK != g.d3d9Device->CreateVertexDeclaration(layout, &g.d3d9InputLayout)) | ||
return false; | ||
|
||
// Create vertex shader (vs_3_0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use vs_1_1 instead of vs_3_0, also ps_2_0 instead of ps_3_0.
There are not using Shader Model 3.0 Features here.
Also you can see my assembly shader code that more readable and editable without shader compiling.
https://github.com/metarutaiga/xxGraphic/blob/master/xxGraphicD3DAsm.cpp
If the vertex layout is not match in the fixed-function, we can define below in IMGUI_USER_CONFIG.
But we can't set Z of ImDrawVert without modified imgui, we must use custom allocator to clear it.
And set the function before initialing imgui.
|
Hello, There are a lots of things here so it's going to be a bit difficult to catch on everything.
OK, will merge.
I think this is not desirable. We don't want to "silently fail". If the texture or device are not set there is a problem with your setup. You can perform the same test before calling
I'd like you to confirm you are 1000% sure this makes a difference. Could you make a video showing with/without the bug and with only this code difference?
I would like this to come with proof/comparison of a net effect in a recent driver + to be implemented without the COM helper. We will however transition to a dynamic atlas this year and therefore I don't think this part will be as meaningful.
OK will merge, But your commit has other/unrelated changes there, so I manually applied it over another commit. I also applied the changes in the three
This was indeed an issue but instead of merging your solution I have used 23ab497 +
That's not worth the extra backend and maintenance effort IHMO, so I don't think I am going to merge this. With the 3 pushed commits this should conflict. I'm particularly interested in (3). Thank you. |
Thanks @ocornut .
I understand. But, ImGui_ImplDX9_RenderDrawData didn't return any value, nor any other API to do validate. Should we export a API to validate device, or return result in ImGui_ImplDX9_RenderDrawData?
This is a complex problem... Just like #3857 , maybe a d3d9 bug. In fact,
Ok, dynamic atlas often using staging texture to upload texture data, this still provide some performance improvement. I will move this commit to another branch. It's useful for projects that use older versions of Dear ImGui.
Ok, I will move this commit to another branch. It's useful for projects that use older versions of Dear ImGui. @Demonese It's time for you to get to work. |
7d600fd
to
c44587a
Compare
I think the state block is a lazy state backup / restore controller.
In my opinion, the all backends are just templates, and the all samples are just standalone programs. |
It should crash if the device is not setup.
I would accept this could be a bug and would half-blindly merge if you can show me a video running both versions with only that change in the code and absolutely nothing else. I just want to make sure you didn't make that change along with other changes and were confused by other side effects/bugs. I'm not sure I understand this comment:
I am not sure I understand this either. Your PR now still add a new file. My suggestion was that we can only add this if it's added to existing backend with minimal change. Thanks :) |
Next time I update this branch, they will move to imgui_impl_dx9.cpp. |
I honestly don't know either at this point. This is the first time I even hear of this DX9 feature. Since this is a relatively obscure feature (it seems?) I would suggest to add flags to |
c44587a
to
00c9d82
Compare
well, as you can see, It does have a little performance boost (from about 500us to 400us, during ImGui_ImplDX9_RenderDrawData) Outdated test resultstest code: #include <stdio.h>
#include <Windows.h>
LARGE_INTEGER freq, t1, t2;
int timer = 0;
double times = 0.0f;
// Render function.
void ImGui_ImplDX9_RenderDrawData(ImDrawData* draw_data)
{
::QueryPerformanceCounter(&t1);
...
::QueryPerformanceCounter(&t2);
times += (double)(t2.QuadPart - t1.QuadPart) * 1000000.0 / (double)freq.QuadPart;
timer += 1;
if ((timer % 60) == 0)
{
static char buffer[32];
snprintf(buffer, 32, "%.3f\n", times / 60.0);
OutputDebugStringA(buffer);
times = 0.0f;
}
}
bool ImGui_ImplDX9_Init(IDirect3DDevice9* device)
{
QueryPerformanceFrequency(&freq);
...
}
In my design, it can automatically enable support for programmable rendering pipelines without more consideration. (API compatible!) However, the disadvantage is that I modify a lot of places. I realized that if we were to develop dynamic font atlas feature, this pr would take a long time for me to merge others changes. Thanks! |
When g_IsShaderPipeline is true, we need to backup shader constant, and don't need to backup Fixed-Function transforms. imgui/backends/imgui_impl_dx9.cpp Lines 486 to 489 in 00c9d82
imgui/backends/imgui_impl_dx9.cpp Lines 531 to 534 in 00c9d82
When g_IsShaderPipeline is true, we can ignore fixed-function render states. Except for D3DRS_ALPHATESTENABLE and D3DRS_FOGENABLE(Shader Model 2.0). imgui/backends/imgui_impl_dx9.cpp Lines 313 to 319 in 00c9d82
It seems that D3DRS_SHADEMODE is lost. imgui/backends/imgui_impl_dx9.cpp Line 80 in 25679a4
Can you use DWORD instead of BYTE in the D3DASM bytecode, like SPIR-V bytecode in imgui_impl_vulkan.cpp? imgui/backends/imgui_impl_dx9.cpp Lines 436 to 439 in 00c9d82
imgui/backends/imgui_impl_dx9.cpp Line 56 in 00c9d82
imgui/backends/imgui_impl_vulkan.cpp Line 265 in 00c9d82
|
I don't understand what I am looking at. Both are 60 FPS. Try to disable vsync and compare? |
see bottom output (us), vsync doesn't affect the execution time |
D3DRS_SHADEMODE is using for vertex normal interpolation (fixed pipeline lighting), we didn't need it. (Yes, we've been using it the wrong way.)
Sorry... this is the compile result of VS2019. We didn't need to convert to DWORD. Although it's actually 4 bytes wide. |
Considering all going on in the drivers i wouldn’t trust a printf metrics omitting the swap time, would prefer to see unthrottled measurements
|
This only effect the rendering, not present. Rendering and present is different, so I only measured the rendering time.
I now do a extreme test: a large number of vertex test code:
#include <stdio.h>
#include <Windows.h>
LARGE_INTEGER freq, t1, t2;
int timer = 0;
double times = 0.0f;
// Render function.
void ImGui_ImplDX9_RenderDrawData(ImDrawData* draw_data)
{
::QueryPerformanceCounter(&t1);
...
::QueryPerformanceCounter(&t2);
times += (double)(t2.QuadPart - t1.QuadPart) * 1000000.0 / (double)freq.QuadPart;
timer += 1;
if ((timer % 60) == 0)
{
static char buffer[32];
snprintf(buffer, 32, "%.3f\n", times / 60.0);
OutputDebugStringA(buffer);
times = 0.0f;
}
}
bool ImGui_ImplDX9_Init(IDirect3DDevice9* device)
{
::QueryPerformanceFrequency(&freq);
...
} main.cpp ...
ImDrawList* renderer = ImGui::GetBackgroundDrawList();
for (float x = 0.0f; x < 2048.0f; x += 8.0f)
{
for (float y = 0.0f; y < 2048.0f; y += 8.0f)
{
renderer->AddRectFilled(ImVec2(x, y), ImVec2(x + 4.0f, y + 4.0f), IM_COL32(255, 255, 255, 128), 2.0f, 64);
}
}
... Now it will convert more vertices. It did have a little performace improvement. (if IMGUI_USE_BGRA_PACKED_COLOR defined) |
Submitting the contents != rendering. Present is the best way to guarantee everything has been fully processed. Unfortunately looking at the state of discussions and this PR I am not entirely sure I want to take on the burden of this extra complexity for a largely dead API. I still encourage you to make the best and neatest PR/patch and then it can be maintained over the DX9 backend and we will add links to your work, but I suggest that you host it. |
00c9d82
to
9e53531
Compare
1. BGRA color packing (also fix a bug)
DX9 using B8G8R8A8 color packing (D3DCOLOR). I notice that imgui_impl_dx9 do color packing converting but in core library there is a IMGUI_USE_BGRA_PACKED_COLOR macro.
So I think we can avoid color packing converting:
This change will also fix a bug (default dark theme):
2. Bug fix
I think we forget to validate device and texture before drawing. My game crashes in some cases because of this.
3.
The ordering of state recoveryI don't known why, but this did fix a bug in my game.(see commits)
4. Static texture
We can create static texture to improve performance.
Thanks Dear-ImGui