-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor postshader handling to apply for softgpu / D3D9 #12883
Refactor postshader handling to apply for softgpu / D3D9 #12883
Conversation
Doesn't work with DarkStalkers even after disabling it's compat hacks, through it does work with some other games with software renderer. |
As for shaders failing in D3d9 I tried to figure out what fails as the error is very generic and only says it failed to build and it seems that calling texture2D with texcoord changed by different values more than 9 times or so are causing the failure. If I try to call the function many times, but with same value, it doesn't fail, so I'm guessing the compiler optimizes that situation, but it can't optimize the calls to different coordinates. Not sure how, but maybe this could be detected and instead of generic failure error, show something like "This effect doesn't work under D3D9"? Edit: actually there's seems to be more to it, previously I tried by modifying bloom shader which calls it a lot, but I tried to reproduce it with simplest shader possible and there's something more to it, or at least my simplest attempt could still be optimized by drivers while it could not for bloom shader. Edit2: Actually the limit was much greater it seems, maybe I had some other effect active as I while I based it on bloom I did it by modifying my custom shader with other effects possibly active, the simplest effect to reproduce D3d9 problem:
Meh if we're using basic d3d9 it's limited to ps 2.0 and it's limitations are very poor ~ https://en.wikipedia.org/wiki/High-Level_Shading_Language |
Yeah, we indeed use ps 2.0:
Updated this with some of the things implemented, like the pixel changes. Now it should work with homebrew etc. and the RB swizzle thing might be better. -[Unknown] |
In Darkstalkers menu using post process shaders now causes: Also in other games using 5xBR shows artifacts in software renderer, kind of like the coords, pixel or texel size was wrong or at least less accurate. Edit: And yeah upgrade to PS 3.0 solved the broken shaders in D3d9, they will still fail if someone is limited to 2.0, but at GPU's limited to that, complex post process effects would be too heavy anyway. |
Ah, the no-16 bit support path was just broken, even before these changes. It should work now at least for the menu. -[Unknown] |
In-game still black screen and on the boot screen it shows enlarged upper left of the screen blinking, unfortunately non of that shows in dumps, but I noticed it's caused by the (speed)hack used for this game, without it, it works correctly, in fact better than in other games now as it's accurate there:). The lack of accuracy in other games persist through, try using any 5xBR variant on this dump with software renderer: |
Hm, I get artifacts (but much less) even on OpenGL (non-software) with that dump. Also, for some reason Direct3D 11 is not showing anything.. hmm. Latest master has this artifact in all backends, and Direct3D 11 works fine (with artifact.) The formula for pixelDelta in softgpu for that dump is:
But, that seems wrong? What does the buffer (texture) width have to do with pixelDelta? In hardware mode, fb textures will usually but not always be a multiple of 480 wide, and bufferWidth will almost always be 480. Maybe this just wasn't noticed, but was always wrong? It looks like hardware if I set it to 480. But #8016 says I'm wrong and that this is very intentional. The buffer is definitely 512 wide, though... note that the 481 thing is fixed separately. -[Unknown] |
I can improve the xBR in software renderer by changing the one u_pixelDelta to u_texelDelta, this will break it for hardware renderers so not sure what happens there. I also have a version which uses only u_pixelDelta to work in opposite and much more wasteful way to make use of the increased rendering res as well as support custom stretch values to perfectly match ports from older consoles which had different pixel sizes and a lot of blur introduced, that seems to work fine, but it was made for SFA3Max weird pixel sizes and needs to be tweaked. |
Whoa this change grew :) I'll try to review it tomorrow but looking very promising. The u_pixelDelta thing looks suspicious indeed. |
Heh, sorry. If needed, I can break out the Draw changes and maybe some of the non-postshader stuff. I haven't written any post shaders, but from the names, here's what I'd imagine those variables to do:
I suspect maybe texelDelta is wrong, and this motivated us to make pixelDelta also wrong to compensate? -[Unknown] |
See d6d1fe6 - this makes the uniforms match my expectation. I think it should merge separately, because the current behavior is in line with what it's been doing. I don't know what this would break... -[Unknown] |
Looks right - we could merge that one first actually. I think it's more likely to fix stuff than to break stuff, actually. |
There's a conflict, but tried building directly from your branch and it crashes in GPUCommon::ReInitialize during SoftGPU::SoftGPU, calling textureCache_->Clear(true); We need to initialize textureCache_ to nullptr, and check for it here to make sure we don't do anything with it if it's not set. Dunno if that really was introduced here though... Anyway, that should be trivial, and I'll look through code now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really a pretty great refactoring, and nice fixes to thin3d stuff!
GPU/Common/FramebufferCommon.cpp
Outdated
if (preferredPixelsFormat_ == Draw::DataFormat::B8G8R8A8_UNORM) | ||
ConvertRGB565ToBGRA8888(dst, src16, width); | ||
else | ||
ConvertRGBA565ToRGBA8888(dst, src16, width); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Heh, just noticed, we should rename ConvertRGBA565ToRGBA8888 to ConvertRGB565ToRGBA8888 (removed an A).
SetViewport2D(0, 0, pixelWidth_, pixelHeight_); | ||
DrawActiveTexture(x, y, w, h, (float)pixelWidth_, (float)pixelHeight_, u0, v0, u1, v1, uvRotation, flags); | ||
// DrawActiveTexture reverses these, probably to match "up". | ||
if (GetGPUBackend() == GPUBackend::DIRECT3D9 || GetGPUBackend() == GPUBackend::DIRECT3D11) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll clean up this later by fixing the shaders.
DarkStalkers is not working as-is (glitches, blackness) but I'll help fix that up once this is ready (which it very nearly is). |
Mainly happened when we had wide textures and split them up between GPUs.
Only some things moved over so far. FramebufferCommon does too much, we want to share it with softgpu without all the buffer management stuff.
Now this works on softgpu as well. Some hacks for backend differences...
This is also relevant where softgpu is forced, i.e. Darkstalkers.
This way they are not always "thin3d" when using Draw.
Only because we might have float a[2]; float b[2];...
To match/support postshaders.
Currently, Vulkan is not working properly and direct (RAM -> output) is not hooked up. But in general, it works.
Since they work now - at least, some of the shaders do.
I'm not sure we need it, but having a param used only on one backend is confusing.
This was causing a lost device.
Was silently ignoring them. Caused stretch in postshaders.
So that it can post-process correctly.
This makes it easier to do things with it.
Also centralize the pixel texture code while at it.
So that postshaders can work. Downside is, if we write a UI shader that requires 2.0, we may not notice... but I think the risk is not that high.
This also fixes rendering on Windows 7 Direct3D 11.
This may have been uploading twice before, so there may be a small performance improvement in some games now.
These will be used when a game does multiple transfers from RAM to the screen, rather than one big one. Wasn't clearing some state, though.
This cleans up the postshader update code.
38ebf32
to
203da7d
Compare
Rebased - yeah, I must've broke softgpu in centralizing DestroyAllFBOs(), oops. Fixed now. Not sure what's happening in Darkstalkers - it should be right, unless something is wrong with the UVs? -[Unknown] |
DarkStalkersPresentHack is causing the failure, it works without it just fine. Althrough when I disabled it and tried to take a screenshot I noticed one other bad thing about software renderer - the screenshots will always be x1 res, so if I want to show results of upscaling shader I'd have to take screenshot externally without using the built-in function. |
I know, but outside the smaller UVs, that code path is very similar... Tried again to force the path, it draws fine in for example Ridge Racer... not sure how I could've made the hack not trigger like it did before? -[Unknown] |
This allows the callback to say "I didn't copy, please copy for me." This helps when additional conversions will be applied during the copy, or a copy can be skipped. Also, fixed a couple cases of the wrong mip level being used.
I'm not sure either, but I'll look into it tonight. It looks like it should work. |
Weird thing is it bugs out even with my empty effect that I used in the past just for adding the 60fps flag. So just setting texture2D(sampler0, v_texcoord0.xy) into output color is enough to break things and it breaks differently on different screens as well. |
Alright, I found the problem. In DarkStalkers we directly blit from a specific address where the game draws the buffer to be displayed. The function ConvertTextureDescFrom16 did not take that into account but directly used displayFramebuf_, so added a parameter to override. Will push that after merging - which I will do now. |
The fix commit, for reference: 864d138 |
This splits out the FramebufferManagerCommon class, creating a PresentationCommon to handle things like post-processing shaders, Google Cardboard, and anything else with the final display.
With that split out, it's now used everywhere and more consistent. Benefits:
Also some cleanup and code centralization.
Some remaining concerns:
-[Unknown]