Rework D3D9 float emulation #2294

doitsujin · 2021-09-14T13:42:17Z

Replaces almost every occurrence of a * b with (b == 0 ? 0 : a) * (a == 0 ? 0 : b) if we know that both operands may be zero. I've omitted the TexBem instructions for now since I don't know what those do exactly and if there are any NaNs to filter out there, but everything else should be handled.

The existing d3d9.floatEmulation option still works, and setting it to false effectively disables this.

Fixes stuff like #2107, but will obviously need a fair bit of testing.
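For readers skimming the thread, here is a minimal scalar C++ sketch of the substitution described above. The helper name `mulZeroWins` is made up for illustration; the actual change emits SPIR-V from the shader compiler rather than calling a function like this.

```cpp
#include <cmath>
#include <limits>

// Hypothetical helper mirroring (b == 0 ? 0 : a) * (a == 0 ? 0 : b).
// If either operand is zero, both factors collapse to zero, so a NaN or
// infinity in the other operand never reaches the multiplication.
inline float mulZeroWins(float a, float b) {
  const float lhs = (b == 0.0f) ? 0.0f : a;
  const float rhs = (a == 0.0f) ? 0.0f : b;
  return lhs * rhs;
}
```

With plain IEEE arithmetic, 0 * infinity and 0 * NaN both produce NaN; with the guarded form they produce 0, which is the result D3D9 shaders are said to expect here.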

DadSchoorse · 2021-09-14T14:48:49Z

This really needs some work in the drivers for low-end hardware; I'm losing up to 20% performance on my Renoir APU.

pendingchaos · 2021-09-14T14:50:21Z

Not sure if it matters, but (b == 0 ? 0 : a) * (a == 0 ? 0 : b) is -0.0 for a = -0.0, b = -1.0, unlike AMD's legacy multiplication opcode, which gives 0.0 IIRC

doitsujin · 2021-09-14T14:55:43Z

I don't think there's anything in D3D9 that relies on signed zeroes being correct. D3D9 does not follow IEEE rules.

Does the instruction always return positive zero or does it return the correct sign? That's not clear to me right now.

doitsujin · 2021-09-14T15:10:57Z

We can change this to (b == 0 ? b : a) * (a == 0 ? a : b) if that more closely emulates v_mul_legacy behaviour.

misyltoad · 2021-09-14T15:37:24Z

fwiw we do not enable Signed Zero Preserve in D3D9

pendingchaos · 2021-09-14T16:32:41Z

> Does the instruction always return positive zero or does it return the correct sign? That's not clear to me right now.

I tested this a while ago and IIRC it's always positive zero if either source is negative or positive zero (EDIT: checked again, and -0.0 * +0.0 is +0.0)

> We can change this to (b == 0 ? b : a) * (a == 0 ? a : b) if that more closely emulates v_mul_legacy behaviour.

Doesn't work with -0.0 * 0.0, but it probably doesn't matter if you don't enable Signed Zero Preserve. RADV should be able to optimize either if it's not enabled
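To make the signed-zero cases in this exchange concrete, here is a small C++ sketch of both candidate expressions. The function names are invented for illustration, and the results assume ordinary IEEE float semantics (no fast-math flags).

```cpp
#include <cmath>

// Original form: substitutes a constant +0.0 for the non-zero operand.
inline float mulZeroConst(float a, float b) {
  return ((b == 0.0f) ? 0.0f : a) * ((a == 0.0f) ? 0.0f : b);
}

// Proposed variant: substitutes the zero operand itself, keeping its sign.
inline float mulZeroOperand(float a, float b) {
  return ((b == 0.0f) ? b : a) * ((a == 0.0f) ? a : b);
}
```

For a = -0.0, b = -1.0 the original form multiplies -0.0 by +0.0 and yields -0.0, while the variant multiplies -0.0 by -0.0 and yields +0.0. For a = -0.0, b = +0.0 the variant multiplies +0.0 by -0.0 and yields -0.0, which is the case noted above as not matching the +0.0 reported for v_mul_legacy.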

DadSchoorse · 2021-09-14T17:55:15Z

This is probably more a theoretical problem, but doesn't this need SignedZeroInfNanPreserve to ensure correctness? Without it it's allowed to optimize (b == 0 ? 0 : a) * (a == 0 ? 0 : b) to a * b, I think.

pendingchaos · 2021-09-14T18:07:40Z

> This is probably more a theoretical problem, but doesn't this need SignedZeroInfNanPreserve to ensure correctness? Without it it's allowed to optimize (b == 0 ? 0 : a) * (a == 0 ? 0 : b) to a * b, I think.

I don't think that's a valid optimization for the compiler to do. If one of the operands is 0 and the other is nan/inf, the expression has no multiplication by nan/inf

DadSchoorse · 2021-09-14T18:19:26Z

Okay, you're the compiler expert, not me. 😄 My understanding of the default Vulkan floating point rules is that the compiler is allowed to assume that no arguments are NaN or Inf tho.

pendingchaos · 2021-09-27T13:45:59Z

nmin(abs(a), abs(b)) == 0.0 ? 0.0 : a * b might be faster on devices where min can take two abs modifiers (most devices?), though the size of the spirv might be larger. It should also match v_mul_legacy_f32 exactly
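A scalar C++ sketch of this min-based form, for comparison with the two-select version. It assumes `std::fmin` is an adequate stand-in for `nmin`, in that it returns the non-NaN operand when exactly one input is NaN; the function name is made up.

```cpp
#include <cmath>

// Hypothetical scalar model of: nmin(abs(a), abs(b)) == 0.0 ? 0.0 : a * b
// One comparison and one select instead of two of each, and the zero
// result is always +0.0 regardless of the operands' signs.
inline float mulLegacyMin(float a, float b) {
  // If either operand is +/-0.0, the smaller magnitude is 0.0, even when
  // the other operand is NaN or infinity.
  return (std::fmin(std::fabs(a), std::fabs(b)) == 0.0f) ? 0.0f : a * b;
}
```

Unlike the select-based forms, this one returns +0.0 for every combination of signed zeros, which is consistent with the v_mul_legacy behaviour described earlier in the thread.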

pendingchaos · 2021-10-01T17:25:37Z

I think this (work in progress) branch should mostly restore RADV performance: https://gitlab.freedesktop.org/pendingchaos/mesa/-/commits/radv_zerowins_misc

I haven't actually tested this with DXVK though

src/dxso/dxso_compiler.cpp

Should be obsolete now.

misyltoad · 2021-10-15T10:03:04Z

src/dxso/dxso_compiler.cpp

        break;
      case DxsoOpcode::Rsq: 
        result.id = m_module.opFAbs(typeId,
          emitRegisterLoad(src[0], mask).id);

        result.id = m_module.opInverseSqrt(typeId,
          result.id);
-
-        if (m_moduleInfo.options.d3d9FloatEmulation) {


Can we keep the old path around as an option by default for now until we get support for this in a stable release of Mesa and on NV?

misyltoad · 2021-10-15

also, I'd like a way to turn this off, we don't rely on this behaviour in the DXVK Native titles at all.

doitsujin · 2021-10-16

the way to turn this off is disabling d3d9FloatEmulation, just like before?

I really don't want to have to ship two different code paths and enable the broken one by default. It'll just lead to a maintenance shitshow where we have to keep track of every single game that runs into the problem, and if we enable this conditionally based on the driver version or whatever it'll just lead to weird bug reports.

We can wait until the Mesa optimization lands I guess, but once that happens I'd rather just have things work out of the box.

misyltoad · 2021-10-16

Then what do we do about Intel?

doitsujin · 2021-10-16

ignore and move on?

If their hardware has no way to support mul_legacy other than some global flag that we can't enable then there's not much we can do on our end.

misyltoad · 2021-10-16

They support flushing NaN/INF to FLT_MAX like we emulate rn.

doitsujin · 2021-10-17

sigh, fine, maintenance shitshow it is. I still hate having to default to an option that is literally broken though.
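For context on the FLT_MAX flushing mentioned in this review thread, here is a rough scalar C++ sketch of the idea, assuming it amounts to clamping the result of instructions such as rsq so that it can never be infinite. The function name is hypothetical and this is not copied from the DXVK sources.

```cpp
#include <cfloat>
#include <cmath>

// Hypothetical scalar model of the older approach: compute rsq(|x|) and
// clamp it to FLT_MAX, so rsq(0) yields a large finite value, not infinity.
inline float rsqClamped(float x) {
  const float r = 1.0f / std::sqrt(std::fabs(x));
  // std::fmin returns the non-NaN operand, so a NaN result would also be
  // replaced by FLT_MAX.
  return std::fmin(r, FLT_MAX);
}
```

With this clamp, a later plain multiplication such as 0 * rsq(0) evaluates to 0 * FLT_MAX = 0 instead of 0 * infinity = NaN. It only covers infinities produced by the clamped instructions themselves, which may be why it is described as broken compared to guarding the multiplication.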

DadSchoorse · 2021-10-15T10:36:58Z

> I think this (work in progress) branch should mostly restore RADV performance: https://gitlab.freedesktop.org/pendingchaos/mesa/-/commits/radv_zerowins_misc
>
> I haven't actually tested this with DXVK though

These patches fully restore performance in my handful of test cases.

K0bin · 2021-12-05T17:52:14Z

Merged with #2359

doitsujin requested review from K0bin and misyltoad September 14, 2021 13:42

K0bin approved these changes Sep 14, 2021

K0bin added the d3d9 label Sep 17, 2021

pendingchaos reviewed Oct 14, 2021

doitsujin added 8 commits October 14, 2021 17:18

[dxso] Correctly handle multiplication by zero

9fb2091

[dxso] Handle multiplication by zero in dst instruction

3b98391

[dxso] Handle multiplication by zero in cross product

9fe83ff

[dxso] Handle multiplication by zero in matrix ALU instructions

8d28dec

[dxso] Handle multiplication by zero in TexM*Tex instructions

882a291

[dxso] Handle multiplication by zero when emitting clip distances

4ab6a03

[dxso] Remove old floatEmulation hacks

dc72b22

[d3d9] Do not replace NaN when uploading shader constants

5eafe80

doitsujin force-pushed the d3d9-float-memes branch from 88dbd45 to 5eafe80 Compare October 14, 2021 15:18

misyltoad requested changes Oct 15, 2021

doitsujin mentioned this pull request Oct 18, 2021

[d3d9] Flickering rectangles in Starcraft 2 when post-processing is set to ultra. #2338

Closed

K0bin mentioned this pull request Nov 12, 2021

Rework D3D9 float emulation - as an option #2359

Merged

K0bin closed this Dec 5, 2021

doitsujin deleted the d3d9-float-memes branch June 30, 2022 12:38

Blisto91 mentioned this pull request Jan 9, 2024

[dxvk] Optimize for the d3d9 Strict float emulation path GPUOpen-Drivers/AMDVLK#346

Closed
