Improve float to int truncation precision #835

dmitrykos · 2023-09-07T19:00:46Z

Improve float to int truncation precision by forcing floating-point rounding mode to rounding ~~towards zero or by using CPU SIMD which truncates by rounding towards zero~~ to nearest.

This PR addresses issues discussed in #390 and PR #403 which removed lrint in favor of C-style cast of floating point value to integer.

Currently, conversion via a C-style cast works as expected, i.e. it truncates a floating-point value to an integer with rounding ~~towards zero~~ to nearest, only if the CPU is set to the rounding ~~towards zero~~ to nearest mode. But if user code or compilation flags set some other rounding mode, the conversion via C-style cast ~~becomes erroneous~~ gives an unexpected result.

To work around this, this PR proposes setting the ~~needed~~ required rounding mode via the fesetround API and restoring the previous user-set rounding mode when the conversion operation completes.
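
The save/set/restore pattern described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the thread only shows a call site of the form `int prevMode = SetRoundingMode();`, so the function bodies and the name `RestoreRoundingMode` are assumptions made for this example.

```cpp
#include <cfenv>

// Hypothetical body for the PR's helper: remembers the caller's rounding
// mode, switches to round-to-nearest, and returns the remembered mode.
int SetRoundingModeToNearest() {
    const int previousMode = std::fegetround();
    std::fesetround(FE_TONEAREST);
    return previousMode;
}

// Hypothetical counterpart (name assumed): puts the caller's mode back.
void RestoreRoundingMode(int previousMode) {
    // fegetround() returns a negative value when the mode cannot be
    // determined; in that case there is nothing sensible to restore.
    if (previousMode >= 0) {
        std::fesetround(previousMode);
    }
}

// Reports which rounding mode is active between the two calls.
int ModeInsideGuardedRegion() {
    const int previousMode = SetRoundingModeToNearest();
    const int activeMode = std::fegetround();
    RestoreRoundingMode(previousMode);
    return activeMode;
}
```

A converter would call `SetRoundingModeToNearest()` before its sample loop and `RestoreRoundingMode()` after it, so user code that selected, say, `FE_DOWNWARD` gets its mode back once the conversion returns.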

Besides the fesetround API, it is also possible to use specialized CPU SIMD instructions which truncate with rounding towards zero, for example on x86 SSE and SSE2, or ARMv8. Using CPU SIMD makes it possible to avoid calling the fesetround API completely, which improves run-time performance.

This PR implements ~~these 2 approaches~~:

Use fesetround to set the required rounding mode for the truncation of float and double to int when no CPU SIMD is available
~~Use CPU SIMD for truncation of float and double to int and in this case code related fesetround is optimized away by the compiler in Release build, so no overhead happens during run-time~~

philburk

Thanks for doing this. Nice work.

src/common/pa_converters.c

RossBencina · 2023-09-08T22:38:01Z

Hi Dmitry,

Phil and I have reviewed your change together. Improving the converter code is definitely something we want to do. We have the following critical questions which need to be covered first:

- I know it changes the current behavior, but we think that round-to-nearest is the best choice numerically, mostly because it doesn't introduce zero-crossing distortion. Is it easy and/or possible to change your code to round-to-nearest?
- What tests have you run? Have you tested on ARM? if so, which compilers and platforms?

Then there are lesser issues that will need to be discussed/addressed prior to merge:

- C89 and MSVC backward compatibility considerations:
  - inline keyword
  - fesetround
- Compiler compatibility of SSE intrinsics?
- Compiler compatibility of ARM inline asm syntax (this isn't going to work in MSVC is it? does it work in Clang?)
- Function naming:
  - Functions should follow our naming convention: start with a capital letter. Maybe use underscores consistent with the existing converter functions in that file, e.g. Round_Float_To_Int32, otherwise RoundFloatToInt32
  - Improve naming to be clearer, e.g. 'Trunc' and 'Priv'

RossBencina · 2023-09-08T22:58:54Z

note so I don't forget: any change here may affect the relevance of pa_x86_plain_converters.c/.h

dmitrykos · 2023-09-09T11:18:12Z

Hi Ross!

Thank you for your comments.

> I know it changes the current behavior, but we think that round-to-nearest is the best choice numerically, mostly because it doesn't introduce zero-crossing distortion. Is it easy and/or possible to change your code to round-to-nearest?

Yes, it is indeed possible by passing FE_TONEAREST to fesetround, but in that case the SIMD optimization will be lost because the SSE and ARMv8 SIMD conversions used here always truncate towards zero, so the SIMD part of the implementation would need to be removed. Also, I did not change the scaling of float to integer, but for FE_TOWARDZERO the scaling has to be updated to 0x80000000 for int32, 0x800000 for int24, 0x8000 for int16, and 0x80 for int8 in order to reach the full scale of the integer representation for -1 and 1.
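
To make the zero-crossing point in the quoted question concrete, here is a small sketch comparing the two quantizers on already-scaled values. The function names are made up for this illustration, and it assumes the default round-to-nearest floating-point mode and inputs that fit in int32.

```cpp
#include <cmath>
#include <cstdint>

// Truncation toward zero: what a plain C-style cast does.
std::int32_t QuantizeByTruncation(double scaled) {
    return static_cast<std::int32_t>(scaled);
}

// Round to nearest: std::nearbyint follows the current rounding mode,
// which is assumed here to be the default FE_TONEAREST.
std::int32_t QuantizeToNearest(double scaled) {
    return static_cast<std::int32_t>(std::nearbyint(scaled));
}
```

With truncation every input in the open interval (-1, 1) becomes 0, so the bin around zero is two steps wide while all other bins are one step wide. With round-to-nearest only inputs in [-0.5, 0.5] become 0, so the zero bin has the same width as its neighbours.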

> What tests have you run? Have you tested on ARM? if so, which compilers and platforms?

Windows platform but PA GitHub actions also compile for Ubuntu and OSX. fesetround has consistent behavior across platforms, so one would be sufficient. In case of FE_TONEAREST we would just guarantee C-style cast happens with default FE_TONEAREST rounding mode.

I did not implement PA tests for assembler inserts for SSE and ARMv8 but that code is taken from my project with >10 years of operation including ARMv8 on Linux and iOS. It makes sense to make tests if we agree on FE_TOWARDZERO with SIMD optimization.

> C89 and MSVC backward compatibility considerations

I am able to compile for Windows XP with MinGW but do not have older MSVC than VS2019. MSDN has this API available for vs140 tools which if I am not mistaken can be used to target Windows XP builds. If anybody reports incompatibility I could add workaround in the future.

> Compiler compatibility of SSE intrinsics?

Must be fairly portable as that intrinsic is available on all compilers.

> Compiler compatibility of ARM inline asm syntax (this isn't going to work in MSVC is it? does it work in Clang?)

I updated the implementation to limit it to GCC and Clang, although the platform flags it relied on are set only by GCC or Clang, so it was already fine.

> Function naming

Yes, adjusted.

To preserve backwards compatibility with older PA versions I probably need to change implementation to FE_TONEAREST and remove SIMD implementation. Or, update scaling to integer as I commented earlier and rely on current implementation. I will try to find time and implement zero crossing distortion comparison of current PA's implementation + FE_TONEAREST with a newly proposed implementation + FE_TOWARDZERO.

dmitrykos · 2023-10-30T19:37:37Z

After double-checking the PA implementation, specifically pa_x86_plain_converters.c, I realized that the rounding mode the library expects is actually rounding to nearest, not rounding towards zero as I initially proposed.

Therefore, I updated the description and implementation to set the rounding to nearest mode and removed all SIMD-related code which was rounding towards zero. That simplified the PR and made it similar to pa_x86_plain_converters.c, which forces the rounding to nearest mode via fpuControlWord_:

portaudio/src/os/win/pa_x86_plain_converters.c

Line 129 in 7e62dfc

    
           static const short fpuControlWord_ = 0x033F; /*round to nearest, 64 bit precision, all exceptions masked*/

@RossBencina, @philburk would you please check the implementation again, as it looks quite straightforward now.

src/common/pa_converters.c

philburk · 2024-10-28T22:56:01Z

I tried this out with C++ on a Mac:

// Compare casts
std::cout << "fegetround = " << fegetround() << std::endl;
std::cout << "((int32_t) 0.99) = " << ((int32_t) 0.99) << std::endl;
std::cout << "(nearbyint(0.99) = " << nearbyint(0.99) << std::endl;

fesetround(FE_TONEAREST);
std::cout << "FE_TONEAREST = " << FE_TONEAREST << std::endl;
std::cout << "((int32_t) 0.99) = " << ((int32_t) 0.99) << std::endl;
std::cout << "(nearbyint(0.99) = " << nearbyint(0.99) << std::endl;

fesetround(FE_DOWNWARD);
std::cout << "FE_DOWNWARD = " << FE_DOWNWARD << std::endl;
std::cout << "((int32_t) 0.99) = " << ((int32_t) 0.99) << std::endl;
std::cout << "(nearbyint(0.99) = " << nearbyint(0.99) << std::endl;

and got:

fegetround = 0
((int32_t) 0.99) = 0
(nearbyint(0.99) = 1
FE_TONEAREST = 0
((int32_t) 0.99) = 0
(nearbyint(0.99) = 1
FE_DOWNWARD = 8388608
((int32_t) 0.99) = 0
(nearbyint(0.99) = 0

RossBencina · 2024-10-28T22:57:17Z

TruncateFloatToInt32 and TruncateDoubleToInt32 seem to me to be unnecessary, we already have a standard syntax in C for the conversion from float to int. If the purpose is to abstract the conversion process with the intention that we may try different conversion methods in future, then they should be named ConvertFloatToInt32 and ConvertDoubleToInt32

Additionally, I thought the goal was to round to nearest, not to truncate. Also, inline is not a C89 keyword.

philburk · 2024-10-28T22:57:35Z

So it seems that fesetround() does not affect casting.
So this code may not have the desired effect.

Maybe we should be calling nearbyint() instead.
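
The observation above can be reproduced with a sketch along these lines. The helper names are hypothetical; `volatile` is used only so the compiler performs each conversion at run time, after `fesetround` has been called, instead of folding it at compile time.

```cpp
#include <cfenv>
#include <cmath>
#include <cstdint>

// Hypothetical helper: does a float-to-int cast while `mode`
// (FE_TONEAREST, FE_DOWNWARD, FE_UPWARD, ...) is the active rounding mode.
std::int32_t CastUnderMode(int mode, float value) {
    const int previous = std::fegetround();
    std::fesetround(mode);
    volatile float input = value;  // forces a run-time conversion
    volatile std::int32_t result = static_cast<std::int32_t>(input);
    std::fesetround(previous);
    return result;
}

// Hypothetical helper: calls nearbyint() under the same conditions.
double NearbyintUnderMode(int mode, double value) {
    const int previous = std::fegetround();
    std::fesetround(mode);
    volatile double input = value;  // forces a run-time call
    const double x = input;
    volatile double result = std::nearbyint(x);
    std::fesetround(previous);
    return result;
}
```

In both C and C++ a floating-point to integer conversion discards the fractional part, so `CastUnderMode` is expected to return the same value for every mode, whereas `NearbyintUnderMode` follows whatever mode was requested.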

RossBencina · 2024-10-28T23:15:01Z

We're bumping this for a future release. It's important to fix but it's not ready.

RossBencina · 2024-10-28T23:38:44Z

The MS docs are vague on explicit casts, but I presume that they are saying that fesetround doesn't affect float-to-int casts. In any case we have Phil's test results. https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fegetround-fesetround2?view=msvc-170

dmitrykos · 2024-10-29T05:44:52Z

According to Phil's results setting FE_DOWNWARD changes the behavior of float to int conversions. The purpose of this PR was to make behavior of PA's converting functions standardized, i.e. they rely on FE_TONEAREST currently. User code can use other rounding mode resulting in unexpected behavior of PA converting functions if FE_TONEAREST is not set.

> TruncateFloatToInt32 and TruncateDoubleToInt32 seem to me to be unnecessary, we already have a standard syntax in C for the conversion from float to int. If the purpose is to abstract the conversion process

Yes, those functions had more work but the code changed during our discussion. I propose to leave just C-style casts as it was before, i.e. I will undo TruncateFloatToInt32 and TruncateDoubleToInt32 as they add unnecessary complexity for the reader of the code.

> So it seems that fesetround() does not affect casting. So this code may not have the desired effect.
> Maybe we should be calling nearbyint() instead.

nearbyint() will likely be a function call (at least in the MSVC headers I found it declared as a function), which would add overhead unless the compiler replaces it with inline code, and whether it does may be platform-dependent. In my view, leaving just SetRoundingMode() with a C-style cast would be the cleanest and fastest solution.

Remove inline keyword.

dmitrykos · 2024-10-29T06:12:55Z

I made changes, now implementation is limited to only setting/resetting the rounding mode around conversion code.

dmitrykos · 2024-10-30T11:03:39Z

> The MS docs are vague on explicit casts, but I presume that they are saying that fesetround doesn't affect float-to-int casts.

@RossBencina I executed Phil's code on MSVC and got the same result:

fegetround = 0
((int32_t) 0.99) = 0
(nearbyint(0.99) = 1
FE_TONEAREST = 0
((int32_t) 0.99) = 0
(nearbyint(0.99) = 1
FE_DOWNWARD = 256
((int32_t) 0.99) = 0
(nearbyint(0.99) = 0

Therefore the proposed PR is indeed useful to enforce FE_TONEAREST for the PA converters. Please check my recent changes: I cleaned up the code, leaving only the rounding mode toggling. I think it is completely safe to merge.

dmitrykos · 2024-10-30T11:21:21Z

I looked further into this statement in Microsoft's documentation (https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/fegetround-fesetround2?view=msvc-170) and ~~can't confirm this~~:

> Floating-point to integer implicit casts and conversions, which always round towards zero.

~~It must be a mistake of the documentation because implicit casts resulted in expected behavior of this code doing implicit casts~~:

volatile float fv = 0.99;

// Compare casts
std::cout << "fegetround = " << fegetround() << std::endl;
volatile int32_t v1 = fv;
std::cout << "((int32_t) 0.99) = " << v1 << std::endl;
std::cout << "(nearbyint(0.99) = " << nearbyint(0.99) << std::endl;

fesetround(FE_TONEAREST);
std::cout << "FE_TONEAREST = " << FE_TONEAREST << std::endl;
volatile int32_t v2 = fv;
std::cout << "((int32_t) 0.99) = " << v2 << std::endl;
std::cout << "(nearbyint(0.99) = " << nearbyint(0.99) << std::endl;

fesetround(FE_DOWNWARD);
std::cout << "FE_DOWNWARD = " << FE_DOWNWARD << std::endl;
volatile int32_t v3 = fv;
std::cout << "((int32_t) 0.99) = " << v3 << std::endl;
std::cout << "(nearbyint(0.99) = " << nearbyint(0.99) << std::endl;

result:

fegetround = 0
((int32_t) 0.99) = 0
(nearbyint(0.99) = 1
FE_TONEAREST = 0
((int32_t) 0.99) = 0
(nearbyint(0.99) = 1
FE_DOWNWARD = 256
((int32_t) 0.99) = 0
(nearbyint(0.99) = 0

Checking the assembly (x86-64) reveals that in all 3 cases the compiler uses the same instruction, cvttss2si, which means that fesetround() works as documented for implicit and explicit casts:

01C33DE3  movss       xmm0,dword ptr [fv]  
01C33DE8  cvttss2si   eax,xmm0  
01C33DEC  mov         dword ptr [v1],eax  

01C33ED6  movss       xmm0,dword ptr [fv]  
01C33EDB  cvttss2si   eax,xmm0  
01C33EDF  mov         dword ptr [v2],eax  

01C33FCF  movss       xmm0,dword ptr [fv]  
01C33FD4  cvttss2si   eax,xmm0  
01C33FD8  mov         dword ptr [v3],eax

philburk

I am still concerned about the new function names. See unresolved comments above.

philburk · 2024-11-08T13:36:37Z

src/common/pa_converters.c

@@ -358,18 +383,21 @@ static void Float32_To_Int32_Dither(
 {
    float *src = (float*)sourceBuffer;
    PaInt32 *dest =  (PaInt32*)destinationBuffer;
+    int prevMode = SetRoundingMode();

    while( count-- )
    {
        /* REVIEW */
        double dither  = PaUtil_GenerateFloatTriangularDither( ditherGenerator );
        /* use smaller scaler to prevent overflow when we add the dither */
        double dithered = ((double)*src * (2147483646.0)) + dither;


Now that we are rounding to nearest, will this smaller scaler be small enough to prevent numeric overflow during the cast?

dmitrykos · 2024-11-08

This PR does not change the rounding mode, because rounding to nearest is the default rounding mode when a process starts and therefore PA de facto already assumes it. The proposed change enforces this rounding mode explicitly to avoid the situation where the user sets, for example, FE_DOWNWARD, which actually changes the rounding mode, so the result of the PA converters would be different if we did not set FE_TONEAREST explicitly.
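
The overflow question can be checked with a little arithmetic. The sketch below assumes the triangular dither lies in the range -2.0 to +1.99999; that range is an assumption made for this example and is not confirmed anywhere in this thread. The function names and constants are likewise hypothetical.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Worst cases for src = +/-1.0 with the 2147483646.0 scaler from the diff,
// under the ASSUMED dither range of -2.0 to +1.99999.
const double kWorstCasePositive = 2147483646.0 + 1.99999;
const double kWorstCaseNegative = -2147483646.0 - 2.0;

// True if the value still fits in int32 after rounding to nearest
// (assumes the default FE_TONEAREST mode).
bool RoundedValueFitsInInt32(double scaled) {
    const double rounded = std::nearbyint(scaled);
    return rounded >= -2147483648.0 && rounded <= 2147483647.0;
}

// Rounds to nearest and saturates instead of overflowing.
std::int32_t RoundAndClampToInt32(double scaled) {
    if (std::isnan(scaled)) {
        return 0;  // arbitrary choice for this sketch
    }
    const double rounded = std::nearbyint(scaled);
    if (rounded >= 2147483647.0) {
        return std::numeric_limits<std::int32_t>::max();
    }
    if (rounded <= -2147483648.0) {
        return std::numeric_limits<std::int32_t>::min();
    }
    return static_cast<std::int32_t>(rounded);
}
```

Under that assumption the positive worst case is about 2147483647.99999: a truncating cast yields 2147483647 and stays in range, but rounding to nearest would yield 2147483648, which does not fit. So the smaller scaler is enough for truncation, and a round-to-nearest converter would need either a smaller scaler or a clamp like the one sketched here.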

RossBencina · 2024-11-15T23:39:01Z

> Floating-point to integer implicit casts and conversions, which always round towards zero.
>
> It must be a mistake of the documentation because implicit casts resulted in expected behavior of this code doing implicit casts:
>
> result:
> fegetround = 0
> ((int32_t) 0.99) = 0
> (nearbyint(0.99) = 1
> FE_TONEAREST = 0
> ((int32_t) 0.99) = 0
> (nearbyint(0.99) = 1
> FE_DOWNWARD = 256
> ((int32_t) 0.99) = 0
> (nearbyint(0.99) = 0

- your test code is doing explicit casts
- the documentation does not mention explicit casts.
- your test code shows that explicit casts are always truncating to zero. this is seen by the test output, and also by the use of cvttss2si (Convert with Truncation: https://www.felixcloutier.com/x86/cvttss2si)

None of this surprises me.

My conclusion is that setting the rounding mode to FE_TONEAREST has no impact on explicit casts and so this PR as it stands is pointless. Am I missing something?

dmitrykos · 2024-11-16T06:36:16Z

Ross, you are right! My attention was somehow misled by result of nearbyint from Phil's example which is affected by rounding mode.

While in my test I was doing implicit casts:

volatile float fv = 0.99;
// Implicit cast of fv to v1
volatile int32_t v1 = fv;

those tests gave the same result (truncation towards zero) with implicit and explicit casts. Therefore we can conclude that if PA converters do not use math functions which are affected by rounding mode and rely on explicit or implicit floating-point to integer casts then they are unaffected by rounding mode and do not require any explicit rounding mode to be set.

I am closing this PR; feel free to reopen it if there is a reason to.

dmitrykos added the src-common Common sources in /src/common label Sep 7, 2023

dmitrykos requested review from RossBencina and philburk September 7, 2023 19:00

dmitrykos self-assigned this Sep 7, 2023

dmitrykos force-pushed the fix_converters_float_to_int branch 2 times, most recently from c58eb96 to 8bde04d Compare September 7, 2023 19:32

philburk requested changes Sep 8, 2023

dmitrykos force-pushed the fix_converters_float_to_int branch from 8bde04d to fed6f7e Compare September 9, 2023 09:14

Improve float to int truncation precision in generic implementation a…

bc52043

…rea by forcing floating-point rounding mode to rounding to nearest.

dmitrykos force-pushed the fix_converters_float_to_int branch from fed6f7e to bc52043 Compare October 30, 2023 19:19

dmitrykos requested a review from philburk October 30, 2023 19:37

philburk added the P1 Priority: Highest label May 24, 2024

philburk added this to the V19.8 milestone May 24, 2024

RossBencina assigned RossBencina and philburk and unassigned dmitrykos Jul 5, 2024

philburk requested changes Oct 28, 2024

RossBencina modified the milestones: V19.8, V19.9 Oct 28, 2024

RossBencina added P2 Priority: High and removed P1 Priority: Highest labels Oct 28, 2024

RossBencina mentioned this pull request Oct 28, 2024

Float32 -> integer sample conversions (e.g. Float32_To_Int32()) needs to be documented and tested #977

Open

Simplify code by removing TruncateFloatToInt32, TruncateDoubleToInt32.

9ffe667

Remove inline keyword.

philburk reviewed Nov 8, 2024

Modified SetRoundingMode to SetRoundingModeToNearest.

7483d6b

dmitrykos closed this Nov 16, 2024
