This repository has been archived by the owner on Jan 23, 2023. It is now read-only.

Improve {u}int/long.ToString/TryFormat throughput by pre-computing the length #17432

Merged
stephentoub merged 2 commits into dotnet:master on Apr 5, 2018

Conversation

stephentoub
Member

The first commit just moves the Count{Hex}Digits methods from https://github.com/dotnet/corefx/blob/master/src/System.Memory/src/System/Buffers/Text/Utf8Formatter/FormattingHelpers.cs into a partial FormattingHelpers.CountDigits.cs file in the shared partition. Once those changes replicate to corefx, I'll dedup the code there.

The second commit then uses Count{Hex}Digits in the ToString and TryFormat methods of int, uint, long, and ulong, in particular for the default D format (and some G configurations) as well as the X format. Currently we create a temporary buffer on the stack, format into it, and then copy from that stack buffer into either the target span (for TryFormat) or into a new string (for ToString). Following the approach of Utf8Formatter (and sharing the same code), which first counts the number of digits in the output to determine an exact length, this commit changes the implementation to skip the temporary buffer and format directly into the destination span or string.
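To make the count-then-format idea concrete, here is a minimal sketch for uint and the decimal format. It is not the code from this PR: the class name `DirectFormatSketch` and all method bodies are illustrative assumptions, the digit counters use simple loops rather than whatever FormattingHelpers actually does, and `string.Create` stands in for however the real implementation allocates and fills the result string.

```csharp
using System;

// Hypothetical sketch; names and bodies are assumptions, not the coreclr code.
static class DirectFormatSketch
{
    // Number of decimal digits needed for value (1 for zero).
    public static int CountDigits(uint value)
    {
        int digits = 1;
        while (value >= 10)
        {
            value /= 10;
            digits++;
        }
        return digits;
    }

    // Number of hexadecimal digits needed for value (1 for zero).
    public static int CountHexDigits(uint value)
    {
        int digits = 1;
        while ((value >>= 4) != 0)
        {
            digits++;
        }
        return digits;
    }

    // TryFormat-style: the exact length is known up front, so digits are
    // written straight into the destination with no temporary buffer.
    public static bool TryFormatDecimal(uint value, Span<char> destination, out int charsWritten)
    {
        int length = CountDigits(value);
        if (length > destination.Length)
        {
            charsWritten = 0;
            return false;
        }

        // Digits come out least significant first, so fill from the end.
        for (int i = length - 1; i >= 0; i--)
        {
            destination[i] = (char)('0' + value % 10);
            value /= 10;
        }

        charsWritten = length;
        return true;
    }

    // ToString-style: allocate a string of exactly the right length and
    // fill it in place (string.Create is this sketch's choice of API).
    public static string ToDecimalString(uint value)
    {
        int length = CountDigits(value);
        return string.Create(length, value, (span, v) =>
        {
            for (int i = span.Length - 1; i >= 0; i--)
            {
                span[i] = (char)('0' + v % 10);
                v /= 10;
            }
        });
    }
}
```

Because the length check happens before any digit is written, a destination that is too small is rejected without touching it, and the success path does a single pass over the output.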

Contributes to https://github.com/dotnet/coreclr/issues/15364
cc: @jkotas, @ahsonkhan, @danmosemsft

System.Runtime.Performance.Tests.dll Before After Diff
System.Tests.Perf_Int32.ToString(value: 0) 12.75 10.85 1.18x
System.Tests.Perf_Int32.ToString(value: 1) 12.89 10.87 1.19x
System.Tests.Perf_Int32.ToString(value: -1) 21.77 18.36 1.19x
System.Tests.Perf_Int32.ToString(value: 1283) 13.80 12.62 1.09x
System.Tests.Perf_Int32.ToString(value: -1283) 24.29 19.91 1.22x
System.Tests.Perf_Int32.ToString(value: 12837467) 16.09 15.29 1.05x
System.Tests.Perf_Int32.ToString(value: -12837467) 28.17 23.59 1.19x
System.Tests.Perf_Int32.ToString(value: 2147483647) 17.83 17.17 1.04x
System.Tests.Perf_Int32.ToString(value: -2147483648) 28.93 24.75 1.17x
System.Tests.Perf_Int32.TryFormat(value: 0) 13.46 7.35 1.83x
System.Tests.Perf_Int32.TryFormat(value: 1) 13.57 7.38 1.84x
System.Tests.Perf_Int32.TryFormat(value: -1) 23.64 14.33 1.65x
System.Tests.Perf_Int32.TryFormat(value: 1283) 14.29 9.11 1.57x
System.Tests.Perf_Int32.TryFormat(value: -1283) 25.64 15.95 1.61x
System.Tests.Perf_Int32.TryFormat(value: 12837467) 16.26 11.74 1.39x
System.Tests.Perf_Int32.TryFormat(value: -12837467) 28.19 18.55 1.52x
System.Tests.Perf_Int32.TryFormat(value: 2147483647) 17.55 13.46 1.30x
System.Tests.Perf_Int32.TryFormat(value: -2147483648) 29.63 20.01 1.48x
System.Tests.Perf_Int64.ToString(value: 0) 16.11 11.52 1.40x
System.Tests.Perf_Int64.ToString(value: 2) 15.60 11.72 1.33x
System.Tests.Perf_Int64.ToString(value: -2) 23.14 18.62 1.24x
System.Tests.Perf_Int64.ToString(value: 21) 15.58 12.08 1.29x
System.Tests.Perf_Int64.ToString(value: -21) 23.05 18.74 1.23x
System.Tests.Perf_Int64.ToString(value: 214) 16.41 12.44 1.32x
System.Tests.Perf_Int64.ToString(value: -214) 24.77 19.18 1.29x
System.Tests.Perf_Int64.ToString(value: 2147) 17.20 13.40 1.28x
System.Tests.Perf_Int64.ToString(value: -2147) 24.74 20.36 1.22x
System.Tests.Perf_Int64.ToString(value: 21474) 17.78 13.99 1.27x
System.Tests.Perf_Int64.ToString(value: -21474) 26.44 20.80 1.27x
System.Tests.Perf_Int64.ToString(value: 214748) 19.29 14.63 1.32x
System.Tests.Perf_Int64.ToString(value: -214748) 26.31 21.64 1.22x
System.Tests.Perf_Int64.ToString(value: 2147483) 18.58 15.38 1.21x
System.Tests.Perf_Int64.ToString(value: -2147483) 27.34 22.50 1.22x
System.Tests.Perf_Int64.ToString(value: 21474836) 19.92 15.91 1.25x
System.Tests.Perf_Int64.ToString(value: -21474836) 28.94 23.57 1.23x
System.Tests.Perf_Int64.ToString(value: 214748364) 21.16 16.96 1.25x
System.Tests.Perf_Int64.ToString(value: -214748364) 29.22 24.02 1.22x
System.Tests.Perf_Int64.ToString(value: 2147483647) 20.91 17.52 1.19x
System.Tests.Perf_Int64.ToString(value: -2147483648) 29.79 24.91 1.20x
System.Tests.Perf_Int64.ToString(value: 4294967295000000000) 28.11 25.64 1.10x
System.Tests.Perf_Int64.ToString(value: -4294967295000000000) 38.00 32.82 1.16x
System.Tests.Perf_Int64.ToString(value: 4294967295000000001) 28.29 25.37 1.12x
System.Tests.Perf_Int64.ToString(value: -4294967295000000001) 37.98 32.92 1.15x
System.Tests.Perf_Int64.ToString(value: 92233720368) 23.49 19.77 1.19x
System.Tests.Perf_Int64.ToString(value: -92233720368) 32.46 26.81 1.21x
System.Tests.Perf_Int64.ToString(value: 922337203685) 24.38 20.55 1.19x
System.Tests.Perf_Int64.ToString(value: -922337203685) 33.12 28.09 1.18x
System.Tests.Perf_Int64.ToString(value: 9223372036854) 24.76 21.66 1.14x
System.Tests.Perf_Int64.ToString(value: -9223372036854) 33.59 28.63 1.17x
System.Tests.Perf_Int64.ToString(value: 92233720368547) 25.18 22.44 1.12x
System.Tests.Perf_Int64.ToString(value: -92233720368547) 34.20 28.82 1.19x
System.Tests.Perf_Int64.ToString(value: 922337203685477) 25.86 21.90 1.18x
System.Tests.Perf_Int64.ToString(value: -922337203685477) 35.22 28.95 1.22x
System.Tests.Perf_Int64.ToString(value: 9223372036854775) 26.15 22.77 1.15x
System.Tests.Perf_Int64.ToString(value: -9223372036854775) 35.98 29.56 1.22x
System.Tests.Perf_Int64.ToString(value: 92233720368547758) 26.96 23.60 1.14x
System.Tests.Perf_Int64.ToString(value: -92233720368547758) 37.16 30.31 1.23x
System.Tests.Perf_Int64.ToString(value: 922337203685477580) 27.66 24.16 1.14x
System.Tests.Perf_Int64.ToString(value: -922337203685477580) 37.60 30.96 1.21x
System.Tests.Perf_Int64.ToString(value: 9223372036854775807) 30.28 26.95 1.12x
System.Tests.Perf_Int64.ToString(value: -9223372036854775808) 41.47 33.76 1.23x
System.Tests.Perf_Int64.TryFormat(value: 0) 16.67 8.01 2.08x
System.Tests.Perf_Int64.TryFormat(value: 2) 15.28 8.07 1.89x
System.Tests.Perf_Int64.TryFormat(value: -2) 24.01 14.63 1.64x
System.Tests.Perf_Int64.TryFormat(value: 21) 16.00 9.15 1.75x
System.Tests.Perf_Int64.TryFormat(value: -21) 24.82 14.96 1.66x
System.Tests.Perf_Int64.TryFormat(value: 214) 16.27 9.81 1.66x
System.Tests.Perf_Int64.TryFormat(value: -214) 25.33 15.91 1.59x
System.Tests.Perf_Int64.TryFormat(value: 2147) 16.65 10.52 1.58x
System.Tests.Perf_Int64.TryFormat(value: -2147) 26.83 16.42 1.63x
System.Tests.Perf_Int64.TryFormat(value: 21474) 17.45 11.13 1.57x
System.Tests.Perf_Int64.TryFormat(value: -21474) 26.92 16.59 1.62x
System.Tests.Perf_Int64.TryFormat(value: 214748) 17.71 11.71 1.51x
System.Tests.Perf_Int64.TryFormat(value: -214748) 27.18 17.66 1.54x
System.Tests.Perf_Int64.TryFormat(value: 2147483) 19.02 12.18 1.56x
System.Tests.Perf_Int64.TryFormat(value: -2147483) 28.04 18.34 1.53x
System.Tests.Perf_Int64.TryFormat(value: 21474836) 18.76 13.16 1.43x
System.Tests.Perf_Int64.TryFormat(value: -21474836) 29.07 18.67 1.56x
System.Tests.Perf_Int64.TryFormat(value: 214748364) 19.35 13.65 1.42x
System.Tests.Perf_Int64.TryFormat(value: -214748364) 29.63 19.78 1.50x
System.Tests.Perf_Int64.TryFormat(value: 2147483647) 20.41 14.13 1.44x
System.Tests.Perf_Int64.TryFormat(value: -2147483648) 30.42 20.92 1.45x
System.Tests.Perf_Int64.TryFormat(value: 4294967295000000000) 27.36 21.16 1.29x
System.Tests.Perf_Int64.TryFormat(value: -4294967295000000000) 36.75 27.20 1.35x
System.Tests.Perf_Int64.TryFormat(value: 4294967295000000001) 27.10 21.35 1.27x
System.Tests.Perf_Int64.TryFormat(value: -4294967295000000001) 36.55 27.21 1.34x
System.Tests.Perf_Int64.TryFormat(value: 92233720368) 22.23 17.38 1.28x
System.Tests.Perf_Int64.TryFormat(value: -92233720368) 31.76 22.69 1.40x
System.Tests.Perf_Int64.TryFormat(value: 922337203685) 24.25 16.95 1.43x
System.Tests.Perf_Int64.TryFormat(value: -922337203685) 32.23 23.23 1.39x
System.Tests.Perf_Int64.TryFormat(value: 9223372036854) 23.20 17.66 1.31x
System.Tests.Perf_Int64.TryFormat(value: -9223372036854) 32.78 23.79 1.38x
System.Tests.Perf_Int64.TryFormat(value: 92233720368547) 23.90 18.21 1.31x
System.Tests.Perf_Int64.TryFormat(value: -92233720368547) 32.98 24.20 1.36x
System.Tests.Perf_Int64.TryFormat(value: 922337203685477) 24.49 18.40 1.33x
System.Tests.Perf_Int64.TryFormat(value: -922337203685477) 34.17 24.35 1.40x
System.Tests.Perf_Int64.TryFormat(value: 9223372036854775) 25.04 20.20 1.24x
System.Tests.Perf_Int64.TryFormat(value: -9223372036854775) 35.39 25.29 1.40x
System.Tests.Perf_Int64.TryFormat(value: 92233720368547758) 26.46 19.82 1.33x
System.Tests.Perf_Int64.TryFormat(value: -92233720368547758) 35.27 25.63 1.38x
System.Tests.Perf_Int64.TryFormat(value: 922337203685477580) 26.59 21.72 1.22x
System.Tests.Perf_Int64.TryFormat(value: -922337203685477580) 35.66 26.36 1.35x
System.Tests.Perf_Int64.TryFormat(value: 9223372036854775807) 28.94 22.80 1.27x
System.Tests.Perf_Int64.TryFormat(value: -9223372036854775808) 37.46 29.98 1.25x
System.Tests.Perf_UInt32.ToString(value: 0) 13.21 10.39 1.27x
System.Tests.Perf_UInt32.ToString(value: 1) 12.88 10.61 1.21x
System.Tests.Perf_UInt32.ToString(value: 1283) 13.74 12.67 1.09x
System.Tests.Perf_UInt32.ToString(value: 12837467) 16.11 15.03 1.07x
System.Tests.Perf_UInt32.ToString(value: 4294967295) 17.54 16.14 1.09x
System.Tests.Perf_UInt32.TryFormat(value: 0) 13.38 7.22 1.85x
System.Tests.Perf_UInt32.TryFormat(value: 1) 13.47 7.22 1.87x
System.Tests.Perf_UInt32.TryFormat(value: 1283) 14.88 9.16 1.62x
System.Tests.Perf_UInt32.TryFormat(value: 12837467) 16.36 11.56 1.42x
System.Tests.Perf_UInt32.TryFormat(value: 4294967295) 17.33 13.18 1.31x
System.Tests.Perf_UInt64.ToString(value: 0) 2.99 2.35 1.28x
System.Tests.Perf_UInt64.ToString(value: 1000000000000000000) 5.54 5.16 1.08x
System.Tests.Perf_UInt64.ToString(value: 18446744073709551615) 6.16 6.14 1.00x
System.Tests.Perf_UInt64.ToString(value: 2) 2.97 2.38 1.25x
System.Tests.Perf_UInt64.ToString(value: 21) 3.13 2.46 1.27x
System.Tests.Perf_UInt64.ToString(value: 214) 3.23 2.67 1.21x
System.Tests.Perf_UInt64.ToString(value: 2147) 3.40 2.86 1.19x
System.Tests.Perf_UInt64.ToString(value: 21474) 3.46 2.93 1.18x
System.Tests.Perf_UInt64.ToString(value: 214748) 3.53 3.03 1.17x
System.Tests.Perf_UInt64.ToString(value: 2147483) 3.62 3.17 1.14x
System.Tests.Perf_UInt64.ToString(value: 21474836) 3.80 3.30 1.15x
System.Tests.Perf_UInt64.ToString(value: 214748364) 4.03 3.58 1.13x
System.Tests.Perf_UInt64.ToString(value: 2147483647) 4.13 3.53 1.17x
System.Tests.Perf_UInt64.ToString(value: 4294967295000000000) 5.58 5.48 1.02x
System.Tests.Perf_UInt64.ToString(value: 4294967295000000001) 5.65 5.19 1.09x
System.Tests.Perf_UInt64.ToString(value: 92233720368) 4.62 4.07 1.13x
System.Tests.Perf_UInt64.ToString(value: 922337203685) 4.75 4.17 1.14x
System.Tests.Perf_UInt64.ToString(value: 9223372036854) 4.74 4.47 1.06x
System.Tests.Perf_UInt64.ToString(value: 92233720368547) 4.95 4.77 1.04x
System.Tests.Perf_UInt64.ToString(value: 922337203685477) 5.02 4.49 1.12x
System.Tests.Perf_UInt64.ToString(value: 9223372036854775) 5.17 4.64 1.11x
System.Tests.Perf_UInt64.ToString(value: 92233720368547758) 5.28 4.88 1.08x
System.Tests.Perf_UInt64.ToString(value: 922337203685477580) 5.55 5.44 1.02x
System.Tests.Perf_UInt64.ToString(value: 9223372036854775807) 5.95 5.53 1.08x
System.Tests.Perf_UInt64.TryFormat(value: 0) 3.14 1.67 1.88x
System.Tests.Perf_UInt64.TryFormat(value: 1000000000000000000) 5.49 4.27 1.28x
System.Tests.Perf_UInt64.TryFormat(value: 18446744073709551615) 5.95 4.69 1.27x
System.Tests.Perf_UInt64.TryFormat(value: 2) 3.12 1.64 1.90x
System.Tests.Perf_UInt64.TryFormat(value: 21) 3.19 1.85 1.72x
System.Tests.Perf_UInt64.TryFormat(value: 214) 3.38 1.94 1.74x
System.Tests.Perf_UInt64.TryFormat(value: 2147) 3.40 2.05 1.66x
System.Tests.Perf_UInt64.TryFormat(value: 21474) 3.48 2.19 1.59x
System.Tests.Perf_UInt64.TryFormat(value: 214748) 3.54 2.35 1.50x
System.Tests.Perf_UInt64.TryFormat(value: 2147483) 3.70 2.40 1.54x
System.Tests.Perf_UInt64.TryFormat(value: 21474836) 3.69 2.65 1.39x
System.Tests.Perf_UInt64.TryFormat(value: 214748364) 3.90 2.67 1.46x
System.Tests.Perf_UInt64.TryFormat(value: 2147483647) 4.06 2.77 1.47x
System.Tests.Perf_UInt64.TryFormat(value: 4294967295000000000) 5.71 4.33 1.32x
System.Tests.Perf_UInt64.TryFormat(value: 4294967295000000001) 5.45 4.26 1.28x
System.Tests.Perf_UInt64.TryFormat(value: 92233720368) 4.47 3.35 1.34x
System.Tests.Perf_UInt64.TryFormat(value: 922337203685) 4.57 3.42 1.34x
System.Tests.Perf_UInt64.TryFormat(value: 9223372036854) 5.05 3.55 1.42x
System.Tests.Perf_UInt64.TryFormat(value: 92233720368547) 4.80 3.73 1.29x
System.Tests.Perf_UInt64.TryFormat(value: 922337203685477) 4.92 3.73 1.32x
System.Tests.Perf_UInt64.TryFormat(value: 9223372036854775) 5.10 3.82 1.34x
System.Tests.Perf_UInt64.TryFormat(value: 92233720368547758) 5.14 3.95 1.30x
System.Tests.Perf_UInt64.TryFormat(value: 922337203685477580) 5.29 4.09 1.30x
System.Tests.Perf_UInt64.TryFormat(value: 9223372036854775807) 5.79 4.61 1.26x

Skipping the temporary buffer results in a very measurable performance boost, as the numbers above show.
@benaadams
Member

benaadams commented Apr 5, 2018

Unrelated, but I'm wondering whether the pointer-arithmetic data dependencies could be broken in some of these loops.

For example, in Int32ToNumber:

int i = (int)(buffer + Int32Precision - p);

number.scale = i;

char* dst = number.digits;
while (--i >= 0)
    *dst++ = *p++;

The loop has a result dependency on the decrement of i, which can't be avoided, but it will likely also hit a result dependency on the increments of both dst and p when computing the data addresses.

So could it be changed to depend only on the result of i, leaving dst and p unchanged?

int count = (int)(buffer + Int32Precision - p);

number.scale = count;

char* dst = number.digits;
for (int i = 0; i < count; i++)
{
    // *(dst + i) = *(p + i);
    dst[i] = p[i];
}

I'm not sure the above is completely correct; I'm easily confused by postfix vs. prefix operators, and the original uses both!
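One way to check that the two loops copy the same characters is to model them in safe code. The sketch below is an analogue, not the pointer code itself: `CopyLoopSketch` and both method names are hypothetical, integer cursors into spans stand in for the `char*` values, and the span bounds checks mean it says nothing about the performance question.

```csharp
using System;

// Hypothetical safe-code model of the two loops; cursors replace pointers.
static class CopyLoopSketch
{
    // Models `while (--i >= 0) *dst++ = *p++;`.
    // Both cursors advance on every iteration. Returns the final dst cursor,
    // which ends up one past the last character written.
    public static int CopyBumpingCursors(ReadOnlySpan<char> src, int p, Span<char> dstBuf, int count)
    {
        int dst = 0;
        int i = count;
        while (--i >= 0)
        {
            dstBuf[dst++] = src[p++];
        }
        return dst;
    }

    // Models `for (int i = 0; i < count; i++) dst[i] = p[i];`.
    // Only i changes; the base positions stay fixed, so the position one
    // past the last character written is count rather than a moved cursor.
    public static int CopyIndexed(ReadOnlySpan<char> src, int p, Span<char> dstBuf, int count)
    {
        for (int i = 0; i < count; i++)
        {
            dstBuf[i] = src[p + i];
        }
        return count;
    }
}
```

Both methods copy exactly `count` characters in the same order. The difference is the state afterwards: in the original loop dst and p have moved, while in the indexed form they have not, so any code after the loop that still uses dst or p (for example, to write a terminator through dst) would need to use `dst[count]` instead.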

@stephentoub
Member Author

@dotnet-bot test Ubuntu arm Cross Checked Innerloop Build and Test please

Member

@jkotas jkotas left a comment

Nice!

@stephentoub stephentoub merged commit 1e6b28c into dotnet:master Apr 5, 2018
@stephentoub stephentoub deleted the portnumericperf branch April 5, 2018 15:55
dotnet-bot pushed a commit to dotnet/corert that referenced this pull request Apr 5, 2018
Improve {u}int/long.ToString/TryFormat throughput by pre-computing the length

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
stephentoub added a commit to dotnet/corert that referenced this pull request Apr 5, 2018
Improve {u}int/long.ToString/TryFormat throughput by pre-computing the length

Signed-off-by: dotnet-bot <dotnet-bot@microsoft.com>
@@ -52,6 +52,7 @@
<Compile Include="$(MSBuildThisFileDirectory)System\Buffers\MemoryManager.cs" />
<Compile Include="$(MSBuildThisFileDirectory)System\Buffers\TlsOverPerCoreLockedStacksArrayPool.cs" />
<Compile Include="$(MSBuildThisFileDirectory)System\Buffers\Utilities.cs" />
<Compile Include="$(MSBuildThisFileDirectory)System\Buffers\Text\FormattingHelpers.CountDigits.cs" />
Member

nit: sort order

Member Author

What's wrong with the sort order? Don't we normally put files in a directory before folders in that directory?

Member

I didn't know that was what we were doing.

Looking at this file, we seem to be following alphabetical order (only):

    <Compile Include="$(MSBuildThisFileDirectory)System\Globalization\UnicodeCategory.cs" />
    <Compile Include="$(MSBuildThisFileDirectory)System\Guid.cs" />

Member

FWIW VS inserts items with simple string sort.

Of course it won't ever insert into an imported file like this.
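The difference between the two orderings discussed here can be seen by sorting the four include paths from the hunk with a plain ordinal string comparison. This is only an illustration: `SortOrderSketch` is a hypothetical helper, the paths are shortened by dropping the `$(MSBuildThisFileDirectory)` prefix, and ordinal comparison is an assumption about what "simple string sort" means.

```csharp
using System;

// Hypothetical helper; not part of the PR or of any build tooling.
static class SortOrderSketch
{
    // The four entries from the hunk, in the order the PR left them
    // (new file after the files in System\Buffers).
    public static readonly string[] AsCommitted =
    {
        @"System\Buffers\MemoryManager.cs",
        @"System\Buffers\TlsOverPerCoreLockedStacksArrayPool.cs",
        @"System\Buffers\Utilities.cs",
        @"System\Buffers\Text\FormattingHelpers.CountDigits.cs",
    };

    // Returns a copy sorted by ordinal string comparison, leaving the input untouched.
    public static string[] OrdinalSorted(string[] includes)
    {
        string[] sorted = (string[])includes.Clone();
        Array.Sort(sorted, StringComparer.Ordinal);
        return sorted;
    }
}
```

Calling `SortOrderSketch.OrdinalSorted(SortOrderSketch.AsCommitted)` moves the `Text\FormattingHelpers.CountDigits.cs` entry to the second position, between `MemoryManager.cs` and `TlsOverPerCoreLockedStacksArrayPool.cs`, because "Te" sorts before "Tl". The committed order instead follows the files-before-folders convention mentioned above.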

@ahsonkhan
Member

Awesome! Perf improvement across the board :)

@stephentoub
Member Author

@Anipik, is the mirror to corefx running? I haven't seen the relevant pieces here mirrored yet. Thanks.

dotnet-bot pushed a commit to dotnet/corefx that referenced this pull request Apr 7, 2018
Improve {u}int/long.ToString/TryFormat throughput by pre-computing the length

Signed-off-by: dotnet-bot-corefx-mirror <dotnet-bot@microsoft.com>
@Anipik

Anipik commented Apr 7, 2018

Done, started the mirror.

@stephentoub
Member Author

Thanks

stephentoub added a commit to dotnet/corefx that referenced this pull request Apr 9, 2018
Improve {u}int/long.ToString/TryFormat throughput by pre-computing the length

Signed-off-by: dotnet-bot-corefx-mirror <dotnet-bot@microsoft.com>