BigInteger parsing optimization for large decimal string #51953

Closed

key-moon
Contributor

@key-moon key-moon commented Apr 27, 2021

The current BigNumber.NumberToBigInteger method uses a naive algorithm that runs in Θ(N^2) time, where N is the number of digits. I implemented a faster method, a divide-and-conquer algorithm that runs in Θ(N (log(N))^2) time. Because this algorithm's running time has a large constant factor, the naive method is faster when N is small, so the new method is applied only when N is large enough (specifically, the divide-and-conquer method is used when N is more than 20000).
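
To illustrate the idea, here is a minimal sketch of divide-and-conquer decimal parsing built on the public `System.Numerics.BigInteger` API. It is not the code in this PR: the class name `DecimalParseSketch`, the method names, and the `LeafDigits` cutoff are hypothetical, and the PR works on internal buffers rather than on `BigInteger` values. The sketch relies on the identity value = high * 10^len(low) + low; its actual running time depends on the multiplication algorithm behind `BigInteger`, and it recomputes powers of ten that a tuned implementation would likely cache.

```csharp
using System;
using System.Numerics;

public static class DecimalParseSketch
{
    // Hypothetical leaf size for this sketch only. The PR switches
    // algorithms at 20000 digits; this small value just keeps the
    // recursion shallow for short inputs.
    private const int LeafDigits = 9;

    // Naive parse: multiply the accumulator by 10 for every digit.
    // Each step touches the whole accumulator, hence quadratic time.
    public static BigInteger ParseNaive(ReadOnlySpan<char> digits)
    {
        BigInteger result = BigInteger.Zero;
        foreach (char c in digits)
        {
            if (c < '0' || c > '9')
            {
                throw new FormatException("Only ASCII decimal digits are supported in this sketch.");
            }
            result = result * 10 + (c - '0');
        }
        return result;
    }

    // Divide and conquer: split the digits into a high and a low half,
    // parse each half recursively, then combine them as
    // high * 10^(number of low digits) + low.
    public static BigInteger ParseDivideAndConquer(ReadOnlySpan<char> digits)
    {
        if (digits.Length <= LeafDigits)
        {
            return ParseNaive(digits);
        }

        int lowLength = digits.Length / 2;
        int highLength = digits.Length - lowLength;

        BigInteger high = ParseDivideAndConquer(digits.Slice(0, highLength));
        BigInteger low = ParseDivideAndConquer(digits.Slice(highLength));

        return high * BigInteger.Pow(10, lowLength) + low;
    }
}
```

A call such as `DecimalParseSketch.ParseDivideAndConquer(new string('7', 50_000))` should return the same value as `BigInteger.Parse` on the same string.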

I created this branch from #47842, as it looks like #47842 will be merged shortly.

benchmark result

Previous method
BenchmarkDotNet=v0.12.1.1528-nightly, OS=Windows 10.0.19042.928 (20H2/October2020Update)
Intel Core i7-7500U CPU 2.70GHz (Kaby Lake), 1 CPU, 4 logical and 2 physical cores
.NET SDK=6.0.100-preview.3.21202.5
  [Host]     : .NET 6.0.0 (6.0.21.20104), X64 RyuJIT
  Job-CHCQPG : .NET 6.0.0 (42.42.42.42424), X64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Arguments=/p:DebugType=portable  Toolchain=CoreRun  
IterationTime=250.0000 ms  MaxIterationCount=20  MinIterationCount=15  
WarmupCount=1  

| Method | numberString | Mean | Error | StdDev | Median | Min | Max | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Parse | 12345678901(...)01234567890 [20000] | 2.611 ms | 0.1786 ms | 0.2057 ms | 2.661 ms | 2.208 ms | 2.996 ms | 26.7857 | - | - | 56 KB |
| Parse | 12345678901(...)01234567890 [40000] | 9.028 ms | 0.7436 ms | 0.8265 ms | 8.816 ms | 8.072 ms | 10.909 ms | 31.2500 | - | - | 96 KB |
| Parse | 12345678901(...)01234567890 [60000] | 18.521 ms | 0.5179 ms | 0.5756 ms | 18.552 ms | 17.349 ms | 19.541 ms | 71.4286 | - | - | 151 KB |
| Parse | 12345678901(...)01234567890 [80000] | 32.904 ms | 1.1486 ms | 1.2290 ms | 32.761 ms | 30.984 ms | 34.686 ms | - | - | - | 191 KB |
| Parse | 12345678901(...)01234567890 [100000] | 51.271 ms | 1.7700 ms | 2.0384 ms | 51.186 ms | 48.447 ms | 54.856 ms | - | - | - | 246 KB |
| Parse | 12345678901(...)01234567890 [120000] | 74.285 ms | 3.6392 ms | 3.8939 ms | 73.489 ms | 68.604 ms | 82.719 ms | - | - | - | 286 KB |
| Parse | 12345678901(...)01234567890 [140000] | 100.248 ms | 5.2825 ms | 5.8715 ms | 98.712 ms | 92.597 ms | 113.046 ms | - | - | - | 341 KB |
| Parse | 12345678901(...)01234567890 [160000] | 127.443 ms | 4.8226 ms | 5.5537 ms | 125.749 ms | 120.504 ms | 138.583 ms | - | - | - | 381 KB |
| Parse | 12345678901(...)01234567890 [180000] | 162.028 ms | 4.9852 ms | 5.7409 ms | 162.142 ms | 151.841 ms | 171.640 ms | - | - | - | 436 KB |
| Parse | 12345678901(...)01234567890 [200000] | 224.511 ms | 16.7864 ms | 19.3313 ms | 224.392 ms | 192.558 ms | 255.098 ms | - | - | - | 476 KB |
| Parse | 12345678901(...)01234567890 [220000] | 240.451 ms | 10.9087 ms | 12.1250 ms | 234.971 ms | 226.956 ms | 265.072 ms | - | - | - | 531 KB |
| Parse | 12345678901(...)01234567890 [240000] | 300.378 ms | 25.1560 ms | 27.9608 ms | 296.348 ms | 267.565 ms | 357.131 ms | - | - | - | 572 KB |
| Parse | 12345678901(...)01234567890 [260000] | 346.397 ms | 26.8156 ms | 29.8054 ms | 335.024 ms | 320.430 ms | 412.120 ms | - | - | - | 626 KB |
| Parse | 12345678901(...)01234567890 [280000] | 384.442 ms | 12.8365 ms | 14.2678 ms | 381.489 ms | 367.624 ms | 420.767 ms | - | - | - | 666 KB |
| Parse | 12345678901(...)01234567890 [300000] | 433.914 ms | 9.1847 ms | 10.5772 ms | 433.046 ms | 414.254 ms | 452.394 ms | - | - | - | 721 KB |

Implemented method
BenchmarkDotNet=v0.12.1.1528-nightly, OS=Windows 10.0.19042.928 (20H2/October2020Update)
Intel Core i7-7500U CPU 2.70GHz (Kaby Lake), 1 CPU, 4 logical and 2 physical cores
.NET SDK=6.0.100-preview.3.21202.5
  [Host]     : .NET 6.0.0 (6.0.21.20104), X64 RyuJIT
  Job-YASIZJ : .NET 6.0.0 (42.42.42.42424), X64 RyuJIT

PowerPlanMode=00000000-0000-0000-0000-000000000000  Arguments=/p:DebugType=portable  Toolchain=CoreRun  
IterationTime=250.0000 ms  MaxIterationCount=20  MinIterationCount=15  
WarmupCount=1  

| Method | numberString | Mean | Error | StdDev | Median | Min | Max | Gen 0 | Gen 1 | Gen 2 | Allocated |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Parse | 12345678901(...)01234567890 [20000] | 2.017 ms | 0.0559 ms | 0.0643 ms | 2.024 ms | 1.921 ms | 2.127 ms | 140.6250 | - | - | 288 KB |
| Parse | 12345678901(...)01234567890 [40000] | 6.237 ms | 0.1116 ms | 0.0932 ms | 6.218 ms | 6.127 ms | 6.434 ms | 437.5000 | 83.3333 | - | 994 KB |
| Parse | 12345678901(...)01234567890 [60000] | 15.522 ms | 1.3853 ms | 1.5953 ms | 15.153 ms | 13.686 ms | 18.949 ms | 1062.5000 | 218.7500 | - | 2,831 KB |
| Parse | 12345678901(...)01234567890 [80000] | 20.554 ms | 0.4685 ms | 0.5208 ms | 20.429 ms | 19.880 ms | 21.552 ms | 1181.8182 | 181.8182 | - | 3,572 KB |
| Parse | 12345678901(...)01234567890 [100000] | 39.146 ms | 1.8050 ms | 1.8536 ms | 38.681 ms | 36.992 ms | 44.189 ms | 3000.0000 | 333.3333 | - | 7,680 KB |
| Parse | 12345678901(...)01234567890 [120000] | 47.999 ms | 1.5866 ms | 1.8271 ms | 47.429 ms | 46.147 ms | 51.705 ms | 3750.0000 | 250.0000 | - | 10,447 KB |
| Parse | 12345678901(...)01234567890 [140000] | 41.915 ms | 1.3341 ms | 1.3700 ms | 41.657 ms | 40.299 ms | 44.621 ms | 2166.6667 | 500.0000 | - | 6,532 KB |
| Parse | 12345678901(...)01234567890 [160000] | 67.668 ms | 1.4404 ms | 1.6010 ms | 67.296 ms | 65.587 ms | 70.763 ms | 4750.0000 | 250.0000 | - | 12,897 KB |
| Parse | 12345678901(...)01234567890 [180000] | 103.943 ms | 2.4616 ms | 2.8347 ms | 103.285 ms | 100.249 ms | 109.843 ms | 6500.0000 | 500.0000 | - | 18,962 KB |
| Parse | 12345678901(...)01234567890 [200000] | 136.343 ms | 3.9093 ms | 4.0146 ms | 135.607 ms | 131.575 ms | 148.198 ms | 9500.0000 | 500.0000 | - | 28,922 KB |
| Parse | 12345678901(...)01234567890 [220000] | 154.695 ms | 3.6123 ms | 3.7096 ms | 154.182 ms | 149.801 ms | 163.375 ms | 17500.0000 | 500.0000 | - | 40,999 KB |
| Parse | 12345678901(...)01234567890 [240000] | 184.792 ms | 20.8938 ms | 24.0613 ms | 173.449 ms | 163.513 ms | 234.269 ms | 16000.0000 | 1000.0000 | - | 39,244 KB |
| Parse | 12345678901(...)01234567890 [260000] | 160.434 ms | 3.0439 ms | 3.5053 ms | 160.026 ms | 155.602 ms | 167.424 ms | 12500.0000 | 500.0000 | - | 33,322 KB |
| Parse | 12345678901(...)01234567890 [280000] | 134.838 ms | 2.4207 ms | 2.2644 ms | 134.531 ms | 132.314 ms | 139.587 ms | 9500.0000 | 500.0000 | - | 22,696 KB |
| Parse | 12345678901(...)01234567890 [300000] | 140.690 ms | 2.4686 ms | 2.4245 ms | 141.285 ms | 134.758 ms | 144.021 ms | 7500.0000 | 1500.0000 | 1000.0000 | 24,318 KB |

@ghost

ghost commented Apr 27, 2021

Tagging subscribers to this area: @tannergooding, @pgovind
See info in area-owners.md if you want to be subscribed.

Author: key-moon
Assignees: -
Labels:

area-System.Numerics

Milestone: -

@dnfadmin

dnfadmin commented Apr 27, 2021

CLA assistant check
All CLA requirements met.

@jeffhandley
Member

@pgovind Could you review this when you get a chance please?

@pgovind

pgovind commented May 19, 2021

Would you mind rebasing this PR on the latest main? I spent a little bit of time looking at the PR today, and the changes from #47842 make it hard to discern this PR's changes.

@key-moon
Contributor Author

Thanks for reviewing. I rebased onto the current main branch. Since that required rewriting history, I force-pushed the change. Sorry for the inconvenience.

}
else
{
if (numberScale < 0)

Just adding a note to myself: This is the real change in the PR. The block above is just refactoring inside an if. I'll review this PR this week. It's taking some time since I need to go through the algorithm first and then review the implementation here.

@@ -494,35 +494,242 @@ private static bool HexNumberToBigInteger(ref BigNumberBuffer number, out BigInt
}
}

private static int s_naiveThreshold = 20000;
Member


Nit:

private const int NaiveThreshold = 20_000;

Member


Also, it'd be helpful to add a comment about how this value was selected.
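
As a rough sketch of what the two suggestions above could look like together, the constant might carry a comment and be used through a small dispatch check. The class name `ThresholdSketch`, the helper `UsesDivideAndConquer`, and the comment wording are hypothetical; the only detail taken from this PR is that the divide-and-conquer method is used when N is more than 20000.

```csharp
public static class ThresholdSketch
{
    // Inputs with more digits than this take the divide-and-conquer path.
    // Per the PR description, the naive method is faster for small inputs
    // because the divide-and-conquer method has a large constant factor.
    // How the exact value was chosen is not stated in the PR; a real
    // comment would record the measurements behind it.
    private const int NaiveThreshold = 20_000;

    // Hypothetical helper: true when the digit count exceeds the threshold.
    public static bool UsesDivideAndConquer(int digitCount)
    {
        return digitCount > NaiveThreshold;
    }
}
```

Using `const` instead of a mutable static field lets the compiler inline the value and prevents accidental modification at run time.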

Co-authored-by: Stephen Toub <stoub@microsoft.com>
@jeffhandley
Member

@key-moon and @dotnet/area-system-numerics, I just realized that this PR was still open, but the originating branch has been deleted. I'm going to close the PR, but if you want to proceed with it, we could still carry forward starting with the last commit.

@jeffhandley jeffhandley closed this Jul 3, 2021
@key-moon
Contributor Author

key-moon commented Jul 3, 2021

I accidentally deleted a forked repository yesterday. I want to continue from the last commit. What do I need to do? I apologize for the inconvenience caused.

@jeffhandley
Member

No problem; I'm sorry we let the PR sit as long as we did.

I tried to see if I could recover the commit, but I'm not sure if that will work. I think the best bet here is to create a new branch and reapply the changes that are still shown in this PR's Files changed tab. Since the changes look pretty well-contained, I think that'll be easier than trying to pull off some GitHub/git magic to recover the actual previous commits. From there, you can create a new PR and link to this one in the new PR description.

@ghost ghost locked as resolved and limited conversation to collaborators Aug 2, 2021