Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor of sieve in Gen #610

Merged
merged 4 commits into from
May 6, 2020
Merged

Refactor of sieve in Gen #610

merged 4 commits into from
May 6, 2020

Conversation

ashawley
Copy link
Contributor

Code cherry-picked from #575

This is part of work on Gen performance improvements.
@ashawley
Copy link
Contributor Author

This first revision of removing sieveCopy doubles the performance of benchmarks of string generators, and triples Gen.frequency, and makes Gen.filter a 15x improvement.

> bench/jmh:run -wi 3 -i 3 -f 1 -t 1 -w 2 -r 2 org.scalacheck.bench.GenBench
Benchmark                   (genSize)  (seedCount)  Mode  Cnt     Score      Error  Units
GenBench.arbitraryString          100          100  avgt    3  1853.995 ±  221.492  us/op
GenBench.asciiPrintableStr        100          100  avgt    3   801.799 ±  327.992  us/op
GenBench.const_                   100          100  avgt    3     2.963 ±    0.301  us/op
GenBench.double_                  100          100  avgt    3    12.221 ±    3.006  us/op
GenBench.dynamicFrequency         100          100  avgt    3   685.853 ±  117.333  us/op
GenBench.eitherIntInt             100          100  avgt    3    47.075 ±   10.873  us/op
GenBench.identifier               100          100  avgt    3  2775.786 ±  673.908  us/op
GenBench.int_                     100          100  avgt    3    14.234 ±    2.510  us/op
GenBench.listOfInt                100          100  avgt    3   637.542 ±  329.091  us/op
GenBench.mapOfIntInt              100          100  avgt    3  2493.999 ±  298.004  us/op
GenBench.oneOf                    100          100  avgt    3    21.700 ±    2.521  us/op
GenBench.optionInt                100          100  avgt    3    60.328 ±   14.836  us/op
GenBench.sequence                 100          100  avgt    3   268.916 ±   99.119  us/op
GenBench.staticFrequency          100          100  avgt    3   631.528 ±   41.398  us/op
GenBench.testFilter               100          100  avgt    3  1343.060 ±   96.549  us/op
GenBench.zipIntInt                100          100  avgt    3    32.363 ±   10.349  us/op

In particular, this adds retrying to the infiniteStream, stringOf, and
stringOfN combinators. It also slightly optimizes the loop conditions
on buildableOfN, removes lazy from most arbitrary definitions, and
cleans up a few other things.

This commit does appear to have made some benchmarks slower, although
it's possible my machine is just more busy than it was. I've also
added a few more benchmarks.
@ashawley
Copy link
Contributor Author

Despite the improvements, the Travis build here still takes 57 min. Although, the majority of that time is compilation and sbt runtime. I collected some sbt task times (e.g. "Total time: 226 s (03:46), completed Jan 10, 2020 9:10:47 PM") from the Travis build. It seems the test suite takes about 25 mins across the build matrix. This improvement was a 90s improvement, or 6%.

@ashawley
Copy link
Contributor Author

One of the todo list items Erik had was:

  1. Try restoring the sieves and see how much of the improvement we lose, if any.

This should answer that question. I'll cherry-pick the subsequent commits and see if the performance changes any further.

@ashawley
Copy link
Contributor Author

The subsequent revision to buildableOfN and infiniteStream in 3cbcf07, here as 48e6511, doesn't show any change in performance.

> bench/jmh:run -wi 3 -i 3 -f 1 -t 1 -w 2 -r 2 org.scalacheck.bench.GenBench
Benchmark                   (genSize)  (seedCount)  Mode  Cnt     Score      Error  Units
GenBench.arbitraryString          100          100  avgt    3  1848.030 ±  275.967  us/op
GenBench.asciiPrintableStr        100          100  avgt    3   821.699 ±  295.769  us/op
GenBench.const_                   100          100  avgt    3     3.039 ±    0.361  us/op
GenBench.double_                  100          100  avgt    3    12.804 ±    3.896  us/op
GenBench.dynamicFrequency         100          100  avgt    3   712.341 ±  108.979  us/op
GenBench.eitherIntInt             100          100  avgt    3    46.519 ±    7.378  us/op
GenBench.identifier               100          100  avgt    3  2898.351 ± 1129.931  us/op
GenBench.int_                     100          100  avgt    3    14.417 ±    1.448  us/op
GenBench.listOfInt                100          100  avgt    3   644.638 ±  131.945  us/op
GenBench.mapOfIntInt              100          100  avgt    3  2516.313 ± 1336.180  us/op
GenBench.oneOf                    100          100  avgt    3    22.042 ±    3.030  us/op
GenBench.optionInt                100          100  avgt    3    59.512 ±    8.184  us/op
GenBench.sequence                 100          100  avgt    3   271.875 ±   17.812  us/op
GenBench.staticFrequency          100          100  avgt    3   656.798 ±   53.329  us/op
GenBench.testFilter               100          100  avgt    3  1435.407 ±  128.368  us/op
GenBench.zipIntInt                100          100  avgt    3    34.030 ±   11.320  us/op

@ashawley
Copy link
Contributor Author

The last small revision to buildableOfN in 7f46242, here as 7e1e523, was just a change for correctness so no performance change.

> bench/jmh:run -wi 3 -i 3 -f 1 -t 1 -w 2 -r 2 org.scalacheck.bench.GenBench
Benchmark                   (genSize)  (seedCount)  Mode  Cnt     Score     Error  Units
GenBench.arbitraryString          100          100  avgt    3  1814.315 ± 480.772  us/op
GenBench.asciiPrintableStr        100          100  avgt    3   819.163 ± 311.835  us/op
GenBench.const_                   100          100  avgt    3     3.032 ±   0.816  us/op
GenBench.double_                  100          100  avgt    3    12.910 ±   3.473  us/op
GenBench.dynamicFrequency         100          100  avgt    3   725.013 ±  36.428  us/op
GenBench.eitherIntInt             100          100  avgt    3    49.231 ±  47.973  us/op
GenBench.identifier               100          100  avgt    3  2911.446 ± 468.532  us/op
GenBench.int_                     100          100  avgt    3    14.499 ±   1.528  us/op
GenBench.listOfInt                100          100  avgt    3   644.523 ±  53.210  us/op
GenBench.mapOfIntInt              100          100  avgt    3  2603.009 ± 762.334  us/op
GenBench.oneOf                    100          100  avgt    3    22.792 ±   6.595  us/op
GenBench.optionInt                100          100  avgt    3    61.232 ±  20.916  us/op
GenBench.sequence                 100          100  avgt    3   279.771 ± 150.628  us/op
GenBench.staticFrequency          100          100  avgt    3   665.220 ± 111.553  us/op
GenBench.testFilter               100          100  avgt    3  1418.581 ± 434.058  us/op
GenBench.zipIntInt                100          100  avgt    3    34.623 ±  29.684  us/op

@ashawley ashawley marked this pull request as ready for review January 15, 2020 15:54
@ashawley ashawley added this to the 1.15.0 milestone Feb 5, 2020
Co-authored-by: Filippo Mariotti <barambani@users.noreply.github.com>
@non non merged commit c5ddde0 into typelevel:master May 6, 2020
@ashawley ashawley deleted the sieve-refactor branch May 22, 2020 21:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants