Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove allocations from StringIn when returning a string #368

Merged
merged 16 commits into from
Feb 20, 2022

Conversation

johnynek
Copy link
Collaborator

@johnynek johnynek commented Feb 3, 2022

stringIn currently reallocates the string is returning, however, we know at the outset all possible matches, so we don't need to reallocate. This PR changes so that we return a match without allocating.

Along the way it also refactors the oneOf merging which would need a considerable refactor anyway to deal with this change. I hope the new implementation is more direct.

@codecov-commenter
Copy link

codecov-commenter commented Feb 3, 2022

Codecov Report

Merging #368 (74cf33b) into main (e28ba0a) will decrease coverage by 0.02%.
The diff coverage is 97.11%.

❗ Current head 74cf33b differs from pull request most recent head 4590a69. Consider uploading reports for the commit 4590a69 to get more accurate results

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #368      +/-   ##
==========================================
- Coverage   96.29%   96.27%   -0.03%     
==========================================
  Files           9        9              
  Lines        1135     1263     +128     
  Branches      108      124      +16     
==========================================
+ Hits         1093     1216     +123     
- Misses         42       47       +5     
Impacted Files Coverage Δ
core/shared/src/main/scala/cats/parse/Parser.scala 96.15% <97.11%> (-0.12%) ⬇️
core/jvm/src/main/scala/cats/parse/BitSet.scala 88.23% <0.00%> (+5.88%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e28ba0a...4590a69. Read the comment docs.

@johnynek
Copy link
Collaborator Author

johnynek commented Feb 7, 2022

Unfortunately, this turned out to be relatively challenging.

  1. Removing the allocation for StringIn is not that hard, but it does change how oneOf merging is done
  2. I had to refactor the merging, but then realized that many merges were not often done. When I added more tests to check that it was working well, I realized it wasn't. Upon making it work better, then many of the laws needed to be relaxed.

Ideally we want a | (b | c) == (a | b) | c. We definitely want this to be true in the weaker sense that all parses are equivalent. But normalizing everything equivalently turns out to be very hard and I gave up. Also, the exact error classes are not the same although the offsets of all the failures are the same.

This was a very frustrating PR. I have been working on and off for over a week. I am looking forward to having it behind me.

@johnynek johnynek requested a review from regadas February 11, 2022 05:00
@johnynek
Copy link
Collaborator Author

@regadas if you have time, this is ready.

Sorry it is kind of big. I recommend reading the changes to the tests first, then see if you agree that the RadixNode isn't allocating in matchOrNull, finally review def merge to see how we are merging parsers together. It's pretty error prone to merge these things, but notice that we actually have very strong tests (that assert that composition works as we expect).

During coding I was running with scalacheck doing 20k iterations to make sure I wasn't getting lucky (which is how I found so many regressions). I got it passing reliably set at 20k and then turned it back down (although we build on 3 versions so still we get lots of chances to find errors in CI).

With this and #373 I would say we are ready to publish a new version.

@regadas
Copy link
Collaborator

regadas commented Feb 11, 2022

Hi, @johnynek! I'll have a look later today; It's not forgotten 😄

Copy link
Collaborator

@regadas regadas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @johnynek if I didn’t miss anything this looks great! Really just a few small nit's.

matchAtOrNullLooks good … allocation free.

Did you had time to run some Benchmarks? I'm a bit curious. I might run some on my side.

Regarding release will we include the parser string interpolator given that we are a bit on the fence with it? Tbf I haven’t thought about it that much since the last PR and your comments.

core/shared/src/test/scala/cats/parse/ParserTest.scala Outdated Show resolved Hide resolved
core/shared/src/main/scala/cats/parse/Parser.scala Outdated Show resolved Hide resolved
core/shared/src/main/scala/cats/parse/Parser.scala Outdated Show resolved Hide resolved
core/shared/src/main/scala/cats/parse/Parser.scala Outdated Show resolved Hide resolved
@johnynek
Copy link
Collaborator Author

Thanks for the review @regadas I think I addressed your comments.

Here are benchmarks:

no allocation

[info] Benchmark                          (test)  Mode  Cnt      Score     Error  Units
[info] StringInBenchmarks.linearMatchIn      foo  avgt    5     78.672 ±   0.770  ns/op
[info] StringInBenchmarks.linearMatchIn    broad  avgt    5  93533.393 ± 257.622  ns/op
[info] StringInBenchmarks.oneOfParse         foo  avgt    5     97.350 ±   0.093  ns/op
[info] StringInBenchmarks.oneOfParse       broad  avgt    5   1097.726 ±  61.983  ns/op
[info] StringInBenchmarks.radixMatchIn       foo  avgt    5     59.636 ±   0.185  ns/op
[info] StringInBenchmarks.radixMatchIn     broad  avgt    5    743.822 ±   1.432  ns/op
[info] StringInBenchmarks.stringInSParse     foo  avgt    5     71.939 ±   0.085  ns/op
[info] StringInBenchmarks.stringInSParse   broad  avgt    5   1008.110 ±   0.224  ns/op
[info] StringInBenchmarks.stringInVParse     foo  avgt    5     97.414 ±   0.209  ns/op
[info] StringInBenchmarks.stringInVParse   broad  avgt    5   1388.293 ±   2.132  ns/op
[success] Total time: 810 s (13:30), completed Feb 13, 2022 10:47:36 AM

main (eb4f11237f054482125057dc44f95863b99916dd)
[info] Benchmark                          (test)  Mode  Cnt      Score     Error  Units
[info] StringInBenchmarks.linearMatchIn      foo  avgt    5     78.374 ±   0.025  ns/op
[info] StringInBenchmarks.linearMatchIn    broad  avgt    5  93333.843 ± 321.174  ns/op
[info] StringInBenchmarks.oneOfParse         foo  avgt    5     91.034 ±   0.269  ns/op
[info] StringInBenchmarks.oneOfParse       broad  avgt    5   1029.260 ±   2.056  ns/op
[info] StringInBenchmarks.radixMatchIn       foo  avgt    5     59.986 ±   0.038  ns/op
[info] StringInBenchmarks.radixMatchIn     broad  avgt    5    756.810 ±   2.474  ns/op
[info] StringInBenchmarks.stringInSParse     foo  avgt    5    134.580 ±   0.752  ns/op
[info] StringInBenchmarks.stringInSParse   broad  avgt    5    974.916 ±   2.229  ns/op
[info] StringInBenchmarks.stringInVParse     foo  avgt    5     91.282 ±   2.460  ns/op
[info] StringInBenchmarks.stringInVParse   broad  avgt    5    917.049 ±   2.559  ns/op

It seems that the voided performance in stringIn is slightly worse, but the unvoided (where you return the captured string) is better for the smaller string case (labeled foo here).

I'm actually a bit confused. In master, stringIn.void is not wrapped at all, so calling parse just calls that method. In this PR the void has to be wrapped with a Void(StringIn(_)) which also does some work to set capture to false and then reset it. But more confusingly, in master stringIn(s) produces a StringP(StringIn(_)) which winds up wrapping and then having to allocate a copy of the string (or at least call substring which I thought did a copy after java 8, but before java 8 did a reference into the same string). It's hard to see how that can every be faster, but somehow, due to the magic of the JIT, it seems slightly faster in the broad case (maybe 3% faster), although presumably more memory pressure since it is allocating new objects.

I think getting almost 2x faster for small sets (foo case) and 3% worse for the broad case (which is a bit unrealistic anyway) is probably worth it, but it isn't a killer slam dunk (i.e. there are trade-offs here: some a better, some are a bit worse).

Last note: the current approach to implementing voiding may be suboptimal, it may be that if we implement #379 we would see more of a win on this PR as well.

What do you think @regadas ?

@regadas
Copy link
Collaborator

regadas commented Feb 19, 2022

Hi @johnynek,

I think getting almost 2x faster for small sets (foo case) and 3% worse for the broad case (which is a bit unrealistic anyway) is probably worth it, but it isn't a killer slam dunk (i.e. there are trade-offs here: some a better, some are a bit worse).

I agree with you here. TBH the fact that there's no allocation right now is a big win and I do feel that the "unrealistic" can definitely be targeted on follow-ups. This is a good first step!

That said, I'm +1 in merging this.

@johnynek johnynek merged commit 4c58199 into main Feb 20, 2022
@johnynek johnynek deleted the oscar/string_in_capture branch February 20, 2022 18:33
i10416 pushed a commit to i10416/cats-parse that referenced this pull request Mar 20, 2022
* checkpoint with compiling

* refactor merging

* use DefiniteString more

* try to fix tests

* get 3.0.2 passing

* fix mima, improve oneOf

* checkpoint

* get tests passing

* more tests, format, mima

* remove unused code

* simplify isVoided

* cleanup

* add isUnit

* address review comments

* add non-voided stringIn benchmarks
i10416 pushed a commit to i10416/cats-parse that referenced this pull request Mar 20, 2022
* checkpoint with compiling

* refactor merging

* use DefiniteString more

* try to fix tests

* get 3.0.2 passing

* fix mima, improve oneOf

* checkpoint

* get tests passing

* more tests, format, mima

* remove unused code

* simplify isVoided

* cleanup

* add isUnit

* address review comments

* add non-voided stringIn benchmarks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants