Add BenchmarkBytes #125
Merged
Add a benchmark that uses cmp.{Equal,Diff} to measure performance of comparing large byte slices. Comparing large slices of primitives is generally considered the most pathological use-case for cmp since the rules of cmp dictate that it must run the filters on every element of the slice even if the filters never apply.

In theory, a user of cmp could short-circuit comparison of slices by providing a comparer for slices (e.g., cmp.Comparer(bytes.Equal)), but it would be ideal for cmp to be sufficiently optimized that comparing large slices is not too costly.

BenchmarkBytes/4KiB/EqualFilter0-4     300      3792755 ns/op   2.16 MB/s       628702 B/op       8276 allocs/op
BenchmarkBytes/4KiB/EqualFilter1-4     200      6058637 ns/op   1.35 MB/s       890920 B/op      16471 allocs/op
BenchmarkBytes/4KiB/EqualFilter2-4     200      7498098 ns/op   1.09 MB/s       890952 B/op      16472 allocs/op
BenchmarkBytes/4KiB/EqualFilter3-4     100     10094237 ns/op   0.81 MB/s       891016 B/op      16473 allocs/op
BenchmarkBytes/4KiB/EqualFilter4-4     100     11128655 ns/op   0.74 MB/s       891016 B/op      16473 allocs/op
BenchmarkBytes/4KiB/EqualFilter5-4     100     12952062 ns/op   0.63 MB/s       891144 B/op      16474 allocs/op
BenchmarkBytes/4KiB/DiffFilter0-4      300      4015119 ns/op   2.04 MB/s       629425 B/op       8305 allocs/op
BenchmarkBytes/4KiB/DiffFilter1-4      200      6457963 ns/op   1.27 MB/s       891738 B/op      16501 allocs/op
BenchmarkBytes/4KiB/DiffFilter2-4      200      8371022 ns/op   0.98 MB/s       891770 B/op      16502 allocs/op
BenchmarkBytes/4KiB/DiffFilter3-4      200      9420398 ns/op   0.87 MB/s       891834 B/op      16503 allocs/op
BenchmarkBytes/4KiB/DiffFilter4-4      100     12493540 ns/op   0.66 MB/s       891835 B/op      16503 allocs/op
BenchmarkBytes/4KiB/DiffFilter5-4      100     14208928 ns/op   0.58 MB/s       891965 B/op      16504 allocs/op
BenchmarkBytes/64KiB/EqualFilter0-4     20     66146276 ns/op   1.98 MB/s     10826734 B/op     131206 allocs/op
BenchmarkBytes/64KiB/EqualFilter1-4     10    102687117 ns/op   1.28 MB/s     15021123 B/op     262281 allocs/op
BenchmarkBytes/64KiB/EqualFilter2-4     10    125444269 ns/op   1.04 MB/s     15021174 B/op     262282 allocs/op
BenchmarkBytes/64KiB/EqualFilter3-4     10    154392619 ns/op   0.85 MB/s     15021219 B/op     262283 allocs/op
BenchmarkBytes/64KiB/EqualFilter4-4     10    183307772 ns/op   0.72 MB/s     15021219 B/op     262283 allocs/op
BenchmarkBytes/64KiB/EqualFilter5-4      5    212334761 ns/op   0.62 MB/s     15021340 B/op     262284 allocs/op
BenchmarkBytes/64KiB/DiffFilter0-4      20     67147375 ns/op   1.95 MB/s     10828312 B/op     131244 allocs/op
BenchmarkBytes/64KiB/DiffFilter1-4      10    105385709 ns/op   1.24 MB/s     15022724 B/op     262319 allocs/op
BenchmarkBytes/64KiB/DiffFilter2-4      10    135994379 ns/op   0.96 MB/s     15022763 B/op     262320 allocs/op
BenchmarkBytes/64KiB/DiffFilter3-4      10    159346836 ns/op   0.82 MB/s     15022820 B/op     262321 allocs/op
BenchmarkBytes/64KiB/DiffFilter4-4      10    190129916 ns/op   0.69 MB/s     15022843 B/op     262321 allocs/op
BenchmarkBytes/64KiB/DiffFilter5-4       5    208805873 ns/op   0.63 MB/s     15022996 B/op     262322 allocs/op
BenchmarkBytes/1MiB/EqualFilter0-4       2    978302544 ns/op   2.14 MB/s    173184064 B/op    2097346 allocs/op
BenchmarkBytes/1MiB/EqualFilter1-4       1   1543190869 ns/op   1.36 MB/s    240293104 B/op    4194502 allocs/op
BenchmarkBytes/1MiB/EqualFilter2-4       1   1998443802 ns/op   1.05 MB/s    240293232 B/op    4194504 allocs/op
BenchmarkBytes/1MiB/EqualFilter3-4       1   2507293058 ns/op   0.84 MB/s    240293328 B/op    4194506 allocs/op
BenchmarkBytes/1MiB/EqualFilter4-4       1   2981132381 ns/op   0.70 MB/s    240293008 B/op    4194502 allocs/op
BenchmarkBytes/1MiB/EqualFilter5-4       1   3351177035 ns/op   0.63 MB/s    240293424 B/op    4194506 allocs/op
BenchmarkBytes/1MiB/DiffFilter0-4        1   1132136753 ns/op   1.85 MB/s    173185752 B/op    2097384 allocs/op
BenchmarkBytes/1MiB/DiffFilter1-4        1   1666196345 ns/op   1.26 MB/s    240294504 B/op    4194537 allocs/op
BenchmarkBytes/1MiB/DiffFilter2-4        1   2204467232 ns/op   0.95 MB/s    240294600 B/op    4194539 allocs/op
BenchmarkBytes/1MiB/DiffFilter3-4        1   2499107753 ns/op   0.84 MB/s    240294600 B/op    4194539 allocs/op
BenchmarkBytes/1MiB/DiffFilter4-4        1   2966222324 ns/op   0.71 MB/s    240295016 B/op    4194544 allocs/op
BenchmarkBytes/1MiB/DiffFilter5-4        1   3382045549 ns/op   0.62 MB/s    240294728 B/op    4194540 allocs/op
BenchmarkBytes/16MiB/EqualFilter0-4      1  16585516720 ns/op   2.02 MB/s   2967917664 B/op   33554689 allocs/op
BenchmarkBytes/16MiB/EqualFilter1-4      1  23980880452 ns/op   1.40 MB/s   4041659536 B/op   67109123 allocs/op
BenchmarkBytes/16MiB/EqualFilter2-4      1  30729382462 ns/op   1.09 MB/s   4041659568 B/op   67109124 allocs/op
BenchmarkBytes/16MiB/EqualFilter3-4      1  37830223988 ns/op   0.89 MB/s   4041660016 B/op   67109129 allocs/op
BenchmarkBytes/16MiB/EqualFilter4-4      1  44731081109 ns/op   0.75 MB/s   4041659536 B/op   67109124 allocs/op
BenchmarkBytes/16MiB/EqualFilter5-4      1  52110015114 ns/op   0.64 MB/s   4041659760 B/op   67109126 allocs/op
BenchmarkBytes/16MiB/DiffFilter0-4       1  20349410654 ns/op   1.65 MB/s   2967919128 B/op   33554724 allocs/op
BenchmarkBytes/16MiB/DiffFilter1-4       1  27073250483 ns/op   1.24 MB/s   4041661320 B/op   67109163 allocs/op
BenchmarkBytes/16MiB/DiffFilter2-4       1  32223912220 ns/op   1.04 MB/s   4041661064 B/op   67109160 allocs/op
BenchmarkBytes/16MiB/DiffFilter3-4       1  39189759283 ns/op   0.86 MB/s   4041661128 B/op   67109161 allocs/op
BenchmarkBytes/16MiB/DiffFilter4-4       1  48344470628 ns/op   0.69 MB/s   4041661256 B/op   67109163 allocs/op
BenchmarkBytes/16MiB/DiffFilter5-4       1  51184873999 ns/op   0.66 MB/s   4041661256 B/op   67109162 allocs/op

The last benchmark shows that the current implementation allocates an astonishing 3.75GiB just to compare a 16MiB byte slice.
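As a rough cross-check of the memory columns, the sketch below divides the B/op and allocs/op figures of three 16MiB rows by the slice length to get a per-element cost. It is not part of this PR: the `row` type and the `perElement` and `gib` helpers are hypothetical names introduced only for illustration, and the input numbers are copied from the benchmark output in the description.

```go
package main

import "fmt"

// row holds the memory columns of one benchmark result line.
// (Hypothetical type, used only for this illustration.)
type row struct {
	name        string
	sliceBytes  int64 // length of the compared byte slice
	bytesPerOp  int64 // B/op column
	allocsPerOp int64 // allocs/op column
}

// perElement returns the allocated bytes and the number of allocations
// per element (that is, per byte) of the compared slice.
func perElement(r row) (bytesPerElem, allocsPerElem float64) {
	n := float64(r.sliceBytes)
	return float64(r.bytesPerOp) / n, float64(r.allocsPerOp) / n
}

// gib converts a byte count to GiB.
func gib(b int64) float64 { return float64(b) / (1 << 30) }

func main() {
	const mib = 1 << 20
	// Values copied from the 16MiB rows of the benchmark output.
	rows := []row{
		{"16MiB/EqualFilter0", 16 * mib, 2967917664, 33554689},
		{"16MiB/EqualFilter5", 16 * mib, 4041659760, 67109126},
		{"16MiB/DiffFilter5", 16 * mib, 4041661256, 67109162},
	}
	for _, r := range rows {
		b, a := perElement(r)
		fmt.Printf("%-20s %.2f GiB total, %.1f B/elem, %.2f allocs/elem\n",
			r.name, gib(r.bytesPerOp), b, a)
	}
}
```

On these figures the cost works out to roughly 2 allocations and about 177 bytes per compared byte with no filters, and roughly 4 allocations and about 241 bytes per compared byte once any filter is present, which is consistent with the multi-GiB total quoted for the last benchmark.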
cybrcodr
approved these changes
Feb 27, 2019