Add BenchmarkBytes #125

Merged
merged 1 commit into from
Feb 27, 2019
@dsnet (Collaborator) commented Feb 27, 2019

Add a benchmark that uses cmp.{Equal,Diff} to measure performance of
comparing large byte slices. Comparing large slices of primitives is
generally considered the most pathological use-case for cmp since the
rules of cmp dictate that it must run the filters on every element
of the slice even if the filters never apply.

In theory, a user of cmp could short-circuit comparison of slices by
providing a comparer for slices (e.g., cmp.Comparer(bytes.Equal)),
but it would be ideal for cmp to be sufficiently optimized that comparing
large slices is not too costly.
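
To make the cost difference concrete, here is a minimal toy model using only the standard library. It is not cmp's implementation: the `filter` type and `equalPerElement` function are hypothetical names invented for this sketch. It only counts how often a never-matching filter would be consulted when every element is visited, and contrasts that with a single `bytes.Equal` call over the whole slice.

```go
package main

import (
	"bytes"
	"fmt"
)

// filter is a hypothetical stand-in for a cmp filter: a predicate that is
// consulted for a pair of values and reports whether a custom rule applies.
type filter func(x, y byte) bool

// equalPerElement models the per-element path described above: every filter
// is evaluated for every element, even when no filter ever applies.
// It returns the comparison result and the number of filter evaluations.
func equalPerElement(x, y []byte, filters []filter) (equal bool, evals int) {
	if len(x) != len(y) {
		return false, 0
	}
	for i := range x {
		for _, f := range filters {
			evals++
			_ = f(x[i], y[i]) // the result is ignored: the filter never applies here
		}
		if x[i] != y[i] {
			return false, evals
		}
	}
	return true, evals
}

func main() {
	x := bytes.Repeat([]byte{0xAB}, 4<<10) // 4KiB, the smallest size benchmarked
	y := bytes.Repeat([]byte{0xAB}, 4<<10)

	never := func(a, b byte) bool { return false }
	filters := []filter{never, never, never, never, never}

	eq, evals := equalPerElement(x, y, filters)
	fmt.Println("per-element:", eq, "filter evaluations:", evals)

	// The slice-level alternative: one call, no per-element filter work.
	fmt.Println("bytes.Equal:", bytes.Equal(x, y))
}
```

In this model the filter work grows with len(slice) × len(filters), whereas the slice-level comparison does none of it. With cmp itself, the equivalent opt-out is the option named above, cmp.Comparer(bytes.Equal).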

BenchmarkBytes/4KiB/EqualFilter0-4         	     300	   3792755 ns/op	   2.16 MB/s	  628702 B/op	    8276 allocs/op
BenchmarkBytes/4KiB/EqualFilter1-4         	     200	   6058637 ns/op	   1.35 MB/s	  890920 B/op	   16471 allocs/op
BenchmarkBytes/4KiB/EqualFilter2-4         	     200	   7498098 ns/op	   1.09 MB/s	  890952 B/op	   16472 allocs/op
BenchmarkBytes/4KiB/EqualFilter3-4         	     100	  10094237 ns/op	   0.81 MB/s	  891016 B/op	   16473 allocs/op
BenchmarkBytes/4KiB/EqualFilter4-4         	     100	  11128655 ns/op	   0.74 MB/s	  891016 B/op	   16473 allocs/op
BenchmarkBytes/4KiB/EqualFilter5-4         	     100	  12952062 ns/op	   0.63 MB/s	  891144 B/op	   16474 allocs/op
BenchmarkBytes/4KiB/DiffFilter0-4          	     300	   4015119 ns/op	   2.04 MB/s	  629425 B/op	    8305 allocs/op
BenchmarkBytes/4KiB/DiffFilter1-4          	     200	   6457963 ns/op	   1.27 MB/s	  891738 B/op	   16501 allocs/op
BenchmarkBytes/4KiB/DiffFilter2-4          	     200	   8371022 ns/op	   0.98 MB/s	  891770 B/op	   16502 allocs/op
BenchmarkBytes/4KiB/DiffFilter3-4          	     200	   9420398 ns/op	   0.87 MB/s	  891834 B/op	   16503 allocs/op
BenchmarkBytes/4KiB/DiffFilter4-4          	     100	  12493540 ns/op	   0.66 MB/s	  891835 B/op	   16503 allocs/op
BenchmarkBytes/4KiB/DiffFilter5-4          	     100	  14208928 ns/op	   0.58 MB/s	  891965 B/op	   16504 allocs/op
BenchmarkBytes/64KiB/EqualFilter0-4        	      20	  66146276 ns/op	   1.98 MB/s	10826734 B/op	  131206 allocs/op
BenchmarkBytes/64KiB/EqualFilter1-4        	      10	 102687117 ns/op	   1.28 MB/s	15021123 B/op	  262281 allocs/op
BenchmarkBytes/64KiB/EqualFilter2-4        	      10	 125444269 ns/op	   1.04 MB/s	15021174 B/op	  262282 allocs/op
BenchmarkBytes/64KiB/EqualFilter3-4        	      10	 154392619 ns/op	   0.85 MB/s	15021219 B/op	  262283 allocs/op
BenchmarkBytes/64KiB/EqualFilter4-4        	      10	 183307772 ns/op	   0.72 MB/s	15021219 B/op	  262283 allocs/op
BenchmarkBytes/64KiB/EqualFilter5-4        	       5	 212334761 ns/op	   0.62 MB/s	15021340 B/op	  262284 allocs/op
BenchmarkBytes/64KiB/DiffFilter0-4         	      20	  67147375 ns/op	   1.95 MB/s	10828312 B/op	  131244 allocs/op
BenchmarkBytes/64KiB/DiffFilter1-4         	      10	 105385709 ns/op	   1.24 MB/s	15022724 B/op	  262319 allocs/op
BenchmarkBytes/64KiB/DiffFilter2-4         	      10	 135994379 ns/op	   0.96 MB/s	15022763 B/op	  262320 allocs/op
BenchmarkBytes/64KiB/DiffFilter3-4         	      10	 159346836 ns/op	   0.82 MB/s	15022820 B/op	  262321 allocs/op
BenchmarkBytes/64KiB/DiffFilter4-4         	      10	 190129916 ns/op	   0.69 MB/s	15022843 B/op	  262321 allocs/op
BenchmarkBytes/64KiB/DiffFilter5-4         	       5	 208805873 ns/op	   0.63 MB/s	15022996 B/op	  262322 allocs/op
BenchmarkBytes/1MiB/EqualFilter0-4         	       2	 978302544 ns/op	   2.14 MB/s	173184064 B/op	 2097346 allocs/op
BenchmarkBytes/1MiB/EqualFilter1-4         	       1	1543190869 ns/op	   1.36 MB/s	240293104 B/op	 4194502 allocs/op
BenchmarkBytes/1MiB/EqualFilter2-4         	       1	1998443802 ns/op	   1.05 MB/s	240293232 B/op	 4194504 allocs/op
BenchmarkBytes/1MiB/EqualFilter3-4         	       1	2507293058 ns/op	   0.84 MB/s	240293328 B/op	 4194506 allocs/op
BenchmarkBytes/1MiB/EqualFilter4-4         	       1	2981132381 ns/op	   0.70 MB/s	240293008 B/op	 4194502 allocs/op
BenchmarkBytes/1MiB/EqualFilter5-4         	       1	3351177035 ns/op	   0.63 MB/s	240293424 B/op	 4194506 allocs/op
BenchmarkBytes/1MiB/DiffFilter0-4          	       1	1132136753 ns/op	   1.85 MB/s	173185752 B/op	 2097384 allocs/op
BenchmarkBytes/1MiB/DiffFilter1-4          	       1	1666196345 ns/op	   1.26 MB/s	240294504 B/op	 4194537 allocs/op
BenchmarkBytes/1MiB/DiffFilter2-4          	       1	2204467232 ns/op	   0.95 MB/s	240294600 B/op	 4194539 allocs/op
BenchmarkBytes/1MiB/DiffFilter3-4          	       1	2499107753 ns/op	   0.84 MB/s	240294600 B/op	 4194539 allocs/op
BenchmarkBytes/1MiB/DiffFilter4-4          	       1	2966222324 ns/op	   0.71 MB/s	240295016 B/op	 4194544 allocs/op
BenchmarkBytes/1MiB/DiffFilter5-4          	       1	3382045549 ns/op	   0.62 MB/s	240294728 B/op	 4194540 allocs/op
BenchmarkBytes/16MiB/EqualFilter0-4        	       1	16585516720 ns/op	   2.02 MB/s	2967917664 B/op	33554689 allocs/op
BenchmarkBytes/16MiB/EqualFilter1-4        	       1	23980880452 ns/op	   1.40 MB/s	4041659536 B/op	67109123 allocs/op
BenchmarkBytes/16MiB/EqualFilter2-4        	       1	30729382462 ns/op	   1.09 MB/s	4041659568 B/op	67109124 allocs/op
BenchmarkBytes/16MiB/EqualFilter3-4        	       1	37830223988 ns/op	   0.89 MB/s	4041660016 B/op	67109129 allocs/op
BenchmarkBytes/16MiB/EqualFilter4-4        	       1	44731081109 ns/op	   0.75 MB/s	4041659536 B/op	67109124 allocs/op
BenchmarkBytes/16MiB/EqualFilter5-4        	       1	52110015114 ns/op	   0.64 MB/s	4041659760 B/op	67109126 allocs/op
BenchmarkBytes/16MiB/DiffFilter0-4         	       1	20349410654 ns/op	   1.65 MB/s	2967919128 B/op	33554724 allocs/op
BenchmarkBytes/16MiB/DiffFilter1-4         	       1	27073250483 ns/op	   1.24 MB/s	4041661320 B/op	67109163 allocs/op
BenchmarkBytes/16MiB/DiffFilter2-4         	       1	32223912220 ns/op	   1.04 MB/s	4041661064 B/op	67109160 allocs/op
BenchmarkBytes/16MiB/DiffFilter3-4         	       1	39189759283 ns/op	   0.86 MB/s	4041661128 B/op	67109161 allocs/op
BenchmarkBytes/16MiB/DiffFilter4-4         	       1	48344470628 ns/op	   0.69 MB/s	4041661256 B/op	67109163 allocs/op
BenchmarkBytes/16MiB/DiffFilter5-4         	       1	51184873999 ns/op	   0.66 MB/s	4041661256 B/op	67109162 allocs/op

The last benchmark shows that the current implementation allocates
an astonishing 3.75GiB just to compare a 16MiB byte slice.
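
The figure can be checked directly against the last row of the output. The short sketch below takes the B/op and allocs/op values from BenchmarkBytes/16MiB/DiffFilter5 and converts them to GiB and to per-input-byte costs; the helper name `gib` is introduced here only for illustration.

```go
package main

import "fmt"

// gib converts a byte count to GiB (2^30 bytes).
func gib(b int64) float64 { return float64(b) / (1 << 30) }

func main() {
	const (
		sliceLen    = 16 << 20   // 16MiB input slice
		bytesPerOp  = 4041661256 // B/op from BenchmarkBytes/16MiB/DiffFilter5
		allocsPerOp = 67109162   // allocs/op from the same row
	)
	fmt.Printf("%.2f GiB allocated per comparison\n", gib(bytesPerOp))
	fmt.Printf("%.1f bytes allocated per input byte\n", float64(bytesPerOp)/sliceLen)
	fmt.Printf("%.2f allocations per input byte\n", float64(allocsPerOp)/sliceLen)
}
```

This works out to about 3.76 GiB, or roughly 241 bytes and 4 allocations for each byte compared, consistent with the approximate 3.75GiB quoted above.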

@dsnet requested a review from cybrcodr on February 27, 2019 08:48
@dsnet merged commit 64cb04e into master on Feb 27, 2019
@dsnet deleted the bench branch on February 27, 2019 18:28