
drastically reduce allocations in ring buffer implementation #64

Merged: 3 commits into libp2p:master on Nov 20, 2021

Conversation

@pymq (Contributor) commented Oct 5, 2021

Reuse the capacity of the s.b slice by shifting its elements to the left on writes. Before this change, Append allocated a new slice on every call because len(s.b) == cap(s.b). In my testing, len(s.b) is usually small (<= 50, with occasional spikes to 200) even in high-write benchmarks such as BenchmarkSendRecvLarge, and often near 0, so the copy should not be too expensive. It is also efficient when the write rate is roughly equal to the read rate.
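
To illustrate the idea, here is a simplified, hypothetical sketch (not the exact util.go code); it assumes s.b is a slice of pending chunks and s.bPos is the index of the first unread one, as the field names above suggest:

// Hypothetical sketch of the shift-left reuse described above.
type segmentedBuffer struct {
	bPos int      // index of the first unread chunk in b
	b    [][]byte // pending chunks; b[:bPos] have already been consumed
}

// Append is simplified here; the real method has more bookkeeping and locking.
func (s *segmentedBuffer) Append(chunk []byte) {
	if len(s.b) == cap(s.b) && s.bPos > 0 {
		// append() would otherwise allocate a larger backing array.
		// Instead, slide the unread chunks to the front, reclaiming the
		// capacity occupied by the already-read prefix.
		n := copy(s.b, s.b[s.bPos:])
		s.b = s.b[:n]
		s.bPos = 0
	}
	s.b = append(s.b, chunk)
}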

Tested with

$ go version                                                                 
go version go1.16.4 linux/amd64

$ go test -bench=BenchmarkSendRecv -run ^$ -benchmem -benchtime=10s

Before

goos: linux
goarch: amd64
pkg: github.com/libp2p/go-yamux/v2
cpu: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
BenchmarkSendRecv-12         	 4662453	      3042 ns/op	      24 B/op	       1 allocs/op
BenchmarkSendRecvLarge-12    	     163	 121443667 ns/op	  255279 B/op	   10165 allocs/op
PASS
ok  	github.com/libp2p/go-yamux/v2	55.470s

After this commit

goos: linux
goarch: amd64
pkg: github.com/libp2p/go-yamux/v2
cpu: Intel(R) Core(TM) i7-8750H CPU @ 2.20GHz
BenchmarkSendRecv-12         	 4864972	      2494 ns/op	       0 B/op	       0 allocs/op
BenchmarkSendRecvLarge-12    	     136	  84702011 ns/op	    7162 B/op	       5 allocs/op
PASS
ok  	github.com/libp2p/go-yamux/v2	35.208s

@Stebalien (Member) left a comment


Thanks! This is tricky code, but it looks correct. There's one comment below I'd like you to consider, though I don't feel too strongly about it.

I'd also like to get a review from @marten-seemann.

util.go (outdated)
// have no unread chunks, just move pos
s.bPos = 0
s.b = s.b[:0]
} else {
Member:

I'd change this to check whether at least half of the slice is free. That'll slightly increase allocations until we hit a steady state, but it should avoid the degenerate case where we slide by one every single time.

Contributor Author (pymq):

This is definitely a good improvement, but I'm not sure about "at least half of the slice"; I think the threshold should be a bit lower, say 0.25 of the capacity. That would also match the append() growth factor (when cap > 1024). Added with 0.25 for now.

That got me thinking: should we limit the maximum capacity of the buffer (recreate the slice with the default capacity once it reaches a certain maximum and the buffer is empty)? What is the average Stream lifespan?

I tried to test this with the package's benchmarks, but the buffer does not grow at all because of the in-memory network.
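
A minimal sketch of the threshold variant being discussed, shifting only when at least a quarter of the capacity is reclaimable (same hypothetical fields as in the sketch above):

// Shift only when the already-read prefix is at least 1/4 of the capacity,
// so we never end up sliding by a single element on every Append.
if len(s.b) == cap(s.b) && s.bPos > 0 && 4*s.bPos >= cap(s.b) {
	n := copy(s.b, s.b[s.bPos:])
	s.b = s.b[:n]
	s.bPos = 0
}
s.b = append(s.b, chunk)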

@marten-seemann (Contributor) left a comment


I haven't had the time to do a thorough review yet, but more documentation (maybe even an example) here would be very useful to understand what the code is doing.

@aschmahmann added the need/author-input (Needs input from the original author) label on Oct 15, 2021
…me case when we shift slice by one every time
@pymq (Contributor Author) commented Oct 18, 2021

Sorry for the delay.

I'm not sure what kind of documentation this needs. Basically, this code tries to reuse the slice's capacity by shifting values to the start, so that append() still adds values at the end (like a ring buffer, but simpler). I can add more inline comments if you prefer.

@pymq requested a review from @Stebalien on October 23, 2021, 09:50
@BigLep requested a review from a team on October 29, 2021, 06:31
@BigLep commented Oct 29, 2021

Assigned to @libp2p/go-libp2p-maintainers to see if anyone else can take a look.

@BigLep commented Oct 29, 2021

@marten-seemann: understood that there are other things you're currently focused on. When you do re-engage, does this comment make things clearer for you?

@marten-seemann (Contributor):

> I'm not sure what kind of documentation this needs. Basically, this code tries to reuse the slice's capacity by shifting values to the start, so that append() still adds values at the end (like a ring buffer, but simpler). I can add more inline comments if you prefer.

Yes, more inline comments would be highly appreciated. I haven't worked with this code for a few months, and I find it quite hard to understand without any comments (this partially applies to the code we had before as well).

@pymq (Contributor Author) commented Oct 29, 2021

@marten-seemann I've added more comments.

@@ -54,6 +54,7 @@ func BenchmarkSendRecv(b *testing.B) {
recvBuf := make([]byte, 512)

doneCh := make(chan struct{})
b.ResetTimer()

Suggest you also add b.ReportAllocs() here and in BenchmarkSendRecvLarge

Contributor Author (pymq):

I'm not sure this is necessary; you can pass -benchmem to go test to achieve the same result.

Reply:

True, but I'm suggesting it because allocs seem to be of ongoing concern. Up to you.
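
For reference, the suggestion boils down to one extra call per benchmark. A self-contained illustration (BenchmarkExample is hypothetical, not from the repo):

package yamux_test

import "testing"

// ReportAllocs makes the benchmark print B/op and allocs/op even when
// go test is run without -benchmem.
func BenchmarkExample(b *testing.B) {
	buf := make([]byte, 512)
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		_ = append([]byte(nil), buf...)
	}
}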

@iand commented Nov 8, 2021

Comparison of benchmarks using benchstat (with ReportAllocs added manually):

name             old time/op    new time/op    delta
SendRecv-8         2.39µs ± 1%    2.30µs ± 1%    -3.56%  (p=0.000 n=9+10)
SendRecvLarge-8    77.1ms ± 0%    76.9ms ± 1%      ~     (p=0.456 n=7+7)

name             old alloc/op   new alloc/op   delta
SendRecv-8          24.0B ± 0%      0.0B       -100.00%  (p=0.000 n=10+10)
SendRecvLarge-8     337kB ± 4%     126kB ± 5%   -62.67%  (p=0.000 n=8+8)

name             old allocs/op  new allocs/op  delta
SendRecv-8           1.00 ± 0%      0.00       -100.00%  (p=0.000 n=10+10)
SendRecvLarge-8     8.60k ± 0%     0.14k ±11%   -98.38%  (p=0.000 n=7+10)
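
The table above is benchstat output; a typical way to produce such a comparison looks roughly like this (the exact flags and file names here are an assumption, not taken from the thread):

$ git checkout master && go test -bench=BenchmarkSendRecv -run '^$' -count=10 > old.txt
$ git checkout <pr-branch> && go test -bench=BenchmarkSendRecv -run '^$' -count=10 > new.txt
$ benchstat old.txt new.txt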

@aschmahmann added the need/maintainer-input (Needs input from the current maintainer(s)) label and removed the need/author-input (Needs input from the original author) label on Nov 19, 2021
@marten-seemann changed the title from "drastically reduce allocations" to "drastically reduce allocations in ring buffer implementation" on Nov 20, 2021
@marten-seemann merged commit d6101de into libp2p:master on Nov 20, 2021
@aschmahmann mentioned this pull request on Dec 1, 2021