cmd/compile, bytes: bootstrap array causes bytes.Buffer to always be heap-allocated #7921
Another thing to note: If I change "NewBuffer" in playground/bytes2 to return &Buffer{} then buf (in main()) escapes to the heap even though NewBuffer is inlined:
    ./bytes2.go:9: inlining call to bytes2.NewBuffer
    ./bytes2.go:13: inlining call to Read
    ./bytes2.go:18: inlining call to Bytes
    ./bytes2.go:8: make([]byte, 4) escapes to heap
    ./bytes2.go:9: <S> &bytes2.Buffer literal does not escape
    ./bytes2.go:11: main []byte literal does not escape
    ./bytes2.go:12: main make([]byte, 2) does not escape
    ./bytes2.go:14: main []byte literal does not escape
Otherwise if I return Buffer{} directly nothing gets allocated to the heap, unless a resize happens.
Escape analysis treats everything assigned to OIND/ODOTPTR as escaping. As a result, b escapes in the following code:

    func (b *Buffer) Foo() {
        n, m := ...
        b.buf = b.buf[n:m]
    }

This change recognizes such assignments and ignores them.

Update issue #9043.
Update issue #7921.

There are two similar cases in std lib that benefit from this optimization. First is in archive/zip:

    type readBuf []byte

    func (b *readBuf) uint32() uint32 {
        v := binary.LittleEndian.Uint32(*b)
        *b = (*b)[4:]
        return v
    }

Second is in time:

    type data struct {
        p     []byte
        error bool
    }

    func (d *data) read(n int) []byte {
        if len(d.p) < n {
            d.p = nil
            d.error = true
            return nil
        }
        p := d.p[0:n]
        d.p = d.p[n:]
        return p
    }

    benchmark                         old ns/op     new ns/op     delta
    BenchmarkCompressedZipGarbage     32431724      32217851      -0.66%

    benchmark                         old allocs    new allocs    delta
    BenchmarkCompressedZipGarbage     153           143           -6.54%

Change-Id: Ia6cd32744e02e36d6d8c19f402f8451101711626
Reviewed-on: https://go-review.googlesource.com/3162
Reviewed-by: Keith Randall <khr@golang.org>
Reviewed-by: Russ Cox <rsc@golang.org>
With Go 1.8, the provided sample code (the playground/bytes package) no longer escapes:
However, a bytes.Buffer still seems to always escape if you call its WriteXxx methods, so the root issue persists.
It's the use of the bootstrap array that's forcing bytes.Buffer to always escape now, from what I can tell. Here's a very simple repro:
package main

type B struct {
	buf     []byte
	storage [64]byte
}

func (b *B) X() {
	b.buf = b.storage[:10]
}

func main() {
	var b B
	b.X()
}
Is it the case that any self-referencing pointers foil escape analysis? For instance, it also escapes if you do this:
cc @randall77 for escape analysis thoughts
@lukescott I took the liberty of retitling the issue given that the re-slicing thing was fixed but the underlying problem of bytes.Buffer always being heap-allocated was not (you also discussed this in the now-closed #7661).
I don't know why this would cause an escape. Probably a tricky corner case? Cycles in the escapes-to graph?
I don't "know", but I suspect. Pretty sure that forms a cycle in the graph, which can look like an escape for various reasons.
Change https://golang.org/cl/86976 mentions this issue:
All credit and blame goes to Ian for this suggestion, copied from the runtime.

Fixes #23382
Updates #7921

Change-Id: I3d5a9ee4ab730c87e0f3feff3e7fceff9bcf9e18
Reviewed-on: https://go-review.googlesource.com/86976
Reviewed-by: Ian Lance Taylor <iant@golang.org>
Change https://golang.org/cl/133375 mentions this issue:
It's hard to solve this issue without any API changes just with improvements to the escape analysis. See https://github.com/intel-go/bytebuf. The only change to
type Buffer struct {
	buf []byte // contents are the bytes buf[off : len(buf)]
	off int // read at &buf[off], write at &buf[len(buf)]
-	bootstrap [64]byte // memory to hold first slice; helps small buffers avoid allocation.
+	bootstrap *[64]byte // memory to hold first slice; helps small buffers avoid allocation.
	lastRead readOp // last read operation, so that Unread* can work correctly.
}
And we get these numbers:
@quasilyte there's something I don't understand. The theory behind the bootstrap array was to avoid an allocation, by having the initial slice point to it. Unfortunately, our escape analysis is unable to prove that the buffer does not escape (would you be able to explain why, what is the limit?), so we always got 1 allocation (the Your change makes the buffer to be a pointer to an array, so that ( So shouldn't it be the same?
@rasky trick is that
Without applying CL133375 and with this patch to
The bytes package benchmarks aren't so rosy, though.
Since this seems to be a big improvement for certain scenarios, but requires an API change, maybe this change would be a candidate for Go 2.
Change https://golang.org/cl/133715 mentions this issue:
@opennota what you're doing is essentially a The main benefit from not having a It also opens some new optimization possibilities described below. Results for benchmarks from
With array self-assignment removed, we don't have a leaking param in - var buf bytes.Buffer
+ buf := bytes.NewBuffer(make([]byte, 0, 64)) Note that one can use different capacity, so we can effectively have bootstrap "array" of more than 64 bytes.
Note that we do fewer allocations for workloads that fit the slice capacity. Key points of the new implementation:
There were no big reasons to use
Rationale: small buffer optimization does not work and it has made things slower since 2014. Until we can make it work, we should prefer simpler code that also turns out to be more efficient.

With this change, it's possible to use NewBuffer(make([]byte, 0, bootstrapSize)) to get the desired stack-allocated initial buffer since escape analysis can prove the created slice to be non-escaping.

New implementation key points:
- Zero value bytes.Buffer performs better than before
- You can have a truly stack-allocated buffer, and it's not even limited to 64 bytes
- The unsafe.Sizeof(bytes.Buffer{}) is reduced significantly
- Empty writes don't cause allocations

Buffer benchmarks from bytes package:

    name                       old time/op    new time/op    delta
    ReadString-8               9.20µs ± 1%    9.22µs ± 1%    ~        (p=0.148 n=10+10)
    WriteByte-8                28.1µs ± 0%    26.2µs ± 0%    -6.78%   (p=0.000 n=10+10)
    WriteRune-8                64.9µs ± 0%    65.0µs ± 0%    +0.16%   (p=0.000 n=10+10)
    BufferNotEmptyWriteRead-8  469µs ± 0%     461µs ± 0%     -1.76%   (p=0.000 n=9+10)
    BufferFullSmallReads-8     108µs ± 0%     108µs ± 0%     -0.21%   (p=0.000 n=10+10)

    name                       old speed      new speed      delta
    ReadString-8               3.56GB/s ± 1%  3.55GB/s ± 1%  ~        (p=0.165 n=10+10)
    WriteByte-8                146MB/s ± 0%   156MB/s ± 0%   +7.26%   (p=0.000 n=9+10)
    WriteRune-8                189MB/s ± 0%   189MB/s ± 0%   -0.16%   (p=0.000 n=10+10)

    name                       old alloc/op   new alloc/op   delta
    ReadString-8               32.8kB ± 0%    32.8kB ± 0%    ~        (all equal)
    WriteByte-8                0.00B          0.00B          ~        (all equal)
    WriteRune-8                0.00B          0.00B          ~        (all equal)
    BufferNotEmptyWriteRead-8  4.72kB ± 0%    4.67kB ± 0%    -1.02%   (p=0.000 n=10+10)
    BufferFullSmallReads-8     3.44kB ± 0%    3.33kB ± 0%    -3.26%   (p=0.000 n=10+10)

    name                       old allocs/op  new allocs/op  delta
    ReadString-8               1.00 ± 0%      1.00 ± 0%      ~        (all equal)
    WriteByte-8                0.00           0.00           ~        (all equal)
    WriteRune-8                0.00           0.00           ~        (all equal)
    BufferNotEmptyWriteRead-8  3.00 ± 0%      3.00 ± 0%      ~        (all equal)
    BufferFullSmallReads-8     3.00 ± 0%      2.00 ± 0%      -33.33%  (p=0.000 n=10+10)

The most notable thing in go1 benchmarks is reduced allocs in HTTPClientServer (-1 alloc):

    HTTPClientServer-8         64.0 ± 0%      63.0 ± 0%      -1.56%   (p=0.000 n=10+10)

For more explanations and benchmarks see the referenced issue.

Updates #7921

Change-Id: Ica0bf85e1b70fb4f5dc4f6a61045e2cf4ef72aa3
Reviewed-on: https://go-review.googlesource.com/133715
Reviewed-by: Martin Möhrmann <moehrmann@google.com>
Reviewed-by: Robert Griesemer <gri@golang.org>
Reviewed-by: Brad Fitzpatrick <bradfitz@golang.org>
I had retitled this bug to be about bytes.Buffer's bootstrap array, so now that we got rid of that I suppose we can close this bug. The underlying issue of self-referential structures tricking escape analysis (as demonstrated in #7921 (comment)) remains, however.
Instead of skipping all OSLICEARR, skip only ones with non-pointer array type. For pointers to arrays, it's safe to apply the self-assignment slicing optimizations.

Refactored the matching code into separate function for readability.

This is an extension to already existing optimization. On its own, it does not improve any code under std, but it opens some new optimization opportunities. One of them is described in the referenced issue.

Updates #7921

Change-Id: I08ac660d3ef80eb15fd7933fb73cf53ded9333ad
Reviewed-on: https://go-review.googlesource.com/133375
Run-TryBot: Iskander Sharipov <iskander.sharipov@intel.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
The iterator returned by `tree.MakeIter` was escaping to the heap for two reasons:
1. it was capturing a reference to the tree itself.
2. it was pointing a slice into its own array.

This change addresses both of these problems and prevents the iterator from escaping when used. The fixes were:
1. copy the tree's root pointer reference instead of a reference to the tree itself.
2. avoid creating the self-referential slice reference. This mistakenly escapes because of golang/go#7921, which also caused issues with `bytes.Buffer` (https://golang.org/cl/133715).

This change also adds a new benchmark which demonstrates whether `MakeIter` escapes or not:
```
name             old time/op    new time/op    delta
BTreeMakeIter-4   131ns ±14%      25ns ± 1%    -81.23%  (p=0.000 n=9+9)

name             old alloc/op   new alloc/op   delta
BTreeMakeIter-4    144B ± 0%        0B         -100.00%  (p=0.000 n=10+10)

name             old allocs/op  new allocs/op  delta
BTreeMakeIter-4    1.00 ± 0%      0.00         -100.00%  (p=0.000 n=10+10)
```

Release note: None
There's a lot of history on the thread, but skimming briefly I understand the issue boils down to @cespare's example:
This function is analyzed as "leaking param: b", which means calling The issue here is that we're assigning &b.n through an (implicit) pointer dereference, so escape analysis pessimistically assumes the pointer might point to the heap. There's an optimization esc.go:isSelfAssign that tries to look for things like (The above explanation applies to both esc.go and escape.go; they use the same approach here and even both use isSelfAssign for this optimization.) Edit: This is wrong. Simply recognizing
@mdempsky If we eliminate unnecessary leaks in
will only have leaking param content in
@cuonglm Unfortunately, my earlier suggestion to ignore In the example you pasted, it's important that To fix this, we need a way to tag functions with something like " Once we remove esc.go, I expect we'll be able to clean up/simplify the tagging scheme, and there will probably be opportunities for representing more fine-grained semantics like this.
I think I may have the same problem: the memory is leaking
@mdempsky now that esc.go is gone, is this ripe for revisiting? |
I just ran up against this issue again and wondered if it's likely that there might be some work done on this in the next cycle or two. |
latest go1.17-b8a7c33ec9:
Builds:
I believe that if this were fixed, we could use a bootstrap array to fix #2320. |