-
Notifications
You must be signed in to change notification settings - Fork 104
Avoid unnecessary string allocations during memory-idx.Add #687
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the lower allocs, and the code seems a bit simpler too (2 loops instead of 1)
However I was surprised to find that on my system the code runs slower.
I ran the benchmarks 20 times master vs your branch, (with a warmup) like so:
date; for i in {1..20}; do go test -bench=. > /dev/null; done; for i in {1..20}; do go test -bench=. >> master.txt; done; git checkout shanson/FewerAllocs; for i in {1..20}; do go test -bench=. >> new.txt; done; date
then I compared with https://godoc.org/golang.org/x/perf/cmd/benchstat:
benchstat master.txt new.txt
name old time/op new time/op delta
Find-8 108µs ± 3% 109µs ± 2% ~ (p=0.247 n=20+19)
Concurrent4Find-8 63.2µs ± 2% 61.5µs ± 3% -2.56% (p=0.000 n=20+19)
Concurrent8Find-8 52.9µs ± 1% 51.4µs ± 2% -2.76% (p=0.000 n=18+19)
Indexing-8 2.81µs ± 2% 3.56µs ±33% +26.83% (p=0.000 n=18+20)
Deletes-8 3.45µs ±46% 3.21µs ±15% ~ (p=0.092 n=20+15)
name old alloc/op new alloc/op delta
Find-8 50.3kB ± 0% 50.3kB ± 0% -0.00% (p=0.048 n=20+20)
Concurrent4Find-8 50.3kB ± 0% 50.3kB ± 0% ~ (p=0.351 n=20+20)
Concurrent8Find-8 50.3kB ± 0% 50.3kB ± 0% ~ (p=0.982 n=19+20)
Indexing-8 1.27kB ± 0% 1.18kB ± 0% -7.55% (p=0.000 n=18+20)
Deletes-8 1.39kB ± 0% 1.39kB ± 0% ~ (all equal)
name old allocs/op new allocs/op delta
Find-8 967 ± 0% 967 ± 0% ~ (all equal)
Concurrent4Find-8 967 ± 0% 967 ± 0% ~ (all equal)
Concurrent8Find-8 966 ± 0% 966 ± 0% ~ (all equal)
Indexing-8 21.0 ± 0% 17.0 ± 0% -19.05% (p=0.000 n=20+20)
Deletes-8 8.00 ± 0% 8.00 ± 0% ~ (all equal)
I wonder how you got that decrease in runtime for this benchmark.
could be a fluke. these sort of benchmarks can be sensitive to background noise (e.g. browsers/deskstop stuff running in the background, cpu throttling (frequency adjustments) kicking in, etc)
branch := strings.Join(nodes[0:i], ".") | ||
pos := strings.LastIndex(path, ".") | ||
prevPos := len(path) | ||
for true { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can just for {
with go :)
tree.Items[branch] = &Node{ | ||
Path: branch, | ||
Children: []string{nodes[nodePos]}, | ||
Children: []string{prevNode}, | ||
Defs: make([]string, 0), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually we could probably save ourselves another allocation here.
Because Arrays do not need to be initialized explicitly; the zero value of an array is a ready-to-use array whose elements are themselves zeroed
: https://blog.golang.org/go-slices-usage-and-internals
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to be clear though, this is a slice. but yeah, could use nil
here and save an allocation.
from that same page:
The zero value of a slice is nil. The len and cap functions will both return 0 for a nil slice.
Since the zero value of a slice (nil) acts like a zero-length slice, you can declare a slice variable and then append to it
note that this is not @shanson7's doing , but ours. we have a bunch of these cases littered around that we need to clean up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i'm going to sed
search & replace them and add another PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's a bunch of more cases like this through the file. we can adress in an other PR.
I suspect the
so this looks a lot better. |
This is a small optimization (shaved about 3% off the heap size in our POC cluster, not major).
Instead of splitting and joining, the code iterates over the string in place and just saves references to the pieces of the string.
Benchmark: