Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

store: Optimized common cases for time selecting smaller amount of series. Avoid looking up symbols. #3531

Merged
merged 3 commits into from
Dec 4, 2020

Conversation

bwplotka
Copy link
Member

@bwplotka bwplotka commented Dec 2, 2020

Key rationales: The majority of query latency comes from symbols.Lookup.

Also cleaned up the code.

Benchmarks

Baseline:

/tmp/___BenchmarkTelemeterRealData_Series_in_github_com_thanos_io_thanos_pkg_store -test.v -test.bench ^\QBenchmarkTelemeterRealData_Series\E$ -test.run ^$ -test.benchtime=1m -test.benchmem
goos: linux
goarch: amd64
pkg: github.com/thanos-io/thanos/pkg/store
BenchmarkTelemeterRealData_Series
Built index header; Starting
BenchmarkTelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E
BenchmarkTelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E-12         	      43	1866550478 ns/op	395648807 B/op	 4348808 allocs/op
BenchmarkTelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E
BenchmarkTelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E-12        	      50	1417560470 ns/op	160595177 B/op	 3017177 allocs/op
BenchmarkTelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E
BenchmarkTelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E-12      	     392	 168130862 ns/op	49921099 B/op	  493076 allocs/op
BenchmarkTelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E
BenchmarkTelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E-12           	      14	5295236741 ns/op	1083815895 B/op	11798226 allocs/op
BenchmarkTelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E
BenchmarkTelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E-12          	      18	4149971387 ns/op	436568360 B/op	 8577264 allocs/op
PASS

Process finished with exit code 0

``

New

benchstat -delta-test=none _dev/bench/store_symb_a.txt _dev/bench/store_symb_b.txt
name                                                                old time/op    new time/op    delta
TelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E-12        1.87s ± 0%     1.66s ± 0%  -10.85%
TelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E-12       1.42s ± 0%     0.25s ± 0%  -82.17%
TelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E-12     168ms ± 0%     157ms ± 0%   -6.72%
TelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E-12          5.30s ± 0%     4.53s ± 0%  -14.53%
TelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E-12         4.15s ± 0%     0.68s ± 0%  -83.67%

name                                                                old alloc/op   new alloc/op   delta
TelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E-12        396MB ± 0%     396MB ± 0%   -0.00%
TelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E-12       161MB ± 0%      95MB ± 0%  -40.95%
TelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E-12    49.9MB ± 0%    49.9MB ± 0%   +0.00%
TelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E-12         1.08GB ± 0%    1.08GB ± 0%   -0.00%
TelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E-12         437MB ± 0%     228MB ± 0%  -47.77%

name                                                                old allocs/op  new allocs/op  delta
TelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E-12        4.35M ± 0%     4.35M ± 0%   -0.00%
TelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E-12       3.02M ± 0%     0.05M ± 0%  -98.36%
TelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E-12      493k ± 0%      493k ± 0%   +0.00%
TelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E-12          11.8M ± 0%     11.8M ± 0%   -0.00%
TelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E-12         8.58M ± 0%     0.02M ± 0%  -99.78%

Cmp:

benchstat -delta-test=none _dev/bench/store_symb_a.txt _dev/bench/store_symb_b.txt
name                                                                old time/op    new time/op    delta
TelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E-12        1.87s ± 0%     1.66s ± 0%  -10.85%
TelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E-12       1.42s ± 0%     0.25s ± 0%  -82.17%
TelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E-12     168ms ± 0%     157ms ± 0%   -6.72%
TelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E-12          5.30s ± 0%     4.53s ± 0%  -14.53%
TelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E-12         4.15s ± 0%     0.68s ± 0%  -83.67%

name                                                                old alloc/op   new alloc/op   delta
TelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E-12        396MB ± 0%     396MB ± 0%   -0.00%
TelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E-12       161MB ± 0%      95MB ± 0%  -40.95%
TelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E-12    49.9MB ± 0%    49.9MB ± 0%   +0.00%
TelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E-12         1.08GB ± 0%    1.08GB ± 0%   -0.00%
TelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E-12         437MB ± 0%     228MB ± 0%  -47.77%

name                                                                old allocs/op  new allocs/op  delta
TelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E-12        4.35M ± 0%     4.35M ± 0%   -0.00%
TelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E-12       3.02M ± 0%     0.05M ± 0%  -98.36%
TelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E-12      493k ± 0%      493k ± 0%   +0.00%
TelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E-12          11.8M ± 0%     11.8M ± 0%   -0.00%
TelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E-12         8.58M ± 0%     0.02M ± 0%  -99.78%

Code: https://gist.github.com/bwplotka/cbcbbcd1802181b7785da11dcc0f5cfd

Signed-off-by: Bartlomiej Plotka bwplotka@gmail.com

…Avoid looking up symbols.

Key part: Majority of query latency comes from symbols.Lookup.

Also cleaned up the code.

## Benchmarks

Baseline:
```
/tmp/___BenchmarkTelemeterRealData_Series_in_github_com_thanos_io_thanos_pkg_store -test.v -test.bench ^\QBenchmarkTelemeterRealData_Series\E$ -test.run ^$ -test.benchtime=1m -test.benchmem
goos: linux
goarch: amd64
pkg: github.com/thanos-io/thanos/pkg/store
BenchmarkTelemeterRealData_Series
Built index header; Starting
BenchmarkTelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E
BenchmarkTelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E-12         	      43	1866550478 ns/op	395648807 B/op	 4348808 allocs/op
BenchmarkTelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E
BenchmarkTelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E-12        	      50	1417560470 ns/op	160595177 B/op	 3017177 allocs/op
BenchmarkTelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E
BenchmarkTelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E-12      	     392	 168130862 ns/op	49921099 B/op	  493076 allocs/op
BenchmarkTelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E
BenchmarkTelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E-12           	      14	5295236741 ns/op	1083815895 B/op	11798226 allocs/op
BenchmarkTelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E
BenchmarkTelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E-12          	      18	4149971387 ns/op	436568360 B/op	 8577264 allocs/op
PASS

Process finished with exit code 0
```

New
```
benchstat -delta-test=none _dev/bench/store_symb_a.txt _dev/bench/store_symb_b.txt
name                                                                old time/op    new time/op    delta
TelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E-12        1.87s ± 0%     1.66s ± 0%  -10.85%
TelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E-12       1.42s ± 0%     0.25s ± 0%  -82.17%
TelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E-12     168ms ± 0%     157ms ± 0%   -6.72%
TelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E-12          5.30s ± 0%     4.53s ± 0%  -14.53%
TelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E-12         4.15s ± 0%     0.68s ± 0%  -83.67%

name                                                                old alloc/op   new alloc/op   delta
TelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E-12        396MB ± 0%     396MB ± 0%   -0.00%
TelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E-12       161MB ± 0%      95MB ± 0%  -40.95%
TelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E-12    49.9MB ± 0%    49.9MB ± 0%   +0.00%
TelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E-12         1.08GB ± 0%    1.08GB ± 0%   -0.00%
TelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E-12         437MB ± 0%     228MB ± 0%  -47.77%

name                                                                old allocs/op  new allocs/op  delta
TelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E-12        4.35M ± 0%     4.35M ± 0%   -0.00%
TelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E-12       3.02M ± 0%     0.05M ± 0%  -98.36%
TelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E-12      493k ± 0%      493k ± 0%   +0.00%
TelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E-12          11.8M ± 0%     11.8M ± 0%   -0.00%
TelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E-12         8.58M ± 0%     0.02M ± 0%  -99.78%
```
Cmp:

```
benchstat -delta-test=none _dev/bench/store_symb_a.txt _dev/bench/store_symb_b.txt
name                                                                old time/op    new time/op    delta
TelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E-12        1.87s ± 0%     1.66s ± 0%  -10.85%
TelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E-12       1.42s ± 0%     0.25s ± 0%  -82.17%
TelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E-12     168ms ± 0%     157ms ± 0%   -6.72%
TelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E-12          5.30s ± 0%     4.53s ± 0%  -14.53%
TelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E-12         4.15s ± 0%     0.68s ± 0%  -83.67%

name                                                                old alloc/op   new alloc/op   delta
TelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E-12        396MB ± 0%     396MB ± 0%   -0.00%
TelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E-12       161MB ± 0%      95MB ± 0%  -40.95%
TelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E-12    49.9MB ± 0%    49.9MB ± 0%   +0.00%
TelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E-12         1.08GB ± 0%    1.08GB ± 0%   -0.00%
TelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E-12         437MB ± 0%     228MB ± 0%  -47.77%

name                                                                old allocs/op  new allocs/op  delta
TelemeterRealData_Series/alerts2w/01DN3SK96XDAEKRB1AN30AAW6E-12        4.35M ± 0%     4.35M ± 0%   -0.00%
TelemeterRealData_Series/alerts15s/01DN3SK96XDAEKRB1AN30AAW6E-12       3.02M ± 0%     0.05M ± 0%  -98.36%
TelemeterRealData_Series/subssyncs2w/01DN3SK96XDAEKRB1AN30AAW6E-12      493k ± 0%      493k ± 0%   +0.00%
TelemeterRealData_Series/subs2w/01DN3SK96XDAEKRB1AN30AAW6E-12          11.8M ± 0%     11.8M ± 0%   -0.00%
TelemeterRealData_Series/subs15s/01DN3SK96XDAEKRB1AN30AAW6E-12         8.58M ± 0%     0.02M ± 0%  -99.78%
```

Code: https://gist.github.com/bwplotka/cbcbbcd1802181b7785da11dcc0f5cfd


Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
@bwplotka bwplotka changed the title store: Optimized common cases for time selecting tiny amount series. Avoid looking up symbols. store: Optimized common cases for time selecting tiny amount of series. Avoid looking up symbols. Dec 2, 2020
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
@bwplotka bwplotka changed the title store: Optimized common cases for time selecting tiny amount of series. Avoid looking up symbols. store: Optimized common cases for time selecting smaller amount of series. Avoid looking up symbols. Dec 2, 2020
pkg/store/bucket.go Show resolved Hide resolved
Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Copy link
Contributor

@yeya24 yeya24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Amazing optimization!

}
sort.Sort(s.lset)
}
if err := indexr.LookupLabelsSymbols(symbolizedLset, &lset); err != nil {
Copy link
Contributor

@yeya24 yeya24 Dec 3, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another idea to optimize this:

  1. we can pass s.lset and extLset to this function, if the label name is already in extLset, we don't need to decode that label value

  2. Then we can simplify line 746-756 and reduce the slice append

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, will try it out

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm what about merging this improvement and focusing on next ones in next PR?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SGTM 👍

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a simple benchmark but seems this idea is not very good.

name                                                        old time/op    new time/op    delta
BucketSeries/1000000SeriesWith1Samples/1of1000000              162ms ± 0%     205ms ± 0%  +26.23%
BucketSeries/1000000SeriesWith1Samples/10of1000000             206ms ± 0%     175ms ± 0%  -15.38%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000        3.90s ± 0%     4.54s ± 0%  +16.47%
BucketSeries/100000SeriesWith100Samples/1of10000000           13.8ms ± 0%    12.3ms ± 0%  -11.25%
BucketSeries/100000SeriesWith100Samples/100of10000000         13.4ms ± 0%    13.0ms ± 0%   -2.59%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     323ms ± 0%     356ms ± 0%  +10.39%
BucketSeries/1SeriesWith10000000Samples/1of10000000            229µs ± 0%     166µs ± 0%  -27.38%
BucketSeries/1SeriesWith10000000Samples/100of10000000          246µs ± 0%     168µs ± 0%  -31.50%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000     105ms ± 0%      75ms ± 0%  -28.75%

name                                                        old alloc/op   new alloc/op   delta
BucketSeries/1000000SeriesWith1Samples/1of1000000             62.0MB ± 0%    62.1MB ± 0%   +0.01%
BucketSeries/1000000SeriesWith1Samples/10of1000000            62.0MB ± 0%    62.0MB ± 0%   +0.00%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000       1.30GB ± 0%    1.31GB ± 0%   +1.17%
BucketSeries/100000SeriesWith100Samples/1of10000000           4.86MB ± 0%    4.86MB ± 0%   -0.00%
BucketSeries/100000SeriesWith100Samples/100of10000000         4.82MB ± 0%    4.82MB ± 0%   +0.00%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     125MB ± 0%     127MB ± 0%   +1.50%
BucketSeries/1SeriesWith10000000Samples/1of10000000            225kB ± 0%     225kB ± 0%   -0.05%
BucketSeries/1SeriesWith10000000Samples/100of10000000          225kB ± 0%     225kB ± 0%   -0.04%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000    38.0MB ± 0%    38.0MB ± 0%   -0.00%

name                                                        old allocs/op  new allocs/op  delta
BucketSeries/1000000SeriesWith1Samples/1of1000000              9.70k ± 0%     9.70k ± 0%   +0.07%
BucketSeries/1000000SeriesWith1Samples/10of1000000             9.81k ± 0%     9.81k ± 0%   -0.01%
BucketSeries/1000000SeriesWith1Samples/1000000of1000000        12.1M ± 0%     12.1M ± 0%   -0.00%
BucketSeries/100000SeriesWith100Samples/1of10000000            1.11k ± 0%     1.11k ± 0%   -0.27%
BucketSeries/100000SeriesWith100Samples/100of10000000          1.15k ± 0%     1.15k ± 0%   -0.17%
BucketSeries/100000SeriesWith100Samples/10000000of10000000     1.21M ± 0%     1.21M ± 0%   -0.00%
BucketSeries/1SeriesWith10000000Samples/1of10000000              210 ± 0%       208 ± 0%   -0.95%
BucketSeries/1SeriesWith10000000Samples/100of10000000            210 ± 0%       208 ± 0%   -0.95%
BucketSeries/1SeriesWith10000000Samples/10000000of10000000      170k ± 0%      170k ± 0%   -0.01%

bwplotka added a commit to prometheus/prometheus that referenced this pull request Dec 3, 2020
Similar optimization to what we did on Thanos: thanos-io/thanos#3531

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Copy link
Member

@brancz brancz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

very nice work!

@bwplotka bwplotka merged commit 5cb63f8 into master Dec 4, 2020
@bwplotka bwplotka deleted the decode-series-optimize branch December 4, 2020 11:02
)
for _, id := range ps {
if err := indexr.LoadedSeries(id, &lset, &chks, req); err != nil {
ok, err := indexr.LoadSeriesForTime(id, &symbolizedLset, &chks, req.SkipChunks, req.MinTime, req.MaxTime)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nitpick, but perhaps ok -> found ? Also on line 2091.

bwplotka added a commit to prometheus/prometheus that referenced this pull request Dec 4, 2020
Similar optimization to what we did on Thanos: thanos-io/thanos#3531

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
bwplotka added a commit to prometheus/prometheus that referenced this pull request Dec 4, 2020
Similar optimization to what we did on Thanos: thanos-io/thanos#3531

Signed-off-by: Bartlomiej Plotka <bwplotka@gmail.com>
Copy link
Contributor

@pracucci pracucci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good job! Diff is a bit hard to read, but comparing with previous version it LGTM.

@pracucci pracucci mentioned this pull request Jan 7, 2021
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants