Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batch adding series to query limiter to optimize locking #5505

Merged
merged 2 commits into from
Sep 14, 2023

Conversation

harry671003
Copy link
Contributor

@harry671003 harry671003 commented Aug 9, 2023

What this PR does:

When streaming data from ingesters, a lot of time can be spend on waiting for the QueryLimiter lock to be acquired. This change optimizes this a bit by allowing a batch of series to be added at once.

profile

Benchmarks

I saw some improvements in the benchmarks:

Without batching
(base) ➜  cortex git:(optimize_limiter) go test -bench=^BenchmarkQueryLimiter_AddSeries$ -benchmem -benchtime=200000x github.com/cortexproject/cortex/pkg/util/limiter -count 10          
goos: darwin
goarch: arm64
pkg: github.com/cortexproject/cortex/pkg/util/limiter
BenchmarkQueryLimiter_AddSeries-10        200000             27349 ns/op           31728 B/op        599 allocs/op
BenchmarkQueryLimiter_AddSeries-10        200000             29049 ns/op           31728 B/op        599 allocs/op
BenchmarkQueryLimiter_AddSeries-10        200000             30271 ns/op           31728 B/op        599 allocs/op
BenchmarkQueryLimiter_AddSeries-10        200000             32509 ns/op           31728 B/op        599 allocs/op
BenchmarkQueryLimiter_AddSeries-10        200000             33181 ns/op           31727 B/op        599 allocs/op
BenchmarkQueryLimiter_AddSeries-10        200000             32375 ns/op           31727 B/op        599 allocs/op
BenchmarkQueryLimiter_AddSeries-10        200000             30465 ns/op           31728 B/op        599 allocs/op
BenchmarkQueryLimiter_AddSeries-10        200000             32754 ns/op           31728 B/op        599 allocs/op
BenchmarkQueryLimiter_AddSeries-10        200000             33264 ns/op           31728 B/op        599 allocs/op
BenchmarkQueryLimiter_AddSeries-10        200000             31569 ns/op           31728 B/op        599 allocs/op
PASS
ok      github.com/cortexproject/cortex/pkg/util/limiter        63.644s
With batching
(base) ➜  cortex git:(optimize_limiter) go test -bench=^BenchmarkQueryLimiter_AddSeriesBatch$ -benchmem -benchtime=200000x github.com/cortexproject/cortex/pkg/util/limiter -count 10
goos: darwin
goarch: arm64
pkg: github.com/cortexproject/cortex/pkg/util/limiter
BenchmarkQueryLimiter_AddSeriesBatch-10           200000             11871 ns/op           31729 B/op        401 allocs/op
BenchmarkQueryLimiter_AddSeriesBatch-10           200000             10766 ns/op           31728 B/op        401 allocs/op
BenchmarkQueryLimiter_AddSeriesBatch-10           200000             10934 ns/op           31728 B/op        401 allocs/op
BenchmarkQueryLimiter_AddSeriesBatch-10           200000             11575 ns/op           31728 B/op        401 allocs/op
BenchmarkQueryLimiter_AddSeriesBatch-10           200000             11615 ns/op           31729 B/op        401 allocs/op
BenchmarkQueryLimiter_AddSeriesBatch-10           200000             10766 ns/op           31728 B/op        401 allocs/op
BenchmarkQueryLimiter_AddSeriesBatch-10           200000             11049 ns/op           31728 B/op        401 allocs/op
BenchmarkQueryLimiter_AddSeriesBatch-10           200000             10903 ns/op           31728 B/op        401 allocs/op
BenchmarkQueryLimiter_AddSeriesBatch-10           200000             10601 ns/op           31729 B/op        401 allocs/op
BenchmarkQueryLimiter_AddSeriesBatch-10           200000             11080 ns/op           31728 B/op        401 allocs/op
PASS
ok      github.com/cortexproject/cortex/pkg/util/limiter        23.073s

Which issue(s) this PR fixes:
Fixes #

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@harry671003 harry671003 changed the title Optimize query limiter by allowing to add batch of series Batch adding series to query limiter to optimize locking Aug 9, 2023
for _, s := range series {
fingerprint := client.FastFingerprint(s)
ql.uniqueSeries[fingerprint] = struct{}{}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am okay with this but I am also wondering if we can calculate the hashes first before we lock.
This requires to allocate a slice of int64 so not sure if it is worth. Probably we need some benchmark here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did see some improvement when hashing the series outside the lock. I'll implement your suggestion.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harry671003 Can you run the benchmark again? And let's also add -benchmem to show memory allocation?

Copy link
Contributor Author

@harry671003 harry671003 Aug 11, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

# old.out - Without the slice of fps
# new.out - With the slice of fps

> benchstat old.out new.out                                                                                                                              
goos: darwin
goarch: arm64
pkg: github.com/cortexproject/cortex/pkg/util/limiter
                               │   old.out   │               new.out               │
                               │   sec/op    │   sec/op     vs base                │
QueryLimiter_AddSeriesBatch-10   12.97µ ± 7%   11.39µ ± 3%  -12.20% (p=0.000 n=10)

                               │   old.out    │               new.out               │
                               │     B/op     │     B/op      vs base               │
QueryLimiter_AddSeriesBatch-10   30.20Ki ± 0%   30.98Ki ± 0%  +2.59% (p=0.000 n=10)

                               │  old.out   │              new.out              │
                               │ allocs/op  │ allocs/op   vs base               │
QueryLimiter_AddSeriesBatch-10   400.0 ± 0%   401.0 ± 0%  +0.25% (p=0.000 n=10)

@harry671003 harry671003 force-pushed the optimize_limiter branch 2 times, most recently from 3d9eaef to cfd6140 Compare August 10, 2023 21:32
Signed-off-by: 🌲 Harry 🌊 John 🏔 <johrry@amazon.com>
@harry671003
Copy link
Contributor Author

@alanprot @friedrichg Could you take a look please?

// AddSeries adds the input series and returns an error if the limit is reached.
func (ql *QueryLimiter) AddSeries(seriesLabels []cortexpb.LabelAdapter) error {
// AddSeriesBatch adds the batch of input series and returns an error if the limit is reached.
func (ql *QueryLimiter) AddSeries(series ...[]cortexpb.LabelAdapter) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small nit:

Can we receive an [][]cortexpb.LabelAdapter instaed of ...[]cortexpb.LabelAdapter to avoid copying this slice over?

From go doc:

If f is variadic with a final parameter p of type ...T, then within f the type of p is equivalent to type []T. If f is invoked with no actual arguments for p, the value passed to p is nil. Otherwise, the value passed is a new slice of type []T with a new underlying array whose successive elements are the actual arguments, which all must be assignable to T. The length and capacity of the slice is therefore the number of arguments bound to p and may differ for each call site.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct me if I'm wrong. The doc is saying that if we pass a slice into a variadic function, it'll not create a new array right?

If the final argument is assignable to a slice type []T and is followed by ..., it is passed unchanged as the value for a ...T parameter. In this case no new slice is created.

series := [][]cortexpb.LabelAdapter{s1, s2}
AddSeries(series...) // This will not create a new underlying array.

@alanprot
Copy link
Member

This looks great! Just a small nit!

LGTM.

@yeya24 yeya24 merged commit f560115 into cortexproject:master Sep 14, 2023
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants