store: Moved to our own custom posting helpers. #753
Conversation
Benchmarks (current vs. this PR): https://gist.github.com/bwplotka/b297ada1b8cc9c14087f8307233bff5a

So it seems there is not much difference in terms of performance; the new version is slightly worse. Parity check ✔️ Can spend some time to look if we can get quick gains.
```diff
@@ -1188,89 +1150,200 @@ func newBucketIndexReader(ctx context.Context, logger log.Logger, block *bucketB
 	return r
 }

-func (r *bucketIndexReader) preloadPostings() error {
-	const maxGapSize = 512 * 1024
+func (r *bucketIndexReader) lookupSymbol(o uint32) (string, error) {
```
This method will be used in the Upgrade TSDB PR.
All looks good, took a while to review/understand in depth what was happening.
The only thing I can see that I don't think is correct is the incrementing of the stats for `postingsFetched`.
```go
// on object storage.
// Found posting IDs (ps) are not strictly required to point to a valid Series, e.g. during
// background garbage collections.
func (r *bucketIndexReader) ExpandedPostings(ms []labels.Matcher) ([]uint64, error) {
```
Would be good to add a high-level overview of what a posting is just so it is more understandable for readers:
A posting is a reference (represented as a uint64) to a series reference, which in turn points to the first chunk where the series contains the matching label-value pair for a given block of data.
Thoughts?
Makes sense. Extended it a bit as well.
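To make the suggested overview concrete, here is a minimal sketch (hypothetical names and a simplified in-memory index, not the actual Thanos/TSDB code) of how an inverted index maps label pairs to postings, and how matching postings are intersected and expanded into series references:

```go
package main

import (
	"fmt"
	"sort"
)

// A posting is a reference (here a uint64) to a series in the block's index;
// the series entry in turn points at the chunks holding its samples.
// This is an illustrative sketch, not the real TSDB index layout.
type index struct {
	// postings maps a "name=value" label pair to a sorted list of series refs.
	postings map[string][]uint64
}

// expandedPostings intersects the postings lists of all required label pairs,
// returning refs of series that match every pair, in sorted order.
func (ix *index) expandedPostings(pairs []string) []uint64 {
	counts := map[uint64]int{}
	for _, p := range pairs {
		for _, ref := range ix.postings[p] {
			counts[ref]++
		}
	}
	var out []uint64
	for ref, n := range counts {
		if n == len(pairs) { // present in every list
			out = append(out, ref)
		}
	}
	sort.Slice(out, func(i, j int) bool { return out[i] < out[j] })
	return out
}

func main() {
	ix := &index{postings: map[string][]uint64{
		"job=api":   {1, 2, 5},
		"region=eu": {2, 3, 5},
	}}
	fmt.Println(ix.expandedPostings([]string{"job=api", "region=eu"})) // [2 5]
}
```

The real implementation works on sorted iterators streamed from object storage rather than in-memory slices, but the matching/intersection idea is the same.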
pkg/store/bucket.go
Outdated
```go
return nil, stats, errors.Wrap(err, "expand postings")
}
stats := &queryStats{}
stats = stats.merge(indexr.stats)
```
Could we do this after the `if len(ps) == 0` check and just return `indexr.stats` if we have `storepb.EmptySeriesSet`?
pkg/store/bucket.go
Outdated
```go
for _, m := range ms {
	matching, err := matchingLabels(r.LabelValues, m)
	if err != nil {
		return nil, errors.Wrap(err, "match postings")
```
`match postings` → `matching labels`?
pkg/store/bucket.go
Outdated
```go
_, l, err := r.dec.Postings(c)
if err != nil {
	return errors.Wrap(err, "read postings list")
// fetchPostings returns sorted slice of postings.
```
`fetchPostings` returns a sorted slice of postings that match the selected labels.
pkg/store/bucket.go
Outdated
```go
defer r.mtx.Unlock()

r.stats.postingsFetchCount++
r.stats.postingsFetched += len(ptrs)
```
`len(ptrs)` → `j - i`?
It seems to me that we are increasing it by the total number of pointers in each goroutine; therefore this would be a much bigger value than it, in fact, should be.
Or we could just increment `r.stats.postingsFetched++` in the loop below, once for each pointer.
I cannot increment it in the loop below, as the postings are not fetched yet at that point. Otherwise your comment makes sense; it's a bug, it should be `j - i`. Thanks!
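The bug discussed here can be sketched in isolation (hypothetical simplified types, not the actual Thanos code): each goroutine fetches one partition `ptrs[i:j]` of the pointers, so it must account only for its own slice length `j - i`, not `len(ptrs)`.

```go
package main

import (
	"fmt"
	"sync"
)

// queryStats is a stand-in for the real stats struct; only the
// counter relevant to this discussion is kept.
type queryStats struct {
	mtx             sync.Mutex
	postingsFetched int
}

// fetchPartitions simulates fetching pointer partitions concurrently.
// parts holds [i, j) index ranges into ptrs, one per goroutine.
func fetchPartitions(ptrs []int, parts [][2]int) *queryStats {
	stats := &queryStats{}
	var wg sync.WaitGroup
	for _, p := range parts {
		wg.Add(1)
		go func(i, j int) {
			defer wg.Done()
			// ... fetch ptrs[i:j] from object storage here ...
			stats.mtx.Lock()
			defer stats.mtx.Unlock()
			// Correct: count only this partition's pointers.
			// Adding len(ptrs) here would overcount by roughly
			// a factor of the number of partitions.
			stats.postingsFetched += j - i
		}(p[0], p[1])
	}
	wg.Wait()
	return stats
}

func main() {
	ptrs := make([]int, 10)
	// Two partitions covering all ten pointers.
	s := fetchPartitions(ptrs, [][2]int{{0, 6}, {6, 10}})
	fmt.Println(s.postingsFetched) // 10
}
```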
pkg/store/bucket.go
Outdated
```go
for _, p := range ptrs[i:j] {
	c := b[p.ptr.Start-start : p.ptr.End-start]

	_, l, err := r.dec.Postings(c)
```
`l` → `fetchedPostings`?
```go
	return nil, err
}

return index.Merge(postings...), nil
```
Would it be worth checking here if the returned value can be cast to `ErrPostings`?
Hm.. this will bubble up eventually after the expand operation. `ErrPostings` is just a postings with `Next` returning false and some `Err`. I think it's fine for now; we can check for errors early if that causes problems (it would be hard to find the cause of the error).
```go
// at the loading level.
if r.block.indexVersion >= 2 {
	for i, id := range ps {
		ps[i] = id * 16
```
Not sure I understand this: how does multiplying by 16 ensure it's padded correctly? Maybe speak offline.
That's copied 1:1 from what it was.
Basically, TSDB is now storing those references every 16 bytes. The index format changed.
Ahhhhh, okay, now I understand: we are calculating the offset. As we are saving everything as a fixed byte range, we need to calculate the offset from `id` so it points to the correct bytes 👍
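The arithmetic being discussed, sketched out on its own (illustrative numbers, hypothetical function name): with index format version >= 2, series entries are written 16-byte aligned and the index stores the aligned slot number, so the reader recovers the byte offset by multiplying the stored id by 16.

```go
package main

import "fmt"

// seriesByteOffsets converts stored series ids into byte offsets.
// In index version >= 2, ids are slot numbers into a 16-byte-aligned
// series section; in version 1 they are already byte offsets.
func seriesByteOffsets(ps []uint64, indexVersion int) []uint64 {
	if indexVersion >= 2 {
		for i, id := range ps {
			ps[i] = id * 16 // slot number -> byte offset into the series section
		}
	}
	return ps
}

func main() {
	fmt.Println(seriesByteOffsets([]uint64{1, 3, 8}, 2)) // [16 48 128]
	fmt.Println(seriesByteOffsets([]uint64{5}, 1))       // [5] (unchanged)
}
```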
Force-pushed 574b0e6 to 4d4eeac.
LGTM - just one nit on variable name spelling.
pkg/store/bucket.go
Outdated
```go
start := int64(p.start)
// We assume index does not have any ptrs that has 0 length.
lenght := int64(p.end) - start
```
`lenght` → `length`
```go
// gets created each. This way we can easily verify we got 10 chunks per series below.
id1, err := testutil.CreateBlock(dir, series[:4], 10, mint, maxt, extLset, 0)
testutil.Ok(t, err)
id2, err := testutil.CreateBlock(dir, series[4:], 10, mint, maxt, extLset, 0)
```
Aren't those blocks overlapping? And if yes, isn't that an issue?
They have different external labels 😉
but thanks for being careful here!
Of course 🤦♂️, sorry, and thanks for pointing it out.
pkg/store/bucket_e2e_test.go
Outdated
```go
rctx, rcancel := context.WithTimeout(ctx, 30*time.Second)
defer rcancel()
testutil.Ok(t, runutil.Retry(100*time.Millisecond, rctx.Done(), func() error {
	if store.numBlocks() < 6 {
```
Hm.. it might be worth noting above that we create 2 blocks for each of 3 time slots?
Or maybe even make this a variable and increment it each time you upload a block, so it's not hardcoded?
This is necessary to support the newest TSDB. The newest optimization (http://github.com/prometheus/tsdb/pull/486) makes tsdb.PostingForMatcher impossible to use. This also hopefully reduces the amount of code to understand, as we don't need to fit into index.Postings even though we don't need streaming for now. This should make the code less complex and more readable (in comparison to the previous `lazyPostings` implementation). Signed-off-by: Bartek Plotka <bwplotka@gmail.com>
Force-pushed 4d4eeac to 9e12e0b.
This is necessary to support the newest TSDB, so it's a blocker for: #704
The newest optimization (http://github.com/prometheus/tsdb/pull/486) makes tsdb.PostingForMatcher impossible to use.
This also hopefully reduces the amount of code to understand, as we don't need to fit into index.Postings even though we don't need streaming for now. This should make the code less complex and more readable (in comparison to the previous `lazyPostings` implementation). Benchmarks to come.
Signed-off-by: Bartek Plotka bwplotka@gmail.com