
Add support for caching based on compaction level and block age #805

Merged — 15 commits into grafana:main, Aug 6, 2021

Conversation

@annanay25 (Contributor) commented Jul 7, 2021

What this PR does:
Adds support for caching based on compaction level and max block age. Ideally, we want to cache the bloom filters of blocks that will be around the longest — that is, blocks that quickly reach higher compaction levels.

Which issue(s) this PR fixes:
Fixes #na!

Checklist

  • Tests updated
  • Documentation added
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX]

@annanay25 annanay25 marked this pull request as ready for review July 16, 2021 15:09
@annanay25 (Contributor, Author)

There's one last thing left to implement: remove any caching while writing objects at the ingester. At the compactor we can have a similar shouldCache function to the one implemented for the querier.

@joe-elliott (Member)

Don't we need to merge #818 first before this?

remove any caching while writing objects at the ingester

Why wouldn't we cache objects written by the ingester?

@annanay25 (Contributor, Author)

Don't we need to merge #818 first before this?

That's right. 818 is cleaned up and ready to review.

Why wouldn't we cache objects written by the ingester?

My bad, I didn't mean remove caching entirely: the write methods on the ingester/compactor will use ShouldCache() to decide whether the objects should be cached.

docs/tempo/website/operations/caching.md (resolved)
tempodb/tempodb.go (resolved)
Comment on lines +104 to +105
uncachedReader backend.Reader
uncachedWriter backend.Writer
Member

It would be nice if we could shield off direct access to r, w, uncachedReader and uncachedWriter, so that callers are always forced to go through getReaderForBlock and getWriterForBlock. Unfortunately we can't hide private fields because everything is in the same package.

I think we should either add a warning comment or combine these fields into a new struct that knows what to cache.

When I search for usages of r and w, I still find some code using them directly (i.e. without getXxxForBlock). I don't know if this is okay.

r:

  • compactor.go:154 (compact)
  • compactor.go:160 (compact)
  • tempodb.go:208 (CompleteBlock)

w:

  • compactor.go:251 (appendBlock)
  • tempodb.go:208 (CompleteBlock)
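One way to realize the "combine these fields into a new struct that knows what to cache" suggestion could look like the following sketch (all names — `cacheAwareBackend`, `getReaderForBlock`, and the minimal `Reader`/`Writer` interfaces — are hypothetical, not Tempo's code; and as noted above, Go cannot enforce the encapsulation within the same package, so this only makes call sites more disciplined):

```go
package main

import "fmt"

// Reader and Writer are minimal stand-ins for Tempo's backend interfaces.
type Reader interface{ Name() string }
type Writer interface{ Name() string }

type namedReader struct{ name string }

func (r namedReader) Name() string { return r.name }

type namedWriter struct{ name string }

func (w namedWriter) Name() string { return w.name }

// cacheAwareBackend groups the cached and uncached pairs behind accessors,
// so call sites naturally go through the ShouldCache decision instead of
// touching r/w directly.
type cacheAwareBackend struct {
	cachedReader   Reader
	cachedWriter   Writer
	uncachedReader Reader
	uncachedWriter Writer
	shouldCache    func(compactionLevel uint8) bool
}

func (b *cacheAwareBackend) getReaderForBlock(level uint8) Reader {
	if b.shouldCache(level) {
		return b.cachedReader
	}
	return b.uncachedReader
}

func (b *cacheAwareBackend) getWriterForBlock(level uint8) Writer {
	if b.shouldCache(level) {
		return b.cachedWriter
	}
	return b.uncachedWriter
}

func main() {
	b := &cacheAwareBackend{
		cachedReader:   namedReader{"cached"},
		cachedWriter:   namedWriter{"cached"},
		uncachedReader: namedReader{"uncached"},
		uncachedWriter: namedWriter{"uncached"},
		shouldCache:    func(level uint8) bool { return level >= 2 },
	}
	fmt.Println(b.getReaderForBlock(3).Name()) // cached
	fmt.Println(b.getWriterForBlock(1).Name()) // uncached
}
```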

@mdisibio (Contributor) commented Aug 6, 2021
Some cases are unavoidable, such as compactor.go:154 loading the block meta from the backend, so the fields will have to remain accessible. compactor.go:160 and :251 are ok because they are iterating or flushing data, which does not involve the bloom filter. tempodb.go:208 actually looks like obsolete code only used by a querier test; the ingester used to call this when iterating the wal, but now it calls CompleteBlockWithBackend.

Member

Why is compactor.go:154 unavoidable? Using r directly means we read through the cache; what's the difference from using the reader we get from getReaderForBlock?

Contributor

You're right. I was thinking there would be cases where the meta is not yet available — i.e. a direct read is needed to fetch it — but in this case the compactor already has it.


@kvrhdn (Member) left a comment

Overall the changes look good and are easy to follow. It's fine by me to merge now and address the comments later.

@mdisibio mdisibio merged commit db1209f into grafana:main Aug 6, 2021
@annanay25 annanay25 deleted the caching-strategy branch August 9, 2021 06:56