boltdb shipper index list cache improvements #6054

sandeepsukhani · 2022-04-29T11:32:02Z

What this PR does / why we need it:
To reduce the number of list calls made to the object store when using boltdb-shipper, we have a cache in place. The cache stays valid for 1 minute and is refreshed when a list call finds that it needs to be rebuilt. In a large cluster the list call can take a couple of seconds to finish, which adds to query latency when the index is downloaded at query time.

This PR adds a parameter to list calls to skip the list cache and load the object list directly from the object store. We skip the cache only for query-time index downloads, which are not too common: we see at most 8k query-time download operations a day in our largest Loki cluster.
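As a rough illustration of the cache and the bypass parameter described above, here is a minimal sketch. It is not the code from this PR: `cachedLister`, `listFunc`, `defaultTTL` and the `bypassCache` flag are hypothetical names, and leaving the cache untouched on a bypassed call is an assumption of the sketch.

```go
package main

import (
	"sync"
	"time"
)

// defaultTTL mirrors the 1 minute validity mentioned in the description.
const defaultTTL = time.Minute

// listFunc stands in for an object-store list call (hypothetical signature).
type listFunc func() ([]string, error)

// cachedLister caches the result of list for ttl.
type cachedLister struct {
	mu      sync.Mutex
	list    listFunc
	ttl     time.Duration
	objects []string
	builtAt time.Time
}

func newCachedLister(list listFunc, ttl time.Duration) *cachedLister {
	return &cachedLister{list: list, ttl: ttl}
}

// List returns the object names. With bypassCache set it goes straight to
// the object store and leaves the cache untouched (an assumption of this
// sketch). Otherwise it serves the cached listing, rebuilding it first if
// it was never built or has expired.
func (c *cachedLister) List(bypassCache bool) ([]string, error) {
	if bypassCache {
		return c.list()
	}

	c.mu.Lock()
	defer c.mu.Unlock()

	if c.builtAt.IsZero() || time.Since(c.builtAt) >= c.ttl {
		objects, err := c.list()
		if err != nil {
			return nil, err
		}
		c.objects, c.builtAt = objects, time.Now()
	}
	return c.objects, nil
}
```

In this sketch a query-time download would call `List(true)`, while periodic sync would keep calling `List(false)` and benefit from the cache.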

The other change in this PR detects staleness of the list cache. Before explaining the problem, let us keep a few things in mind:

While running compaction, the compactor first uploads the newly created (compacted) file and then removes the source files. So a list call can see the newly compacted file alongside the old source files while the compactor is still removing them.
When we see a 404 error while downloading the index, we just ignore it, assuming the compactor removed the file during compaction.
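The two points above can be sketched with a toy in-memory object store. Everything here (`memStore`, `compact`, `download`, `errNotFound`) is hypothetical and only illustrates the ordering and the ignore-on-404 behaviour; it is not the compactor or shipper code.

```go
package main

import (
	"errors"
	"sort"
)

// errNotFound stands in for the object store's 404 response.
var errNotFound = errors.New("object not found")

// memStore is a toy in-memory object store: object name -> contents.
type memStore map[string]string

// list returns all object names in sorted order.
func (s memStore) list() []string {
	names := make([]string, 0, len(s))
	for name := range s {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

func (s memStore) get(name string) (string, error) {
	data, ok := s[name]
	if !ok {
		return "", errNotFound
	}
	return data, nil
}

// compact uploads the compacted file first and only then deletes the
// sources. midway, if non-nil, runs between the two steps so a caller can
// observe the intermediate listing.
func (s memStore) compact(sources []string, compacted string, midway func()) {
	merged := ""
	for _, name := range sources {
		merged += s[name]
	}
	s[compacted] = merged
	if midway != nil {
		midway()
	}
	for _, name := range sources {
		delete(s, name)
	}
}

// download fetches every listed file, skipping those that are gone, and
// reports how many were skipped.
func download(s memStore, names []string) (map[string]string, int, error) {
	files := make(map[string]string, len(names))
	skipped := 0
	for _, name := range names {
		data, err := s.get(name)
		if errors.Is(err, errNotFound) {
			skipped++ // assume the compactor removed it
			continue
		}
		if err != nil {
			return nil, 0, err
		}
		files[name] = data
	}
	return files, skipped, nil
}
```

A listing captured through `midway` contains both the compacted file and its sources, and downloading from that listing after compaction finishes skips only the sources.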

Now the problem: if compaction happens just after we cache the list of objects, the sync operation on the recently compacted table tries to download the old source files that were compacted away, and we ignore the missing files as mentioned above. This means we download neither the source index files nor the new compacted file, which is still not known to us due to the stale cache. This PR adds a check in the sync code to see whether we skipped downloading all the files; if so, we retry the sync after force-refreshing the index list cache.
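The staleness check can be pictured roughly as follows. This is a simplified sketch under stated assumptions, not the sync code from the PR: `indexStore`, `syncTable`, `downloadListed` and the `forceRefresh` flag are hypothetical, and the sketch retries exactly once.

```go
package main

import (
	"errors"
	"sort"
)

// errNotFound stands in for the object store's 404 response.
var errNotFound = errors.New("object not found")

// indexStore is a hypothetical view of one table: the objects really in
// the store, plus a possibly stale cached listing of them.
type indexStore struct {
	objects map[string]string
	cached  []string
}

// list returns the cached listing, rebuilding it from the store first when
// forceRefresh is set or nothing has been cached yet.
func (s *indexStore) list(forceRefresh bool) []string {
	if forceRefresh || s.cached == nil {
		fresh := make([]string, 0, len(s.objects))
		for name := range s.objects {
			fresh = append(fresh, name)
		}
		sort.Strings(fresh)
		s.cached = fresh
	}
	return s.cached
}

func (s *indexStore) get(name string) (string, error) {
	data, ok := s.objects[name]
	if !ok {
		return "", errNotFound
	}
	return data, nil
}

// downloadListed fetches the listed files, ignoring missing ones. It also
// reports whether there were files to fetch and every one was skipped.
func downloadListed(s *indexStore, names []string) (map[string]string, bool, error) {
	files := make(map[string]string, len(names))
	for _, name := range names {
		data, err := s.get(name)
		if errors.Is(err, errNotFound) {
			continue
		}
		if err != nil {
			return nil, false, err
		}
		files[name] = data
	}
	skippedAll := len(names) > 0 && len(files) == 0
	return files, skippedAll, nil
}

// syncTable syncs from the cached listing. If every listed file was
// skipped, it treats the cache as stale and retries once after a forced
// refresh of the listing.
func syncTable(s *indexStore) (map[string]string, error) {
	files, skippedAll, err := downloadListed(s, s.list(false))
	if err != nil {
		return nil, err
	}
	if skippedAll {
		files, _, err = downloadListed(s, s.list(true))
	}
	return files, err
}
```

With a cached listing of only the compacted-away source files, the first pass skips everything, and the retry picks up the compacted file from the refreshed listing.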

Checklist

Tests updated

cyriltovena

LGTM

* bypass index list cache when doing query time downloading of index
* detect and refresh stale index list cache during sync

(cherry picked from commit 2758dc6)

Co-authored-by: Sandeep Sukhani <sandeep.d.sukhani@gmail.com>

sandeepsukhani added 2 commits April 29, 2022 16:44

bypass index list cache when doing query time downloading of index

1ba1671

detect and refresh stale index list cache during sync

f00c35f

sandeepsukhani requested a review from a team as a code owner April 29, 2022 11:32

pull-request-size bot added the size/L label Apr 29, 2022

sandeepsukhani mentioned this pull request Apr 29, 2022

improve index list cache in boltdb-shipper to avoid refreshing cache at query time #6047

Closed

1 task

cyriltovena approved these changes May 2, 2022

sandeepsukhani merged commit 2758dc6 into grafana:main May 2, 2022

trevorwhitney added the backport k95 label May 3, 2022

grafanabot mentioned this pull request May 3, 2022

[k95] boltdb shipper index list cache improvements #6091

Merged

sandeepsukhani added a commit to sandeepsukhani/loki that referenced this pull request Jun 6, 2022

copy boltdb-shipper cache changes from PR grafana#6054 to generic ind…

a2ffdbb

…ex shipper

sandeepsukhani mentioned this pull request Jun 6, 2022

copy boltdb-shipper cache changes from PR #6054 to generic index shipper #6316

Merged

1 task

sandeepsukhani added a commit that referenced this pull request Jun 6, 2022

copy boltdb-shipper cache changes from PR #6054 to generic index ship…

9a2df5a

…per (#6316)

sandeepsukhani added a commit that referenced this pull request Jun 6, 2022

copy boltdb-shipper cache changes from PR #6054 to generic index ship…

92a8b63

…per (#6316)

sandeepsukhani added a commit that referenced this pull request Jun 6, 2022

copy boltdb-shipper cache changes from PR #6054 to generic index ship…

6565a48

…per (#6316)

This was referenced Jun 28, 2022

dannykopping/remove cache stats dannykopping/loki#13

Closed

dannykopping/remove cache stats dannykopping/loki#14

Draft
