Compact: Apply retention before compaction #5766
Conversation
```go
		return errors.Wrap(err, "sync before retention")
	}

	if err := compact.ApplyRetentionPolicyByResolution(ctx, logger, bkt, sy.Metas(), retentionByResolution, compactMetrics.blocksMarked.WithLabelValues(metadata.DeletionMarkFilename, "")); err != nil {
```
One concern I have is that some raw blocks might be beyond the raw retention but can still be downsampled into 5m blocks that are within the 5m retention period. Changing the order would force users to configure a longer retention just to get those blocks downsampled.
@yeya24 Yeah, this makes sense. So basically, for this to be safe, ApplyRetentionPolicyByResolution would also need to check whether a block has already been downsampled, and only apply retention once it has.
However, that defeats the point with the current implementation, since downsampling only occurs after compaction. So we are back where we started, unless we also perform downsampling before compaction and add a check in the downsampler that ensures a block is at the right compaction level before downsampling it.
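To make that guard concrete, here is a minimal sketch of the kind of check described above. downsampledCoverageExists is a hypothetical helper, not Thanos API; only the metadata.Meta fields and the downsample resolution constants are real:

```go
package sketch

import (
	"github.com/oklog/ulid"
	"github.com/thanos-io/thanos/pkg/block/metadata"
	"github.com/thanos-io/thanos/pkg/compact/downsample"
)

// downsampledCoverageExists is a hypothetical helper (not Thanos API). It
// reports whether a raw block's time range is already covered by a 5m block,
// which is the kind of check ApplyRetentionPolicyByResolution would need
// before it could safely run ahead of compaction and downsampling.
func downsampledCoverageExists(m *metadata.Meta, metas map[ulid.ULID]*metadata.Meta) bool {
	if m.Thanos.Downsample.Resolution != downsample.ResLevel0 {
		// Not a raw block: retention by resolution already applies as usual.
		return true
	}
	for _, other := range metas {
		if other.Thanos.Downsample.Resolution == downsample.ResLevel1 &&
			other.MinTime <= m.MinTime && other.MaxTime >= m.MaxTime {
			// The raw data is covered by an existing 5m block.
			return true
		}
	}
	return false
}
```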
Moving retention to the beginning could also mistakenly delete blocks that are expected to be compacted. The retention calculation currently uses a block's max time, so after compaction we may keep the merged block where we would have deleted the small source blocks individually.
Correctness-wise, I don't think this is safe to do.
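A small illustration of that max-time rule; the helper below is a sketch of the rule, not the actual Thanos code:

```go
package sketch

import "time"

// markedForDeletion sketches the max-time rule that
// ApplyRetentionPolicyByResolution applies (illustrative, not the real code):
// a block is marked once its max time falls before the retention cutoff.
func markedForDeletion(blockMaxTime time.Time, retention time.Duration) bool {
	cutoff := time.Now().Add(-retention)
	return blockMaxTime.Before(cutoff)
}
```

For example, with a 30-day retention and three adjacent source blocks whose max times are 31, 30.5 and 29.5 days old: running retention first marks the older two for deletion, while compacting first produces one block whose max time is 29.5 days old, which is kept in full. The two orderings disagree about what survives.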
I wonder if we could handle this case by simulating all compaction + downsampling operations and, if the resulting block would fall outside of retention, marking its sources for deletion immediately.
The specific case this would help with: lowering retention to purge the backlog. It would come at the cost of additional complexity, though, so I'm not sure it's worth it.
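For illustration, the core of that simulation could look roughly like this; wouldExpireAfterCompaction is a hypothetical helper, and the real planner/grouper wiring is omitted:

```go
package sketch

// wouldExpireAfterCompaction sketches the simulation idea: if even the merged
// output of a planned compaction group would fall outside retention, its
// source blocks could be marked for deletion up front. groupMaxTimes holds
// the max times (in milliseconds) of the blocks a planner would merge;
// cutoffMillis is now minus the retention period.
func wouldExpireAfterCompaction(groupMaxTimes []int64, cutoffMillis int64) bool {
	var mergedMaxTime int64
	for _, t := range groupMaxTimes {
		if t > mergedMaxTime {
			mergedMaxTime = t
		}
	}
	// A compacted block inherits the newest max time of its sources, so this
	// is exactly the value the max-time retention rule would later evaluate.
	return mergedMaxTime < cutoffMillis
}
```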
If it is for purging the backlog, we already have the bucket tool retention. Does that satisfy your use case?
Ahh, TIL about bucket tool retention. Though I suspect it suffers from the same issue we discussed here: it does not take downsampling into account, so we run the risk of purging blocks that have not yet been downsampled.
In other words, using that tool is unsafe for the same reason that this proposed patch is unsafe.
Right, it is the same as this patch, so you use it at your own risk.
If you agree it is the same, then do we still need this patch? If you have other things to improve, we can discuss them on issue #1708
Agreed, I think we can close this. 👍
The current behaviour of the compactor results in lots of unnecessary work being done if there is a compaction backlog.
Description
AFAICT the compactor will keep compacting until the entire backlog is cleared. Only then will it start downsampling, and only once downsampling is complete will it apply retention.
This means we spend a lot of work compacting blocks that may already be outside the retention window. The progress calculator determines which blocks would be marked for deletion, but since the retention code never gets to run, we never actually perform that marking.
And since the compactor only skips blocks that actually were marked for deletion, we don't get to skip these would-be-deleted blocks. A minimal sketch of this control flow follows below.
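Here is a hedged, heavily simplified sketch of that ordering; the pipeline type and its fields are hypothetical stand-ins, not the actual cmd/thanos/compact.go code:

```go
package sketch

import "context"

// pipeline models the current ordering with hypothetical stand-ins: compact
// reruns groups until no backlog remains, and only then do downsampling and
// retention get a turn.
type pipeline struct {
	compact    func(context.Context) error
	downsample func(context.Context) error
	retention  func(context.Context) error
}

func (p pipeline) runOnce(ctx context.Context) error {
	// With a large backlog this first call can run for a very long time,
	// during which retention is never applied.
	if err := p.compact(ctx); err != nil {
		return err
	}
	if err := p.downsample(ctx); err != nil {
		return err
	}
	return p.retention(ctx)
}
```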
Analysis
- thanos/cmd/thanos/compact.go, line 423 in 09ddcc2
- thanos/cmd/thanos/compact.go, line 462 in 09ddcc2
- thanos/pkg/compact/compact.go, line 1231 in 09ddcc2
- shouldRerunGroup will be true: thanos/pkg/compact/compact.go, line 1250 in 09ddcc2
- thanos/pkg/compact/compact.go, lines 1351 to 1353 in 09ddcc2
A simplified sketch of the rerun loop driven by shouldRerunGroup follows below.
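This paraphrases the loop around pkg/compact/compact.go line 1250 as a single-threaded sketch; the real BucketCompactor.Compact uses worker goroutines and re-syncs metadata each pass, and the group interface below is illustrative:

```go
package sketch

import "context"

// group stands in for a Thanos compaction group; Compact reports
// shouldRerun=true while the group still has work left.
type group interface {
	Compact(ctx context.Context) (shouldRerun bool, err error)
}

// compactAll paraphrases the outer loop: groups are re-run for as long as any
// of them reports more work, so code placed after this call is starved while
// a backlog exists.
func compactAll(ctx context.Context, groups []group) error {
	for {
		finishedAllGroups := true
		for _, g := range groups {
			shouldRerun, err := g.Compact(ctx)
			if err != nil {
				return err
			}
			if shouldRerun {
				finishedAllGroups = false
			}
		}
		if finishedAllGroups {
			return nil
		}
	}
}
```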
Proposal
This is a minimal change that moves the retention call before the compaction one. It ensures that, at least upon compactor process restart, we actually mark old blocks for deletion.
We will then still go into the same loop, so no additional retention is applied until either the backlog clears or the process is restarted.
This is not perfect; some more concurrency or interleaving of compaction and retention would be desirable. But it is already quite a big improvement over the status quo.
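In terms of the earlier sketch, the proposed ordering would be (again with hypothetical stand-ins, not the patched code itself):

```go
package sketch

import "context"

// runOnceReordered shows the proposed ordering: retention first, then the
// (long) compaction pass, then downsampling. Retention still fires at most
// once per pass, which is the limitation noted above.
func runOnceReordered(ctx context.Context, retention, compact, downsample func(context.Context) error) error {
	if err := retention(ctx); err != nil {
		return err
	}
	if err := compact(ctx); err != nil {
		return err
	}
	return downsample(ctx)
}
```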
Changes
Verification
TBD.