pageserver: clean up ancestral layers after split, old index_part objects #7043

Closed
3 tasks done
Tracked by #6288
jcsp opened this issue Mar 7, 2024 · 1 comment
Labels: a/tech_debt (Area: related to tech debt), c/storage/pageserver (Component: storage: pageserver)

jcsp (Collaborator) commented Mar 7, 2024

Per the "Cleaning up parent-shard layers" section in #6358 -- currently after a shard split, layers from the parent shards are not deleted until the whole tenant is eventually deleted.

We should implement an occasional online scrub routine that checks which of these layers are still referenced by child shards, and deletes the ones that are not.

It likely makes sense to combine this work with cleaning up old-generation index_part.json objects, as these older objects will likely reference parent shard layers -- we should first define the criteria for cleaning up old indices, and then use the still-alive indices as the source of references for cleaning up parent layers.
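
To make the "still-alive indices as the source of references" idea concrete, here is a minimal sketch in Python. It is not the pageserver's or scrubber's actual code: `IndexPart`, `unreferenced_parent_layers`, and the key strings are hypothetical names chosen for illustration, and it assumes each live index can be reduced to the set of layer object keys it references.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Set


@dataclass(frozen=True)
class IndexPart:
    """Hypothetical stand-in for one still-alive index_part.json object."""
    shard: str
    generation: int
    layer_keys: FrozenSet[str]  # layer object keys this index references


def unreferenced_parent_layers(
    parent_layers: Iterable[str], live_indices: Iterable[IndexPart]
) -> Set[str]:
    """Return parent-shard layer keys that no live index references."""
    referenced: Set[str] = set()
    for index in live_indices:
        referenced |= index.layer_keys
    return set(parent_layers) - referenced


# Illustrative data only: two child shards that still use two parent layers.
parent_layers = {"parent/layer-a", "parent/layer-b", "parent/layer-c"}
live_indices = [
    IndexPart("child-0", 5, frozenset({"parent/layer-a", "child-0/layer-x"})),
    IndexPart("child-1", 5, frozenset({"parent/layer-c"})),
]
print(sorted(unreferenced_parent_layers(parent_layers, live_indices)))
```

The ordering matters in this sketch: stale indices have to be excluded from `live_indices` first, otherwise their references keep parent layers alive indefinitely.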

Tasks

  1. c/storage/scrubber t/feature
  2. c/storage/pageserver c/storage/scrubber t/feature
jcsp added the c/storage/pageserver and a/tech_debt labels on Mar 7, 2024
jcsp self-assigned this on May 31, 2024
jcsp added a commit that referenced this issue Jun 3, 2024
## Problem

Currently, we leave `index_part.json` objects from old generations
behind each time a pageserver restarts or a tenant is migrated. This
doesn't break anything, but it's annoying when a tenant has been around
for a long time and accumulates tens to hundreds of these.

Partially implements: #7043 

## Summary of changes

- Add a new `pageserver-physical-gc` command to `s3_scrubber`

The name is a bit of a mouthful, but I think it makes sense:
- GC is the accurate term for what we are doing here: removing data that
takes up storage but can never be accessed.
- "physical" is a necessary distinction from the "normal" GC that we do
online in the pageserver, which operates at a higher level in terms of
LSNs+layers, whereas this type of GC is purely about S3 objects.
- "pageserver" makes clear that this command deals exclusively with
pageserver data, not safekeeper.
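
As an illustration of what "physical" GC over old-generation indices might look like, the sketch below selects stale index objects from a listing of S3 keys. It rests on assumptions that this thread does not confirm: that each index key ends in `index_part.json-<generation in hex>`, and that the retention rule is "keep the newest `keep_latest` generations". The function name `stale_index_keys` and the `keep_latest` parameter are hypothetical, and the real criteria may differ (for example, by also considering object age).

```python
import re
from typing import Iterable, List

# Assumed key shape: "<prefix>/index_part.json-<generation as hex digits>".
INDEX_KEY = re.compile(r"index_part\.json-([0-9a-f]+)$")


def stale_index_keys(keys: Iterable[str], keep_latest: int = 2) -> List[str]:
    """Return index keys older than the newest `keep_latest` generations.

    Keys that do not look like generation-suffixed indices are ignored,
    so layer files and other objects are never selected for deletion.
    """
    if keep_latest < 1:
        raise ValueError("keep_latest must be at least 1")
    parsed = []
    for key in keys:
        match = INDEX_KEY.search(key)
        if match:
            parsed.append((int(match.group(1), 16), key))
    parsed.sort()  # oldest generation first
    return [key for _, key in parsed[:-keep_latest]]


# Illustrative listing only.
listing = [
    "tenant/timeline/index_part.json-00000001",
    "tenant/timeline/index_part.json-0000000a",
    "tenant/timeline/index_part.json-00000003",
    "tenant/timeline/some-layer-file",
]
print(stale_index_keys(listing))
```

Sorting by the parsed integer rather than by the key string keeps the sketch correct even if suffixes are not zero-padded to a fixed width.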

jcsp (Collaborator, Author) commented Jul 29, 2024

This will be enabled in staging here: https://github.com/neondatabase/aws/pull/1654

Then we'll let it soak for at least a week before proceeding to prod.

@jcsp jcsp closed this as completed Oct 15, 2024