pageserver: clean up stray VM pages #9927

Closed
erikgrinaker opened this issue Nov 28, 2024 · 3 comments
Labels: c/storage/pageserver (Component: storage: pageserver)

erikgrinaker (Contributor) commented Nov 28, 2024

In #9855 and #9914, we saw that ClearVmBits updates may be applied on shards that aren't responsible for VM pages. This leaves stray writes to these keys, and the resulting values may be incomplete, since those shards will not have seen the WAL records for the pages. We should clean up these keys to avoid them cropping up again later and causing problems.
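For context, a minimal sketch of the ingest-side situation (hypothetical: vm_page_key and apply_clear_vm_bits are illustrative placeholders, and only is_key_local corresponds to the ShardIdentity method quoted further down). Every shard sees the ClearVmBits delta, but only the shard that owns the VM page key also ingests the page images needed to reconstruct it:

// Hypothetical sketch, not the actual pageserver ingest path.
fn ingest_clear_vm_bits(shard: &ShardIdentity, rel: RelTag, heap_blkno: u32) {
    // Key of the visibility-map page covering this heap block (placeholder helper).
    let vm_key = vm_page_key(rel, heap_blkno);
    if shard.is_key_local(&vm_key) {
        // The owning shard also ingests the VM page images, so this delta can
        // later be reconstructed against a base image.
        apply_clear_vm_bits(&vm_key, heap_blkno);
    }
    // If a stray shard applies the delta anyway (the current behaviour this
    // issue is about), it writes a key it can never fully reconstruct.
}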

Bonus question: if we only apply ClearVmBits but not the page images on stray shards, why don't these error out during compaction when we presumably try to reconstruct the VM pages?

erikgrinaker (Contributor, Author) commented Nov 29, 2024

This is already done during compaction, which removes relation blocks that don't belong to the shard. This also explains why these stray keys didn't cause compaction failures.

/// Return true if the key should be discarded if found in this shard's
/// data store, e.g. during compaction after a split.
///
/// Shards _may_ drop keys which return false here, but are not obliged to.
pub fn is_key_disposable(&self, key: &Key) -> bool {
    if key_is_shard0(key) {
        // Q: Why can't we dispose of shard0 content if we're not shard 0?
        // A1: because the WAL ingestion logic currently ingests some shard 0
        //     content on all shards, even though it's only read on shard 0. If we
        //     dropped it, then subsequent WAL ingest to these keys would encounter
        //     an error.
        // A2: because key_is_shard0 also covers relation size keys, which are written
        //     on all shards even though they're only maintained accurately on shard 0.
        false
    } else {
        !self.is_key_local(key)
    }
}
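For illustration, a rough sketch of how a compaction pass can use this for best-effort cleanup (hypothetical helper: keys stands in for whatever key stream compaction walks, and only is_key_disposable is the method quoted above):

// Hypothetical sketch of best-effort cleanup while rewriting layers during compaction.
fn retained_keys<'a>(
    shard: &'a ShardIdentity,
    keys: impl Iterator<Item = Key> + 'a,
) -> impl Iterator<Item = Key> + 'a {
    keys.filter(move |key| {
        // Stray VM page keys on a non-owning shard are not shard-0 content,
        // so is_key_disposable returns true for them and they are dropped here.
        !shard.is_key_disposable(key)
    })
}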

erikgrinaker (Contributor, Author) commented:

@jcsp indicated in #9786 (comment) that it's probably too expensive to force a recompaction of all tenants, since it requires downloading all layers from S3. So we'll just do this best-effort during compactions.

Let me know if I misinterpreted you @jcsp.

jcsp (Collaborator) commented Dec 3, 2024

Bonus question: if we only apply ClearVmBits but not the page images on stray shards, why don't these error out during compaction when we presumably try to reconstruct the VM pages?

Speaking from memory: the ShardedRange that we use to build the list of pages to include in compaction excludes them.
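Roughly the idea, as a hypothetical sketch (blocks_in_range is a placeholder rather than the actual ShardedRange API):

// Hypothetical sketch: when planning which pages to reconstruct/image during
// compaction, only locally-owned keys are enumerated, so stray VM pages on a
// non-owning shard are never reconstructed and therefore never fail.
fn pages_to_reconstruct(shard: &ShardIdentity, range: std::ops::Range<Key>) -> Vec<Key> {
    blocks_in_range(range) // placeholder for ShardedRange-style enumeration
        .filter(|key| shard.is_key_local(key))
        .collect()
}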
