Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do tenant manifest validation with index-part #10007

Merged
merged 5 commits into from
Dec 11, 2024
Merged

Conversation

arpad-m
Copy link
Member

@arpad-m arpad-m commented Dec 4, 2024

This adds some validation of invariants that we want to uphold wrt the tenant manifest and index_part.json:

  • the data the manifest has about a timeline must match with the data in index_part.json. It might actually change, e.g. when we do reparenting during detach ancestor, but that requires the timeline to be unoffloaded, i.e. removed from the manifest.
  • any timeline mentioned in index part, must, if present, be archived. If we unarchive, we first update the tenant manifest to unoffload, and only then update index part. And one needs to archive before offloading.
  • it is legal for timelines to be mentioned in the manifest but have no index_part: this is a temporary state visible during deletion of the timeline. if the pageserver crashed, an attach of the tenant will clean the state up.
  • it is also legal for offloaded timelines to have an ancestor_retain_lsn of None while having an ancestor_timeline_id. This is for the to-be-added flattening functionality: the plan is to set former to None if we have flattened a timeline.

follow-up of #9942
part of #8088

@arpad-m arpad-m requested a review from problame December 4, 2024 13:13
@arpad-m arpad-m requested a review from a team as a code owner December 4, 2024 13:13
Copy link

github-actions bot commented Dec 4, 2024

7077 tests run: 6752 passed, 0 failed, 325 skipped (full report)


Flaky tests (6)

Postgres 17

Postgres 15

Postgres 14

Code coverage* (full report)

  • functions: 31.4% (8329 of 26533 functions)
  • lines: 47.7% (65545 of 137448 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
ba19374 at 2024-12-09T21:38:22.039Z :recycle:

@arpad-m arpad-m force-pushed the arpad/scrubber_manifests branch from df62cb3 to 4d5898a Compare December 4, 2024 18:44
Copy link
Contributor

@problame problame left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm skeptical that the checks in validate_index_part_with_offloaded are race-free because we can't fetch the manifest and the index part atomically.

For example, the following sequence

scrubber: fetches manifest, has timeline as archived
pageserver: timeline gets unarchived, uploads index part with archived_at=None
scrubber: fetches index part
scrubber: runs validate_index_part_with_offloaded
scrubber: logs Timeline offloaded in manifest but not archived in index-part

I already complained about the messiness of the tenants.map_ok(|...| { ... } inside pageserver_physical_gc in the previous PR that added tenant manifest validation to the scrubber.

I think it's really getting out of hand - there must be a cleaner way to express what we're doing here -,-.

Not asking you to solve it though, but, maybe raise it in #rust once this PR is merged, asking for prettification from the rust async gurus.

@arpad-m
Copy link
Member Author

arpad-m commented Dec 5, 2024

I'm skeptical that the checks in validate_index_part_with_offloaded are race-free because we can't fetch the manifest and the index part atomically.

hmm yeah good point. What about a HEAD request? If we discover any errors, we could check if the manifest is still unchanged (comparing etag, last-modified, etc) since the index part has been downloaded. If the manifest has changed, we either demote it to info level, or suppress it entirely.

I think it's really getting out of hand - there must be a cleaner way to express what we're doing here -,-.

Indeed it's quite a mess, I'm sorry. I'm also a fan of imperative programming over functional programming because you get into precisely this situation if you want to move a little bit of data around in a way that is a little bit more custom than simple map.

There is advantages though: stuff like parallel execution is quite easy: we have both timeline and tenant level multi threading here, but it's also bounded by the number of configured threads. So if you invoke the scrubber with a single tenant that has many timelines it still supports parallelism.

I think that's pretty cool and a property I wanted to preserve. getting that right if done manually is pretty hard and there is lots of subtle bugs that can show up, as in you can't just have it spawn to an unbounded channel/executor, potentially growing ram, you need to have backpressure, etc etc. Maybe with a semaphore? idk.

It might be worth a shot to try applying macros from async_stream to this code. I can try that after this PR is merged.

@arpad-m arpad-m force-pushed the arpad/scrubber_manifests branch from 4d5898a to 4d679ab Compare December 7, 2024 00:59
@arpad-m
Copy link
Member Author

arpad-m commented Dec 7, 2024

If we discover any errors, we could check if the manifest is still unchanged (comparing etag, last-modified, etc) since the index part has been downloaded. If the manifest has changed, we either demote it to info level, or suppress it entirely.

I've pushed a commit to do this. A HEAD request isn't enough. One needs to do a listing of all generations: the "new" manifest could be of a newer generation than the manifest we've had previously.

@arpad-m arpad-m requested a review from problame December 7, 2024 01:02
Copy link
Contributor

@problame problame left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem with checking last_modified is its relatively coarse resolution (milliseconds).
I guess it's more of a theoretical problem, but still, kind of dissatisfying.

scrubber: fetches manifest, has timeline as archived_at=T1
pageserver: timeline gets unarchived, uploads index part with archived_at=None
scrubber: fetches index part observes archived_at=None
scrubber: runs validate_index_part_with_offloaded
scrubber: validate fails with logs index-part archived_at={} differs from manifest archived_at={}
scrubber: decides to fetch mainfest again
pageserver: archives timeline at T2, updates both index part and manifest
scrubber: fetches manifest again, observes archived_at=T2
scrubber: second validation would fail again, but last_modified very likely mismatches  

I think a common on the last_modified comparison being imperfect but practically probably sufficient is appropriate.

@arpad-m
Copy link
Member Author

arpad-m commented Dec 11, 2024

The problem with checking last_modified is its relatively coarse resolution (milliseconds).

Actually it's worse, it's only a resolution of seconds. You are right, we should maybe use etags instead. We have etags in the remote_storage API already, just not for ListingObject. I think it would be good for a followup.

@arpad-m arpad-m added this pull request to the merge queue Dec 11, 2024
Merged via the queue into main with commit 7fa986b Dec 11, 2024
83 checks passed
@arpad-m arpad-m deleted the arpad/scrubber_manifests branch December 11, 2024 20:11
github-merge-queue bot pushed a commit that referenced this pull request Jan 20, 2025
This simplifies the code in `pageserver_physical_gc` a little bit after
the feedback in #10007 that the code is too complicated.

Most importantly, we don't pass around `GcSummary` any more in a
complicated fashion, and we save on async stream-combinator-inception in
one place in favour of `try_stream!{}`.

Follow-up of #10007
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants