Delete tenant's data from s3 #4855

Merged 29 commits on Aug 10, 2023
Contributor

@LizardWizzard LizardWizzard commented Jul 31, 2023

Summary of changes

For context see https://github.com/neondatabase/neon/blob/main/docs/rfcs/022-pageserver-delete-from-s3.md

Create a flow to delete a tenant's data from the pageserver. The approach closely mimics the existing timeline deletion, which was implemented mostly in #4384 and followed up in #4552.

For remaining deletion-related issues, consult the deletion project: https://github.com/orgs/neondatabase/projects/33

resolves #4250
resolves #3889

Member

@koivunej koivunej left a comment

First round of questions, suggestions

pageserver/src/tenant.rs (outdated, resolved)
pageserver/src/tenant/delete.rs (outdated, resolved)
pageserver/src/tenant/delete.rs (outdated, resolved)
pageserver/src/tenant/delete.rs (outdated, resolved)
pageserver/src/tenant/delete.rs (outdated, resolved)
pageserver/src/tenant/timeline/delete.rs (outdated, resolved)
pageserver/src/tenant/timeline/delete.rs (outdated, resolved)
test_runner/fixtures/pageserver/utils.py (outdated, resolved)
pageserver/src/tenant.rs (outdated, resolved)
LizardWizzard and others added 2 commits August 1, 2023 17:49
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
Co-authored-by: Joonas Koivunen <joonas@neon.tech>
@github-actions

github-actions bot commented Aug 1, 2023

1584 tests run: 1508 passed, 0 failed, 76 skipped (full report)


The comment gets automatically updated with the latest test results
6bca75a at 2023-08-10T15:25:09.659Z :recycle:

@LizardWizzard
Contributor Author

Thanks for the review! I resolved some of the comments so the remaining ones are easier to see; feel free to unresolve if needed.

@LizardWizzard
Contributor Author

Existing tests seem to pass; I'll continue by adding more deletion-specific ones.

@LizardWizzard LizardWizzard marked this pull request as ready for review August 7, 2023 15:39
@LizardWizzard LizardWizzard requested a review from a team as a code owner August 7, 2023 15:39
@jcsp
Collaborator

jcsp commented Aug 9, 2023

ListObjects can take quite some time (many S3 round trips, depending on the number of layers), so I'm not sure why it should be cheaper in terms of runtime cost. Additionally, we already have all layers listed in RemoteTimelineClient internals, so to me it doesn't look like it matters much where we get the list from.

I guess the main difference would be simplicity: one could write a very short function that just streams the objects from a ListObjectsV2 call into something that buffers them up into DeleteObjects calls. It's true that there's some overhead to using the listing instead of just using the list of objects we already hold in memory. Using the list of layers we have in memory is cheaper if the layer count is large, I suppose, whereas for a timeline with few layers, it would be cheaper to do everything in one DeleteObjects rather than having the separate delete, cleanup steps.

But yeah, it's kind of debatable either way, and anyway not directly touched in this PR.
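
For illustration only, the "list, buffer, delete" idea could look roughly like the sketch below. It is not code from this PR: `FakeBucket`, `delete_prefix`, `batched`, and the `tenants/t1/` key layout are all hypothetical, the bucket is an in-memory stand-in rather than a real S3 client, and the batch size of 1000 reflects the documented per-request key limit of DeleteObjects.

```python
from typing import Callable, Iterable, Iterator, List

# Assumption: S3's DeleteObjects accepts at most 1000 keys per request.
MAX_KEYS_PER_DELETE = 1000


def batched(keys: Iterable[str], size: int) -> Iterator[List[str]]:
    """Yield consecutive lists of at most `size` keys."""
    batch: List[str] = []
    for key in keys:
        batch.append(key)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def delete_prefix(
    list_keys: Callable[[str], Iterable[str]],
    delete_keys: Callable[[List[str]], None],
    prefix: str,
    batch_size: int = MAX_KEYS_PER_DELETE,
) -> int:
    """Stream keys under `prefix` into batched deletes; return the request count."""
    requests = 0
    for batch in batched(list_keys(prefix), batch_size):
        delete_keys(batch)
        requests += 1
    return requests


class FakeBucket:
    """Hypothetical in-memory stand-in for a bucket, for illustration only."""

    def __init__(self, keys: Iterable[str]) -> None:
        self.keys = set(keys)

    def list_keys(self, prefix: str) -> Iterator[str]:
        # Snapshot the matching keys first so deleting while iterating is safe here.
        yield from sorted(k for k in self.keys if k.startswith(prefix))

    def delete_keys(self, batch: List[str]) -> None:
        self.keys.difference_update(batch)
```

With 2500 keys under one prefix this issues three delete requests, whereas a timeline with a handful of layers needs only one, which is the trade-off discussed above.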

@LizardWizzard
Contributor Author

it would be cheaper to do everything in one DeleteObjects rather than having the separate delete, cleanup steps.

It's a good idea! I hadn't thought about that. But yeah, unfortunately this doesn't scale to a larger number of layers, so we'd need to maintain two code paths in that case. We can consider taking this shortcut in the future. Also, for that to work we need to be sure that DeleteObjects is as atomic as a single DeleteObject. The S3 consistency model says:

Amazon S3 provides strong read-after-write consistency for PUT and DELETE requests of objects in your Amazon S3 bucket in all AWS Regions.

And we should keep in mind that we shouldn't rely on AWS-only features of the S3 API, to make it easier to adapt to S3 API implementations from other cloud providers.
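
On the atomicity question: as far as I understand the S3 API, a DeleteObjects response reports results per key, so one call can succeed for some keys and fail for others. A caller that wants "all or keep retrying" semantics would have to re-submit the failed keys itself. A minimal sketch of that loop, assuming a hypothetical `delete_batch` wrapper that returns the keys it failed to delete (`FlakyBucket` is an in-memory fake, not a real client):

```python
from typing import Callable, Iterable, List, Sequence


def delete_until_done(
    delete_batch: Callable[[Sequence[str]], List[str]],
    keys: Sequence[str],
    max_attempts: int = 3,
) -> List[str]:
    """Re-submit failed keys until none fail or attempts run out.

    `delete_batch` is a hypothetical wrapper returning the keys it could not
    delete. The return value is the list of keys still undeleted at the end.
    """
    remaining = list(keys)
    for _ in range(max_attempts):
        if not remaining:
            break
        remaining = delete_batch(remaining)
    return remaining


class FlakyBucket:
    """Hypothetical fake bucket where selected keys fail on their first delete."""

    def __init__(self, keys: Iterable[str], fail_once: Iterable[str]) -> None:
        self.keys = set(keys)
        self.fail_once = set(fail_once)

    def delete_batch(self, batch: Sequence[str]) -> List[str]:
        failed: List[str] = []
        for key in batch:
            if key in self.fail_once:
                # Simulate a per-key error; the next attempt will succeed.
                self.fail_once.discard(key)
                failed.append(key)
            else:
                self.keys.discard(key)
        return failed
```

Deleting an already-deleted key is harmless in this fake, which is what makes blind re-submission safe here; whether every S3-compatible backend behaves the same way is an assumption that would need checking.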

pageserver/src/tenant.rs (outdated, resolved)
Member

@koivunej koivunej left a comment

All in all, I don't think it's useful to spend more time looking at this in review. We should merge it today and then spend time debugging test flakiness if any arises. The DELETEs will not be issued from the control plane soon anyway. Do you know the issue tracking that progress?

@LizardWizzard
Contributor Author

We should merge it today and then spend time debugging test flakiness if any arises.

Also, I'd take a look at startup times, because with this PR we'll do one more S3 request per tenant.

The DELETEs will not be issued from the control plane soon anyway. Do you know the issue tracking that progress?

I plan to look into it after this PR gets merged. I added this to the deletion project but haven't created an issue yet: https://github.com/orgs/neondatabase/projects/33/views/1


So I'll do one final cleanup pass to remove the FIXMEs, and once CI passes I'll merge the PR. Thanks!

LizardWizzard and others added 2 commits August 10, 2023 11:08
Successfully merging this pull request may close these issues.

Implement deletion as described in RFC for tenants
Delete pageserver data from s3 when project is deleted
4 participants