Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compaction: remove temporary files if compaction is not complete #4749

Open
skyzh opened this issue Jul 18, 2023 · 2 comments
Open

compaction: remove temporary files if compaction is not complete #4749

skyzh opened this issue Jul 18, 2023 · 2 comments
Assignees
Labels
c/storage/pageserver Component: storage: pageserver t/bug Issue Type: Bug

Comments

@skyzh
Copy link
Member

skyzh commented Jul 18, 2023

If compaction fails (i.e., running out of space), the temporary files generated will be left on disk, and therefore they will not be removed and occupy the space. We should remove them if compaction fails.

good_first_issue Initial implementation ideas

test_runner/regress/test_duplicate_layers.py::test_actually_duplicated_l1 creates a similar situation as what this issue is after fixing but instead of exiting pageserver after first image layer has been written out, we want to return an error and have all the layers which have not made it to layermap be deleted when exiting the scope. This is not currently implemented, and the way the pageserver::tenant::storage_layer::Layer is implemented right now is too deep in the DeltaLayerWriter. If the rename would happen later, we would have no problem.

The needed structure is not however trivial, because assuming we have new compacted layers ls which has more than 1 layer. After we rename 1 element to have it's final path, that final path might still need to be deleted in case any of the renames later fail.

This will most likely require changing pageserver::tenant::storage_layer::Layer::finish_creating or at least the outer procedure of "how to get final named paths become layers while creating compacted layers." My initial idea was to instead of having DeltaLayerWriter create the individual Layer values, have it create something like UnfinishedLayer which has the ability of being renamed from temporary file to actual path, but will retain the tempfile alike "delete on drop" until finally moved out to be an actual Layer which gets added to LayerMap.

There is quite the risk of becoming a badly reviewed external contributor PR because this is so open, so not yet adding that label.

@skyzh skyzh added t/bug Issue Type: Bug c/storage/pageserver Component: storage: pageserver labels Jul 18, 2023
@koivunej koivunej self-assigned this Oct 4, 2023
@koivunej
Copy link
Member

koivunej commented Oct 4, 2023

Will be handled by #5172 at least for crashes, but it would make sense to keep them as tempfiles, so will not be covered completly #5172.

@koivunej
Copy link
Member

Still unimplemented.

koivunej added a commit that referenced this issue Nov 21, 2023
While reviewing code noticed a scary `layer_paths.pop().unwrap()` then
realized this should be further asyncified, something I forgot to do
when I switched the `compact_level0_phase1` back to async in #4938.

This keeps the double-fsync for new deltas as #4749 is still unsolved.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
c/storage/pageserver Component: storage: pageserver t/bug Issue Type: Bug
Projects
None yet
Development

No branches or pull requests

2 participants