-
Notifications
You must be signed in to change notification settings - Fork 290
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix marking static delta commits as partial #2549
Conversation
This patch makes it so that we mark the .commit file from a static delta as partial before writing the commit to the staging directory. This exactly mirrors what we do in meta_fetch_on_complete() when writing the commit on that codepath, which should lend some credibility to the correctness of this patch. I have checked that this fixes an issue Flatpak users have been encountering (flatpak/flatpak#3479) which results in error messages like "error: Failed to install org.freedesktop.Sdk.Extension.texlive: Failed to read commit c7958d966cfa8b80a42877d1d6124831d7807f93c89461a2a586956aa28d438a: No such metadata object 8bdaa943b957f3cf14d19301c59c7eec076e57389e0fbb3ef5d30082e47a178f.dirtree" Here's the sequence of events that lead to the error: 1. An install operation is started that fetches static deltas. 2. The fetch is interrupted for some reason such as network connectivity dropping. 3. The .commit and .commitmeta files for the commit being pulled are left in the staging dir, e.g. "~/.local/share/flatpak/repo/tmp/staging-dfe862b2-13fc-49a2-ac92-5a59cc0d8e18-RURckd" 4. There is no `.commitpartial` file for the commit in "~/.local/share/flatpak/repo/state/" 5. The next time the user attempts the install, libostree reuses the existing staging dir, pulls the commit and commitmeta objects into the repo from the staging dir on the assumption that it's a complete commit. 6. Flatpak then tries to deploy the commit but fails in ostree_repo_read_commit() in flatpak_dir_deploy(), leading to the error message "Failed to read commit ..." 7. This happens again any subsequent time the user attempts the install, until the incomplete commit is removed with "flatpak repair --user". I will try to also add a workaround in Flatpak so this is fixed even when Flatpak links against affected versions of libostree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, that makes sense. I'm surprised this hasn't come up more before. It would be good to test that the partial is actually written, but I can't think of a good way to do it. Can you add something to a test that makes sure there's no partial after the pull completes?
All the details of the bug are in: ostreedev/ostree#2549 #3479 This patch works around it by marking the commit we're about to pull partial, so that if the .commit object exists in a staging directory from a previous failed pull, it will not be erroneously considered a complete commit, even by affected versions of libostree that don't have the above patch. If for some reason the commit in the staging dir is complete, libostree should harmlessly verify that and pull it in. Usually the commit we are pulling does not already exist in the local repo, but add a check anyway so we don't risk marking a complete commit as partial, and so this works on the code path from "flatpak install --reinstall ..." Fixes #3479
Strictly speaking that would be testing existing code not this patch. I'm not sure if I'll have time soon to look at writing a unit test for this, but I think it would make sense to get it merged and get a release out since it's pretty high impact. I don't think these two lines of code necessarily need a unit test since they are pretty obviously correct, but it would be good to test the robustness of interrupted pulls in general, if we can think of a way to do that. |
This bug looks like the cause of a repo corruption I've run into more than once. The symptoms seem exactly the same (thought it seems I was also experiencing the bug fixed in #2544). I never checked if a .commitpartial file exists, nor did I touch anything in repo/tmp. I fixed my repo by manually finding the dead commit(s) and removing them Thanks for the fix! |
All the details of the bug are in: ostreedev/ostree#2549 #3479 This patch works around it by marking the commit we're about to pull partial, so that if the .commit object exists in a staging directory from a previous failed pull, it will not be erroneously considered a complete commit, even by affected versions of libostree that don't have the above patch. If for some reason the commit in the staging dir is complete, libostree should harmlessly verify that and pull it in. Usually the commit we are pulling does not already exist in the local repo, but add a check anyway so we don't risk marking a complete commit as partial, and so this works on the code path from "flatpak install --reinstall ..." Fixes #3479
All the details of the bug are in: ostreedev/ostree#2549 #3479 This patch works around it by marking the commit we're about to pull partial, so that if the .commit object exists in a staging directory from a previous failed pull, it will not be erroneously considered a complete commit, even by affected versions of libostree that don't have the above patch. If for some reason the commit in the staging dir is complete, libostree should harmlessly verify that and pull it in. Usually the commit we are pulling does not already exist in the local repo, but add a check anyway so we don't risk marking a complete commit as partial, and so this works on the code path from "flatpak install --reinstall ..." Fixes #3479
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
/override continuous-integration/jenkins/pr-merge |
@cgwalters: Overrode contexts on behalf of cgwalters: continuous-integration/jenkins/pr-merge In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
Yeah, it's pretty bad. I'm definitely going to backport it to our Endless stable branch and maybe even our old stable branch. I might submit a PR for Debian too. |
All the details of the bug are in: ostreedev/ostree#2549 #3479 This patch works around it by marking the commit we're about to pull partial, so that if the .commit object exists in a staging directory from a previous failed pull, it will not be erroneously considered a complete commit, even by affected versions of libostree that don't have the above patch. If for some reason the commit in the staging dir is complete, libostree should harmlessly verify that and pull it in. Usually the commit we are pulling does not already exist in the local repo, but add a check anyway so we don't risk marking a complete commit as partial, and so this works on the code path from "flatpak install --reinstall ..." Fixes #3479 (cherry picked from commit 11158c2)
All the details of the bug are in: ostreedev/ostree#2549 #3479 This patch works around it by marking the commit we're about to pull partial, so that if the .commit object exists in a staging directory from a previous failed pull, it will not be erroneously considered a complete commit, even by affected versions of libostree that don't have the above patch. If for some reason the commit in the staging dir is complete, libostree should harmlessly verify that and pull it in. Usually the commit we are pulling does not already exist in the local repo, but add a check anyway so we don't risk marking a complete commit as partial, and so this works on the code path from "flatpak install --reinstall ..." Fixes #3479 (cherry picked from commit 11158c2)
This patch makes it so that we mark the .commit file from a static delta
as partial before writing the commit to the staging directory. This
exactly mirrors what we do in meta_fetch_on_complete() when writing the
commit on that codepath, which should lend some credibility to the
correctness of this patch.
I have checked that this fixes an issue Flatpak users have been
encountering (flatpak/flatpak#3479) which
results in error messages like "error: Failed to install
org.freedesktop.Sdk.Extension.texlive: Failed to read commit
c7958d966cfa8b80a42877d1d6124831d7807f93c89461a2a586956aa28d438a: No
such metadata object
8bdaa943b957f3cf14d19301c59c7eec076e57389e0fbb3ef5d30082e47a178f.dirtree"
Here's the sequence of events that lead to the error:
dropping.
left in the staging dir, e.g.
"~/.local/share/flatpak/repo/tmp/staging-dfe862b2-13fc-49a2-ac92-5a59cc0d8e18-RURckd"
.commitpartial
file for the commit in"~/.local/share/flatpak/repo/state/"
existing staging dir, pulls the commit and commitmeta objects into
the repo from the staging dir on the assumption that it's a complete
commit.
ostree_repo_read_commit() in flatpak_dir_deploy(), leading to the
error message "Failed to read commit ..."
until the incomplete commit is removed with "flatpak repair --user".
I will try to also add a workaround in Flatpak so this is fixed even
when Flatpak links against affected versions of libostree.