ostree/prune: Calculate reachability under exclusive lock #2808
When we calculate the reachability set in `ostree prune`, we do this without any locking. This means that between the time we build the set and when we call `ostree_repo_prune_from_reachable`, new content might've been added. This then causes us to immediately prune that content since it's not in the now outdated set.

Fix this by calculating the set under an exclusive lock.

I think this is what happened in fedora-silverblue/issue-tracker#405. While the pruner was running, the `new-updates-sync` script[1] was importing content into the repo. The newly imported commits were immediately deleted by the many `ostree prune --commit-only` calls the pruner does, breaking the refs.

[1] https://pagure.io/fedora-infra/ansible/blob/35b35127e444/f/roles/bodhi2/backend/files/new-updates-sync#_18
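To make the race concrete, here is a minimal sketch in Python using a toy in-memory model, not libostree. Everything in it (`ToyRepo`, `try_import`, `prune_racy`, `prune_locked`, `demo`) is hypothetical and exists only for illustration. A single mutex stands in for the repo lock, and a writer that finds the lock held simply retries after the prune finishes.

```python
import threading


class ToyRepo:
    """Hypothetical stand-in for a repo: refs point at commit objects."""

    def __init__(self):
        self.lock = threading.Lock()  # stands in for the exclusive repo lock
        self.objects = set()
        self.refs = {}

    def try_import(self, ref, commit):
        """Import a commit and update its ref; refuse if the lock is held."""
        if not self.lock.acquire(blocking=False):
            return False
        try:
            self.objects.add(commit)
            self.refs[ref] = commit
            return True
        finally:
            self.lock.release()

    def reachable(self):
        """Everything reachable from the current refs."""
        return set(self.refs.values())

    def prune_from_reachable(self, reachable):
        """Delete every object that is not in the given reachable set."""
        doomed = self.objects - reachable
        self.objects -= doomed
        return doomed


def prune_racy(repo, concurrent_writer):
    reachable = repo.reachable()      # computed with no lock held
    concurrent_writer()               # an import lands in this window
    with repo.lock:
        return repo.prune_from_reachable(reachable)  # stale set


def prune_locked(repo, concurrent_writer):
    with repo.lock:
        reachable = repo.reachable()  # computed under the lock
        concurrent_writer()           # the import is refused for now
        return repo.prune_from_reachable(reachable)


def demo(prune):
    """Run one prune with an import attempted mid-way; return dangling refs."""
    repo = ToyRepo()
    repo.try_import("stable", "commit-a")
    attempts = []
    prune(repo, lambda: attempts.append(repo.try_import("testing", "commit-b")))
    if not attempts[-1]:
        # The writer was locked out during the prune, so it retries afterwards.
        repo.try_import("testing", "commit-b")
    return sorted(r for r, c in repo.refs.items() if c not in repo.objects)


print(demo(prune_racy))
print(demo(prune_locked))
```

In the racy variant the `testing` ref ends up pointing at a commit that was just deleted, which mirrors the broken refs described above. In the locked variant the import cannot interleave with the scan, so no ref is left dangling.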
Ah, yes. Classic GC issue. The obvious optimization here would be to skip pruning any commits created (filesystem timestamp I guess) after the scan started.
But for now, SGTM.
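For reference, the timestamp idea could look roughly like the sketch below. It is a toy model in Python, not ostree code: `prune_skipping_new` and the integer timestamps are made up for illustration, and it assumes each object carries a trustworthy creation time.

```python
def prune_skipping_new(objects, reachable, scan_start):
    """Prune unreachable objects, sparing anything created after the scan began.

    objects: dict mapping object name -> creation timestamp (hypothetical).
    Returns (kept, pruned) as two sets of names.
    """
    pruned = {
        name
        for name, created in objects.items()
        if name not in reachable and created < scan_start
    }
    kept = set(objects) - pruned
    return kept, pruned


objects = {
    "old-unreferenced": 10,
    "old-referenced": 20,
    "imported-during-scan": 105,  # landed after the scan started at t=100
}
kept, pruned = prune_skipping_new(objects, {"old-referenced"}, scan_start=100)
print(sorted(kept), sorted(pruned))
```

The object imported during the scan survives even though no ref in the scanned set points at it, while the genuinely stale object is still pruned.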
Good idea. Let me try this and see how hard it'd be. It's definitely unfortunate to be locking for the whole operation since it could take a long time to calculate reachability. Hmm, though I guess the timestamp trick won't work if you're somehow using rsync for getting content into the repo and preserving timestamps (I think this has come up before IIRC). I wonder if what we actually want here is a new |
OK, updated this to do that now! It seems safer than the timestamp trick but does require more invasive public API changes.
Well this is awkward...
OK I ended up going back to just using an exclusive lock for this. The issue with the previous approach (listing objects and passing it in) is that it doesn't really solve the race issue because refs are usually the last thing that get updated when content is imported into the repo, so there's still a time where objects will appear unreferenced. The timestamp trick could be made to work, but (1) it would only work on archive repos where the object files don't have canonicalized timestamps (in the non-archive case, we could scope it to just handling the Also, benchmarking how much time the reachability calculation step takes revealed that it's far less expensive than the actual |
/retest