Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gvfs clone may fail due to prefetch packfile "gaps" #306

Closed
derrickstolee opened this issue Sep 26, 2018 · 2 comments · Fixed by #313
Closed

gvfs clone may fail due to prefetch packfile "gaps" #306

derrickstolee opened this issue Sep 26, 2018 · 2 comments · Fixed by #313
Assignees

Comments

@derrickstolee
Copy link
Contributor

The prefetch packfiles use timestamps to signal "this is what the cache server knew about at this time". When a client prefetches, they contact a cache server with their latest timestamp and the server responds with a list of packfiles. As we call different cache servers, we can get into a state where there are "gaps": some commits and trees never exist in a prefetch pack.

In a normal enlistment, these gaps can be refilled on-demand. However, at clone time we may be directed to an existing shared object cache, see that the cache has the commit and root tree that we expect, but be missing a tree that is reachable from that root tree. As we attempt to run a git checkout, we will fail to find that object and we don't have the read-object hook available.

This is a very odd data shape that has always been possible, but it is more likely to be hit now that we have background prefetch (we are more likely to satisfy CommitAndRootTreeExists() for the latest ref).

One fix is to run git rev-list --objects <root-tree-hash> instead of just checking if the objects exist. If that command succeeds, then we can continue without downloading the initial packfile. This will take less time than the git checkout step, so may be reasonable.

Another fix is to always download that initial packfile. This is likely not the best solution because we rarely need that data a second time, and we would be slowing down clones that are usually "trivial".

@sanoursa
Copy link
Contributor

@derrickstolee I don't think prefetch is the cause here. Clone already tries to download the tip commit (plus full frees) for the branch it's going to checkout, because there's no guarantee that the prefetch packs will have the commit we want, so this is a data shape we already tried to account for. See https://github.com/Microsoft/VFSForGit/blob/master/GVFS/GVFS/CommandLine/CloneVerb.cs#L520

However, the issue is the implementation of that method makes a big assumption - if the commit object and its root tree object exist locally, then we probably also have all the rest of the trees, and skip downloading anything. See https://github.com/Microsoft/VFSForGit/blob/master/GVFS/GVFS/CommandLine/GVFSVerb.cs#L406

So we could fix this in a couple of ways - 1) always download that tip commit+trees or 2) detect the missing tree errors from our initial git checkout, and if we see that, fall back to downloading and retry.

@sanoursa
Copy link
Contributor

My point is, the issue is not with prefetch. Prefetch contains full trees, so background prefetch is not at fault here. The root cause is that an existing repo could touch just the root tree object, not download the full set of trees, and then a subsequent clone attempt will fail in this manner.

@derrickstolee derrickstolee self-assigned this Sep 28, 2018
derrickstolee added a commit that referenced this issue Oct 3, 2018
… on failed initial checkout

On Clone, we need to create the `.git/index` file during a checkout, which requires having all trees reachable from the root tree of the tip commit. To ensure we have all of the objects, we call `TryDownloadCommit()`. However, this method doesn't actually try downloading a pack of trees if we already have the commit and root tree.

Due to the shared object cache, we may have these objects for a few reasons:

1. A recent fetch included a ref update that points to that commit, and we downloaded the commit and tree as loose objects.
2. A background prefetch downloaded a prefetch pack containing that commit and tree, but some of the reachable trees are missing due to clock skew between cache servers.

This fixes the issue by forcibly downloading the packfile if the checkout fails and trying again.

Resolves #306.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants