Cache submodules between different git checkouts #10279
(rust-highfive has picked a reviewer for you, use r? to override)
cc @ehuss
f054fc0 to 1e09e34
I haven't had a chance to review this, but I did want to mention a few things before you spend too much time on them. I don't think we can rely on symlinks on Windows. Older versions do not support them. I don't think links are necessary, though. I would expect this to use the
2f93c42 to 2a2a94a
Ok, done :)
Reading over this, I don't think this is the best approach to solving this problem. Naively, the "fix" I would expect for this issue is what @ehuss described: checkouts of submodules would perform basically the same process as the rest of git checkouts. There's a shared db for each submodule URL, and within that db we resolve the submodule's git commit or otherwise fetch contents to try to resolve it. Afterwards we'd perform a checkout from the bare db repository into the checkout of the outer git checkout. Most of this probably wouldn't use libgit2's submodule support other than for simply iterating; otherwise we'd be managing submodules in a custom fashion.
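For illustration, the "resolve the commit in the shared db, otherwise fetch" step described above could be modelled roughly as follows. This is only a sketch and not Cargo's code: `SharedDb`, `resolve`, and the `fetch` callback are hypothetical names, and an in-memory map stands in for the bare repositories on disk.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical in-memory stand-in for a shared database of bare
/// repositories, keyed by submodule URL (not Cargo's real types).
struct SharedDb {
    /// URL -> commit ids already present in that URL's database.
    repos: HashMap<String, HashSet<String>>,
    /// Number of fetches performed, to make the caching visible.
    fetches: usize,
}

impl SharedDb {
    fn new() -> Self {
        SharedDb { repos: HashMap::new(), fetches: 0 }
    }

    /// Make sure `commit` is available in the database for `url`.
    /// `fetch` stands in for a network fetch and returns the commit ids
    /// the remote provided; it is only called when the commit is missing.
    fn resolve<F>(&mut self, url: &str, commit: &str, fetch: F) -> Result<(), String>
    where
        F: FnOnce(&str) -> Vec<String>,
    {
        let known = self.repos.entry(url.to_string()).or_default();
        if known.contains(commit) {
            // Already cached: no network access needed.
            return Ok(());
        }
        self.fetches += 1;
        known.extend(fetch(url));
        if known.contains(commit) {
            Ok(())
        } else {
            Err(format!("commit {commit} not found in {url} after fetch"))
        }
    }
}
```

In this model, a second outer checkout that pins the same submodule commit resolves from the shared db without fetching; the actual checkout from the bare db into the outer working tree is left out.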
@alexcrichton Sorry, I forgot to update the PR description - this now uses @ehuss's suggestion of bare checkouts.
While that may be the case, I had other suggestions in my previous comment I don't think should be ignored simply because bare checkouts are used somewhere now.
☔ The latest upstream changes (presumably #10296) made this pull request unmergeable. Please resolve the merge conflicts.
Sorry, I'm not sure I understand what I haven't done from that list.
Is there something I missed?
c533f07 to e4eaa3b
This base64-encodes the URLs to avoid errors like the following:

```
error: failed to get `dep1` as a dependency of package `foo v0.5.0 (D:/a/cargo/cargo/target/tmp/cit/t1035/foo)`

Caused by:
  failed to load source for dependency `dep1`

Caused by:
  Unable to update file:///D:/a/cargo/cargo/target/tmp/cit/t1035/dep1

Caused by:
  failed to update submodule `src`

Caused by:
  failed to make directory 'D:/a/cargo/cargo/target/tmp/cit/t1035/home/.cargo/git/checkouts/submodules/file:': The filename, directory name, or volume label syntax is incorrect.
  ; class=Os (2)
', tests\testsuite\git.rs:2515:10
```

It uses bare checkouts instead of symbolic links to avoid permission errors on Windows.
e4eaa3b to 9a8816c
Cargo has existing infrastructure for a global database of git repos and a global database of git checkouts. You've bypassed all that infrastructure and invented your own scheme. My comment is that you should be using what Cargo already has instead of inventing something new.
Thanks for the PR, but I'm going to be stepping down from the Cargo team so I'm going to un-assign myself from this. The Cargo team will help review this when they get a chance.
I am not quite sure how to proceed here; I'll open a new PR if I figure it out.
What does this PR try to resolve?
Rather than letting git manage each submodule checkout individually, this uses a shared bare submodule directory in `target/git/checkouts/submodules` and makes a checkout of the shared directory in the relative path. This caches based on the URL, but base64-encodes the URL to avoid errors about invalid FS paths. I'm hesitant to cache based on the relative path in the git checkout because it could use a different remote between commits of the parent.
Fixes #7987.
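As a rough illustration of the URL-keyed cache directory described above, here is a minimal, dependency-free sketch. It is not the code from this PR: `base64url` and `submodule_cache_dir` are hypothetical names, and the unpadded URL-safe alphabet is an assumption, chosen because the standard base64 alphabet contains `/`, which is itself a path separator.

```rust
use std::path::{Path, PathBuf};

/// URL-safe base64 alphabet (RFC 4648, section 5): '-' and '_' replace '+' and '/'.
const ALPHABET: &[u8; 64] =
    b"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/// Encode bytes as unpadded URL-safe base64 (hypothetical helper).
fn base64url(input: &[u8]) -> String {
    let mut out = String::with_capacity((input.len() * 4 + 2) / 3);
    for chunk in input.chunks(3) {
        // Pack up to three bytes into a 24-bit group, zero-filling the tail.
        let b0 = chunk[0] as u32;
        let b1 = *chunk.get(1).unwrap_or(&0) as u32;
        let b2 = *chunk.get(2).unwrap_or(&0) as u32;
        let n = (b0 << 16) | (b1 << 8) | b2;
        out.push(ALPHABET[(n >> 18) as usize & 63] as char);
        out.push(ALPHABET[(n >> 12) as usize & 63] as char);
        // Only emit the characters that carry real input bits.
        if chunk.len() > 1 {
            out.push(ALPHABET[(n >> 6) as usize & 63] as char);
        }
        if chunk.len() > 2 {
            out.push(ALPHABET[n as usize & 63] as char);
        }
    }
    out
}

/// Map a submodule URL to a directory under `checkouts_root/submodules`
/// (hypothetical layout). The encoded name contains no ':', '/' or '\\'.
fn submodule_cache_dir(checkouts_root: &Path, url: &str) -> PathBuf {
    checkouts_root.join("submodules").join(base64url(url.as_bytes()))
}
```

With this scheme a URL such as `file:///D:/a/cargo/dep1` becomes a single path component made only of letters, digits, `-` and `_`, so the `file:` directory error from the commit message cannot occur. The same URL always maps to the same directory, which is what makes the cache shareable between parent checkouts.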
How should we test and review this PR?
I tested like so:
and confirmed that the time to do the second update is negligible. I'm not sure how to test this automatically using the `cargo_test` framework since cargo still prints "Updating git submodule" on the second checkout (because it has to create the symlink).