-
Notifications
You must be signed in to change notification settings - Fork 2.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Skip no-op git fetches from the registry #2451
Comments
This seems reasonable to do eventually, but I don't want to start using unstable GitHub APIs as we're not really watching updates to the API. It'd be nice though to see some benchmarks about how fast this is such as:
|
Also worth thinking about what the behavior should be when the registry isn't Github hosted. |
I am seeing the HTTP request take 440 (+/- 10) ms and a noop fetch take 1350 (+/- 150) ms but occasionally up to 4 seconds. At those numbers, HTTP request followed if necessary by fetch would save time if less than 2/3 of registry updates require a fetch which is certainly true in my usage. |
Thanks for the data @dtolnay! I'd actually be pretty curious to test this out and see what kind of perf gains we get in practice. We could perhaps even go as far as having an environment variable flag to test it out unstable-y for awhile. Note that the referenced github API support was in preview when first introduced, but this appears to no longer be the case, so we should be good to go in that sense! @sfackler I would expect this logic to be along the lines of when updating a git index which has the host of "github.com" the special logic takes effect, so it should be quite easy to continue to support arbitrary registries not on github. We could then implement fast paths for other registries in the future as well if need be. |
Cool. I could see having a "github-like" flag in custom registry config to enable the fast path for stuff hosted on e.g. GitHub for Enterprise. |
Turns out this wasn't too hard to implement, but it's based on #2857 which may take a moment to merge. I didn't notice a huge difference in github registry update times, but perhaps it'll add up over time? |
I'd be interested in trying this with the rust repo. Cargo fetches that ungodly slowly, but that may be the actual fetching itself rather than the would-be fast path. |
Speed up noop registry updates with GitHub This commit adds supports to registry index updates to use GitHub's HTTP API [1] which is purportedly [2] much faster than doing a git clone, and emprically that appears to be the case. This logic kicks in by looking at the URL of a registry's index, and if it looks exactly like `github.com/$user/$repo` then we'll attempt to use GitHub's API, otherwise we always fall back to a git update. This behavior may *slow down* registry updates which actually need to download information as an extra HTTP request is performed to figure out that we need to perform a git fetch, but hopefully that won't actually be the case much of the time! [1]: https://developer.github.com/v3/repos/commits/#get-the-sha-1-of-a-commit-reference [2]: https://developer.github.com/changes/2016-02-24-commit-reference-sha-api/ Closes #2451
I just want to follow up and say this change has been a noticeable improvement for me. |
Quoting CocoaPods/CocoaPods#4989 (comment):
Cargo could use this to determine that its local copy of rust-lang/crates.io-index is already up to date and it doesn’t need to git fetch.
In addition to limit the risk of hitting GitHub’s rate limiting on git traffic (see #2452) this can maybe make "Updating registry" faster.
It doesn’t seem rare in my Cargo using to see "Updating registry" multiple times within a few minutes, while the registry has had an average of one commit per ~half hour since its creation.
Servo may be more of a pathological case, but we have a
./mach cargo-update
command that callcargo update
in each of four "top-level" crates (that each have aCargo.lock
file), causing the registry to be updated four times within a few seconds. Updating the registry seems to often dominate the time taken by this command (andcargo update
in general).The text was updated successfully, but these errors were encountered: