Feature Request: Support sparse-checkout to git_worker #13747

Open
sluongng opened this issue Jul 27, 2021 · 4 comments
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) stale Issues or PRs that are stale (no activity for 30 days) team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. type: feature request

Comments

@sluongng
Contributor

sluongng commented Jul 27, 2021

Description of the problem / feature request:

Recently, git.git has been adding features that make sparse-checkout possible.
This helps reduce the amount of git-fetch data to a selected subset of trees and blobs inside the repository instead of having to download all trees and blobs.

https://git-scm.com/docs/git-sparse-checkout

Feature requests: what underlying problem are you trying to solve with this feature?

In a multi-repo setup where a small repo A running Bazel depends on components in a relatively big repo B, it's undesirable that git_repository(B) in A has to download the entire checkout of B. By leveraging the git-sparse-checkout command, we can keep both the download size and the size-on-disk of B minimal.

Add a sparse_checkout_spec: List[str] attribute to the git_repository() rule that would enable git repositories to be fetched with sparse-checkout enabled.

Also add a sparse_checkout_code_mode: bool attribute, defaulting to true if sparse_checkout_spec is non-empty, that would enable sparse-checkout cone mode in git. This is a sensible mode that includes all the blobs in the parent trees of each directory listed in sparse_checkout_spec.
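
To make the proposed attributes concrete, here is a minimal Python sketch of how they might map onto git commands. The helper name `sparse_checkout_commands` is hypothetical, and it assumes the mode attribute corresponds to git's cone mode (`git sparse-checkout init --cone`); the real rule would be written in Starlark and may differ.

```python
from typing import List, Optional


def sparse_checkout_commands(
    spec: List[str], cone_mode: Optional[bool] = None
) -> List[List[str]]:
    """Map the proposed attributes to git argv lists (hypothetical helper)."""
    if not spec:
        # An empty spec means sparse-checkout stays disabled.
        return []
    if cone_mode is None:
        # Proposed default: the mode is on whenever a spec is given.
        cone_mode = True
    init = ["git", "sparse-checkout", "init"]
    if cone_mode:
        init.append("--cone")
    # "set" replaces the current list of included directories/patterns.
    return [init, ["git", "sparse-checkout", "set", *spec]]


if __name__ == "__main__":
    # Example directories are placeholders, not paths from a real repo.
    for argv in sparse_checkout_commands(["libs/foo", "third_party/bar"]):
        print(" ".join(argv))
```

Returning argv lists rather than running git directly keeps the attribute handling easy to check in isolation.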

Proposed Implementation

In https://github.com/bazelbuild/bazel/blob/master/tools/build_defs/repo/git_worker.bzl _update() we can add an optional stage enable_sparse_checkout() between add_origin() and fetch() and execute the logic there.
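
The ordering described above could look roughly like the following Python sketch. It is not the actual git_worker.bzl code: the `update` function, the `run` callback, and the `git init` and `git reset` steps around the named stages are assumptions made for illustration only.

```python
from typing import Callable, List

Runner = Callable[[List[str]], None]


def update(run: Runner, remote: str, ref: str, sparse_spec: List[str]) -> None:
    """Run the fetch stages in order; sparse setup is skipped for an empty spec."""
    run(["git", "init"])
    # add_origin()
    run(["git", "remote", "add", "origin", remote])
    # enable_sparse_checkout(): the optional stage proposed in this issue
    if sparse_spec:
        run(["git", "sparse-checkout", "init", "--cone"])
        run(["git", "sparse-checkout", "set", *sparse_spec])
    # fetch()
    run(["git", "fetch", "origin", ref])
    # Assumed final step: populate the working tree from what was fetched.
    run(["git", "reset", "--hard", "FETCH_HEAD"])


if __name__ == "__main__":
    recorded: List[List[str]] = []
    # Recording the commands instead of executing them shows the order only.
    update(recorded.append, "https://example.com/big-repo.git", "main", ["libs/foo"])
    for argv in recorded:
        print(" ".join(argv))
```

Because the sparse patterns are in place before the working tree is populated, only the selected directories should end up on disk.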

@aiuto aiuto added team-ExternalDeps External dependency handling, remote repositiories, WORKSPACE file. untriaged team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website labels Jul 29, 2021
@philwo philwo added P3 We're not considering working on this, but happy to review a PR. (No assignee) type: feature request and removed untriaged labels Aug 31, 2021
@philwo philwo removed the team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website label Nov 29, 2021
@cameron-martin
Contributor

cameron-martin commented Jan 16, 2023

I have a use case where this would be really useful. We currently have a Rust project that has git dependencies on a subdirectory of quite a large repo. If this feature were added, rules_rust could do a sparse checkout in this case, reducing the time taken to download the dependency.

@cameron-martin
Contributor

cameron-martin commented Mar 18, 2023

Maybe we want to use a partial clone here rather than a sparse checkout, since a sparse checkout still downloads the entire repository.
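
As a rough illustration of the difference, a partial clone is requested at fetch time with git's `--filter` option, which can be combined with a sparse checkout so that only blobs inside the selected directories are downloaded on checkout. The helper `fetch_command` below is hypothetical, and the sketch assumes the server allows filtered fetches.

```python
from typing import List, Optional


def fetch_command(
    ref: str,
    blob_filter: Optional[str] = "blob:none",
    depth: Optional[int] = 1,
) -> List[str]:
    """Build a git fetch argv, optionally as a partial and/or shallow fetch."""
    argv = ["git", "fetch"]
    if blob_filter:
        # blob:none defers downloading file contents until they are needed.
        argv.append(f"--filter={blob_filter}")
    if depth:
        # A shallow fetch limits how much history is transferred.
        argv.append(f"--depth={depth}")
    argv += ["origin", ref]
    return argv


if __name__ == "__main__":
    print(" ".join(fetch_command("main")))
    print(" ".join(fetch_command("main", blob_filter=None, depth=None)))
```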

@manuelnaranjo

FYI, we introduced this feature into rules_booking in bookingcom/rules_booking@a110993. I don't have the time to turn that into an official Bazel change, but maybe someone else can?


Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 1+ years. It will be closed in the next 90 days unless any other activity occurs. If you think this issue is still relevant and should stay open, please post any comment here and the issue will no longer be marked as stale.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Aug 25, 2024

5 participants