Skip to content

Commit

Permalink
Auto merge of #11448 - Byron:integrate-gitoxide, r=weihanglo
Browse files Browse the repository at this point in the history
gitoxide integration: fetch

This PR is the first step towards resolving #1171.

In order to get there, we integrate `gitoxide` into `cargo` in such a way that one can control its usage in nightly via `-Zgitoxide` or `Zgitoxide=<feature>[,featureN]`.

Planned features are:

* **fetch** - all fetches are done with `gitxide` (this PR)
* **shallow_index** - the crates index will be a shallow clone (_planned_)
* **shallow_deps** - git dependencies will be a shallow clone (_planned_)
* **checkout** - plain checkouts with `gitoxide` (_planned_)

The above list is a prediction and might change as we understand the requirements better.

### Testing and Transitioning

By default, everything stays as is. However, relevant tests can be re-runwith `gitoxide` using

```
RUSTFLAGS='--cfg always_test_gitoxide' cargo test git
```

There are about 200 tests with 'git' in their name and I plan to enable them one by one. That way the costs for CI stay managable (my first measurement with one test was 2min 30s), while allowing to take one step at a time.

Custom tests shall be added once we realize that more coverage is needed.

That way we should be able to maintain running `git2` and `gitoxide` side by side until we are willing to switch over to `gitoxide` entirely on stable cargo. Then turning on `git2` might be a feature toggle for a while until we finally remove it from the codebase.

_Please see the above paragraph as invitation for discussion, it's merely a basis to explore from and improve upon._

### Tasks

* [x] add feature toggle
* [x] setup test system with one currently successful test
* [x] implement fetch with `gitoxide` (MVP)
* [x] fetch progress
* [x] detect spurious errors
* [x] enable as many git tests as possible (and ignore what's not possible)
* [x] fix all git-related test failures (except for 1: built-in upload-pack, skipped for now)
* [x] validate that all HTTP handle options that come from `cargo` specific values are passed to `gitoxide`
* [x] a test to validate `git2` code can handle crates-index clones created with `gitoxide` and vice-versa
* [x] remove patches that enabled `gitoxide` enabled testing - it's not used anymore
* [x] ~~remove all TODOs and use crates-index version of `git-repository`~~ The remaining 2 TODO's are more like questions for the reviewer.
* [x] run all tests with gitoxide on the fastest platform as another parallel task
* [x] switch to released version
* [x] [Tasks from first review round](#11448 (comment))
* [x] create a new `gitoxide` release and refer to the latest version from crates.io (instead of git-dependency)
* [x] [address 2nd review round comments](#11448 (comment))

### Postponed Tasks

I suggest to go breadth-first and implement the most valuable features first, and then aim for a broad replacement of `git2`. What's left is details and improved compatibility with the `git2` implementation that will be required once `gitoxide` should become the default implementation on stable to complete the transition.

* **built-in support for serving the `file` protocol** (i.e. without using `git`). Simple cases like `clone` can probably be supported quickly, `fetch` needs more work though due to negotiation.
* SSH name fallbacks via a native (probably ~~libssh~~ (avoid LGPL) `libssh2` based) transport. Look at [this issue](#2399) for some history.
* additional tasks from [this tracking issue](GitoxideLabs/gitoxide#450 (comment))

### Proposed Workflow

I am now using [stacked git](https://stacked-git.github.io) to keep commits meaningful during development. This will also mean that before reviews I will force-push a lot as changes will be bucketed into their respective commits.

Once review officially begins I will stop force-pushing and create small commits to address review comments. That way it should be easier to understand how things change over time.

Those review-comments can certainly be squashed into one commit before merging.

_Please let me know if this is feasible or if there are other ways of working you prefer._

### Development notes

* unrelated: [this line](https://github.com/rust-lang/cargo/blob/9827412fee4f5a88ac85e013edd954b2b63f399b/src/cargo/ops/registry.rs#L620) refers to an issue that has since been resolved in `curl`.
* Additional tasks related to a correct fetch implementation are collected in this [tracking issue](GitoxideLabs/gitoxide#450). **These affect how well the HTTP transport can be configured, needs work**
* _authentication_ [is quite complex](https://github.com/rust-lang/cargo/blob/37cad5bd7f7dcd2f6d3e45312a99a9d3eec1e2a0/src/cargo/sources/git/utils.rs#L490) and centred around making SSH connections work. This feature is currently the weakest in `gitoxide` as it simply uses `ssh` (the program) and calls it a day.  No authentication flows are supported there yet and the goal would be to match `git` there at least (which it might already do by just calling `ssh`). Needs investigation. Once en-par with `git` I think `cargo` can restart the whole fetch operation to try different user names like before.
   - the built-in `ssh`-program based transport can now understand permission-denied errors, but the capability isn't used after all since a builtin ssh transport is required.
* It would be possible to implement `git::Progress` and just ignore most of the calls, but that's known to be too slow as the implementation assumes a `Progress::inc()` call is as fast as an atomic increment and makes no attempt to reduce its calls to it.
* learning about [a way to get custom traits in `thiserror`](dtolnay/thiserror#212) could make spurious error checks nicer and less error prone during maintenance. It's not a problem though.
* I am using `RUSTFLAGS=--cfg` to influence the entire build and unit-tests as environment variables didn't get through to the binary built and run for tests.

### Questions

* The way `gitoxide` is configured the user has the opportunity to override these values using more specific git options, for example using url specific http settings. This looks like a feature to me, but if it's not `gitoxide` needs to provide a way to disable applying these overrides. Please let me know what's desired here - my preference is to allow overrides.
* `gitoxide` currently opens repositories similar to how `git` does which respects git specific environment variables. This might be a deviation from how it was before and can be turned off. My preference is to see it as a feature.

### Prerequisite PRs

* #11602
  • Loading branch information
bors committed Mar 2, 2023
2 parents c61b0f0 + cfffda9 commit 7cba527
Show file tree
Hide file tree
Showing 17 changed files with 856 additions and 108 deletions.
27 changes: 24 additions & 3 deletions .github/workflows/main.yml
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@ jobs:
os: windows-latest
rust: stable-msvc
other: i686-pc-windows-msvc
- name: Windows x86_64 gnu nightly
- name: Windows x86_64 gnu nightly # runs out of space while trying to link the test suite
os: windows-latest
rust: nightly-gnu
other: i686-pc-windows-gnu
Expand Down Expand Up @@ -102,7 +102,16 @@ jobs:
if: "!contains(matrix.rust, 'stable')"

- run: cargo test
# The testsuite generates a huge amount of data, and fetch-smoke-test was
- name: Clear intermediate test output
run: |
df -h
rm -rf target/tmp
df -h
- name: gitoxide tests (all git-related tests)
run: cargo test git
env:
__CARGO_USE_GITOXIDE_INSTEAD_OF_GIT2: 1
# The testsuite generates a huge amount of data, and fetch-smoke-test was
# running out of disk space.
- name: Clear test output
run: |
Expand Down Expand Up @@ -156,6 +165,18 @@ jobs:
- run: rustup update stable && rustup default stable
- run: cargo test --manifest-path crates/resolver-tests/Cargo.toml

test_gitoxide:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- run: rustup update --no-self-update stable && rustup default stable
- run: rustup target add i686-unknown-linux-gnu
- run: sudo apt update -y && sudo apt install gcc-multilib libsecret-1-0 libsecret-1-dev -y
- run: rustup component add rustfmt || echo "rustfmt not available"
- run: cargo test
env:
__CARGO_USE_GITOXIDE_INSTEAD_OF_GIT2: 1

build_std:
runs-on: ubuntu-latest
env:
Expand Down Expand Up @@ -196,7 +217,7 @@ jobs:
permissions:
contents: none
name: bors build finished
needs: [docs, rustfmt, test, resolver, build_std]
needs: [docs, rustfmt, test, resolver, build_std, test_gitoxide]
runs-on: ubuntu-latest
if: "success() && github.event_name == 'push' && github.ref == 'refs/heads/auto-cargo'"
steps:
Expand Down
2 changes: 2 additions & 0 deletions Cargo.toml
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,8 @@ filetime = "0.2.9"
flate2 = { version = "1.0.3", default-features = false, features = ["zlib"] }
git2 = "0.16.0"
git2-curl = "0.17.0"
gix = { version = "0.38.0", default-features = false, features = ["blocking-http-transport-curl", "progress-tree"] }
gix-features-for-configuration-only = { version = "0.27.0", package = "gix-features", features = [ "parallel" ] }
glob = "0.3.0"
hex = "0.4"
hmac = "0.12.1"
Expand Down
7 changes: 7 additions & 0 deletions crates/cargo-test-support/src/git.rs
Original file line number Diff line number Diff line change
Expand Up @@ -247,3 +247,10 @@ pub fn tag(repo: &git2::Repository, name: &str) {
false
));
}

/// Returns true if gitoxide is globally activated.
///
/// That way, tests that normally use `git2` can transparently use `gitoxide`.
pub fn cargo_uses_gitoxide() -> bool {
std::env::var_os("__CARGO_USE_GITOXIDE_INSTEAD_OF_GIT2").map_or(false, |value| value == "1")
}
84 changes: 84 additions & 0 deletions src/cargo/core/features.rs
Original file line number Diff line number Diff line change
Expand Up @@ -716,6 +716,8 @@ unstable_cli_options!(
doctest_xcompile: bool = ("Compile and run doctests for non-host target using runner config"),
dual_proc_macros: bool = ("Build proc-macros for both the host and the target"),
features: Option<Vec<String>> = (HIDDEN),
gitoxide: Option<GitoxideFeatures> = ("Use gitoxide for the given git interactions, or all of them if no argument is given"),
jobserver_per_rustc: bool = (HIDDEN),
minimal_versions: bool = ("Resolve minimal dependency versions instead of maximum"),
direct_minimal_versions: bool = ("Resolve minimal dependency versions instead of maximum (direct dependencies only)"),
mtime_on_use: bool = ("Configure Cargo to update the mtime of used files"),
Expand Down Expand Up @@ -827,6 +829,74 @@ where
parse_check_cfg(crates.into_iter()).map_err(D::Error::custom)
}

#[derive(Debug, Copy, Clone, Default, Deserialize)]
pub struct GitoxideFeatures {
/// All fetches are done with `gitoxide`, which includes git dependencies as well as the crates index.
pub fetch: bool,
/// When cloning the index, perform a shallow clone. Maintain shallowness upon subsequent fetches.
pub shallow_index: bool,
/// When cloning git dependencies, perform a shallow clone and maintain shallowness on subsequent fetches.
pub shallow_deps: bool,
/// Checkout git dependencies using `gitoxide` (submodules are still handled by git2 ATM, and filters
/// like linefeed conversions are unsupported).
pub checkout: bool,
/// A feature flag which doesn't have any meaning except for preventing
/// `__CARGO_USE_GITOXIDE_INSTEAD_OF_GIT2=1` builds to enable all safe `gitoxide` features.
/// That way, `gitoxide` isn't actually used even though it's enabled.
pub internal_use_git2: bool,
}

impl GitoxideFeatures {
fn all() -> Self {
GitoxideFeatures {
fetch: true,
shallow_index: true,
checkout: true,
shallow_deps: true,
internal_use_git2: false,
}
}

/// Features we deem safe for everyday use - typically true when all tests pass with them
/// AND they are backwards compatible.
fn safe() -> Self {
GitoxideFeatures {
fetch: true,
shallow_index: false,
checkout: true,
shallow_deps: false,
internal_use_git2: false,
}
}
}

fn parse_gitoxide(
it: impl Iterator<Item = impl AsRef<str>>,
) -> CargoResult<Option<GitoxideFeatures>> {
let mut out = GitoxideFeatures::default();
let GitoxideFeatures {
fetch,
shallow_index,
checkout,
shallow_deps,
internal_use_git2,
} = &mut out;

for e in it {
match e.as_ref() {
"fetch" => *fetch = true,
"shallow-index" => *shallow_index = true,
"shallow-deps" => *shallow_deps = true,
"checkout" => *checkout = true,
"internal-use-git2" => *internal_use_git2 = true,
_ => {
bail!("unstable 'gitoxide' only takes `fetch`, 'shallow-index', 'shallow-deps' and 'checkout' as valid inputs")
}
}
}
Ok(Some(out))
}

fn parse_check_cfg(
it: impl Iterator<Item = impl AsRef<str>>,
) -> CargoResult<Option<(bool, bool, bool, bool)>> {
Expand Down Expand Up @@ -879,6 +949,13 @@ impl CliUnstable {
for flag in flags {
self.add(flag, &mut warnings)?;
}

if self.gitoxide.is_none()
&& std::env::var_os("__CARGO_USE_GITOXIDE_INSTEAD_OF_GIT2")
.map_or(false, |value| value == "1")
{
self.gitoxide = GitoxideFeatures::safe().into();
}
Ok(warnings)
}

Expand Down Expand Up @@ -967,6 +1044,13 @@ impl CliUnstable {
"doctest-xcompile" => self.doctest_xcompile = parse_empty(k, v)?,
"doctest-in-workspace" => self.doctest_in_workspace = parse_empty(k, v)?,
"panic-abort-tests" => self.panic_abort_tests = parse_empty(k, v)?,
"jobserver-per-rustc" => self.jobserver_per_rustc = parse_empty(k, v)?,
"gitoxide" => {
self.gitoxide = v.map_or_else(
|| Ok(Some(GitoxideFeatures::all())),
|v| parse_gitoxide(v.split(',')),
)?
}
"host-config" => self.host_config = parse_empty(k, v)?,
"target-applies-to-host" => self.target_applies_to_host = parse_empty(k, v)?,
"publish-timeout" => self.publish_timeout = parse_empty(k, v)?,
Expand Down
2 changes: 1 addition & 1 deletion src/cargo/ops/mod.rs
Original file line number Diff line number Diff line change
Expand Up @@ -52,7 +52,7 @@ mod cargo_uninstall;
mod common_for_install_and_uninstall;
mod fix;
mod lockfile;
mod registry;
pub(crate) mod registry;
mod resolve;
pub mod tree;
mod vendor;
Expand Down
11 changes: 5 additions & 6 deletions src/cargo/ops/registry.rs
Original file line number Diff line number Diff line change
Expand Up @@ -619,9 +619,6 @@ pub fn configure_http_handle(config: &Config, handle: &mut Easy) -> CargoResult<
handle.useragent(&format!("cargo {}", version()))?;
}

// Empty string accept encoding expands to the encodings supported by the current libcurl.
handle.accept_encoding("")?;

fn to_ssl_version(s: &str) -> CargoResult<SslVersion> {
let version = match s {
"default" => SslVersion::Default,
Expand All @@ -631,13 +628,15 @@ pub fn configure_http_handle(config: &Config, handle: &mut Easy) -> CargoResult<
"tlsv1.2" => SslVersion::Tlsv12,
"tlsv1.3" => SslVersion::Tlsv13,
_ => bail!(
"Invalid ssl version `{}`,\
choose from 'default', 'tlsv1', 'tlsv1.0', 'tlsv1.1', 'tlsv1.2', 'tlsv1.3'.",
s
"Invalid ssl version `{s}`,\
choose from 'default', 'tlsv1', 'tlsv1.0', 'tlsv1.1', 'tlsv1.2', 'tlsv1.3'."
),
};
Ok(version)
}

// Empty string accept encoding expands to the encodings supported by the current libcurl.
handle.accept_encoding("")?;
if let Some(ssl_version) = &http.ssl_version {
match ssl_version {
SslVersionConfig::Single(s) => {
Expand Down
5 changes: 5 additions & 0 deletions src/cargo/sources/git/mod.rs
Original file line number Diff line number Diff line change
@@ -1,5 +1,10 @@
pub use self::source::GitSource;
pub use self::utils::{fetch, GitCheckout, GitDatabase, GitRemote};
mod known_hosts;
mod oxide;
mod source;
mod utils;

pub mod fetch {
pub type Error = gix::env::collate::fetch::Error<gix::refspec::parse::Error>;
}
Loading

0 comments on commit 7cba527

Please sign in to comment.