Cargo.lock and dependency resolution #12

Closed
ehuss opened this issue Jul 20, 2019 · 8 comments
Labels
enhancement New feature or request

Comments

@ehuss
Contributor

ehuss commented Jul 20, 2019

This issue is for working through the implementation issues with locked dependencies.

It is almost certain that the standard library will need to be built with the exact same dependencies used in the release. That is, dependencies like libc will not be allowed to "float" to the most recent release. All dependencies will also be built separately from the user's project, so if the user has a dependency on libc, it will be built separately from the libc for the standard library (which may be the same or different version). Metadata tricks are used to link these properly.

The rust-src rustup component includes the standard library sources, but does not include dependencies. It includes the Cargo.lock file, from which we can infer the dependency versions.
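As a rough illustration of inferring dependency versions from the lock file, here is a minimal sketch that pulls `name`/`version` pairs out of `[[package]]` tables and renders them as exact (`=x.y.z`) requirements. The names `LockedPackage`, `locked_packages`, and `exact_requirement` are hypothetical, and the line-based scan is an assumption made for brevity; it is not a TOML parser and not how Cargo reads lock files.

```rust
/// One `[[package]]` entry from a lock file (hypothetical type for this sketch).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct LockedPackage {
    pub name: String,
    pub version: String,
}

/// Extracts name/version pairs from `[[package]]` tables.
/// Assumption: a simple line-based scan is enough for illustration.
pub fn locked_packages(lock: &str) -> Vec<LockedPackage> {
    let mut out = Vec::new();
    let mut in_package = false;
    let mut name: Option<String> = None;
    let mut version: Option<String> = None;

    // A trailing sentinel header flushes the final entry.
    for line in lock.lines().chain(std::iter::once("[[end]]")) {
        let line = line.trim();
        if line.starts_with('[') {
            // A new table header closes the previous `[[package]]` entry.
            if in_package {
                if let (Some(n), Some(v)) = (name.take(), version.take()) {
                    out.push(LockedPackage { name: n, version: v });
                }
            }
            name = None;
            version = None;
            in_package = line == "[[package]]";
            continue;
        }
        if !in_package {
            continue;
        }
        if let Some((key, value)) = line.split_once('=') {
            let value = value.trim().trim_matches('"').to_string();
            match key.trim() {
                "name" => name = Some(value),
                "version" => version = Some(value),
                _ => {} // other keys (source, dependencies, ...) are ignored here
            }
        }
    }
    out
}

/// Renders an exact requirement, so the dependency cannot "float".
pub fn exact_requirement(pkg: &LockedPackage) -> String {
    format!("={}", pkg.version)
}
```

Calling `locked_packages` on the contents of the shipped `Cargo.lock` and mapping each entry through `exact_requirement` would give the pinned set; where the missing sources come from is a separate question.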

I'm not certain at this time what problems this might present.

@ehuss ehuss added the implementation Implementation exploration and tracking issues label Jul 20, 2019
@alexcrichton
Member

One other reason I think we need to lock dependencies is determinism. Or rather, I think we should provide a way to get deterministic builds of the standard library (no floating deps over time). Unless we store more information in project lock files (which I'm not sure is feasible), using the Rust lock file would solve this issue.

@SimonSapin

> All dependencies will also be built separately from the user's project, so if the user has a dependency on libc, it will be built separately from the libc for the standard library

This could be represented in Cargo’s resolution as a separate "source", for example std+registry+https://github.com/rust-lang/crates.io-index instead of registry+https://github.com/rust-lang/crates.io-index.
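To make the idea concrete, here is a minimal sketch in which a package's identity includes its source, so the same `libc` version under a `std+` source is a different package from the user's. `PackageKey`, `std_source`, and `distinct_packages` are hypothetical names for illustration only; the `std+` prefix is the suggestion above, not an existing Cargo source kind.

```rust
use std::collections::HashSet;

/// Simplified stand-in for a package identity (name, version, source).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct PackageKey {
    pub name: String,
    pub version: String,
    pub source: String,
}

pub const CRATES_IO: &str = "registry+https://github.com/rust-lang/crates.io-index";

/// Hypothetical: marks a source as belonging to the standard library's graph.
pub fn std_source(source: &str) -> String {
    format!("std+{}", source)
}

/// Convenience constructor for the sketch.
pub fn key(name: &str, version: &str, source: &str) -> PackageKey {
    PackageKey {
        name: name.to_string(),
        version: version.to_string(),
        source: source.to_string(),
    }
}

/// Counts distinct packages; entries differing only by source stay separate.
pub fn distinct_packages(keys: &[PackageKey]) -> usize {
    keys.iter().collect::<HashSet<_>>().len()
}
```

With this shape, `key("libc", "0.2.60", CRATES_IO)` and `key("libc", "0.2.60", &std_source(CRATES_IO))` never unify, because equality and hashing both cover the source field.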

@Ericson2314

Ericson2314 commented Jul 23, 2019

Hmm, there are two issues at play: making the dependencies not "float" and making them not conflict with the user's dependencies. For the latter, we can and should use private dependencies. They are *exactly* what we need, and will also help us catch the standard library leaking implementation details. For the former, one can just pin exact versions from the stdlib Cargo.toml? Remember that with different feature/platform combinations, the exact dependencies of std and friends can change.

@Eh2406

Eh2406 commented Jul 23, 2019

Um... That is not what private dependencies do according to the current RFC. Private dependencies are just as locked to the rest of the dependency tree as the current "unspecified" dependencies, i.e. only one global use of each `links` and no two semver-compatible versions.
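For readers unfamiliar with the "no two semver-compatible versions" rule, the sketch below shows the compatibility test under Cargo's default (caret) interpretation: two versions are compatible when they share the same leftmost non-zero component. The function names are hypothetical, and pre-release and build-metadata suffixes are deliberately not handled.

```rust
/// Parses a plain `major.minor.patch` version. Returns `None` for anything
/// else (pre-release or build metadata is out of scope for this sketch).
pub fn parse_version(v: &str) -> Option<(u64, u64, u64)> {
    let mut parts = v.split('.');
    let major: u64 = parts.next()?.parse().ok()?;
    let minor: u64 = parts.next()?.parse().ok()?;
    let patch: u64 = parts.next()?.parse().ok()?;
    if parts.next().is_some() {
        return None;
    }
    Some((major, minor, patch))
}

/// Maps a version to its compatibility bucket: the leftmost non-zero
/// component decides which versions may replace each other.
pub fn compat_bucket(v: (u64, u64, u64)) -> (u64, u64, u64) {
    match v {
        (0, 0, patch) => (0, 0, patch),
        (0, minor, _) => (0, minor, 0),
        (major, _, _) => (major, 0, 0),
    }
}

/// `Some(true)` if both versions fall in the same bucket, so only one of
/// them may appear in a single resolved graph.
pub fn semver_compatible(a: &str, b: &str) -> Option<bool> {
    Some(compat_bucket(parse_version(a)?) == compat_bucket(parse_version(b)?))
}
```

Under this rule, a std that pins one `0.2.x` release of a crate and a user crate that pins a different `0.2.x` release could not coexist in one resolver.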

@Ericson2314

> no two semver compatible versions.

Well that is silly. Private dependencies should never unify (i.e. even if we use the same libc it should show up as if we used two different ones so we cannot accidentally mix them together). Therefore maybe Cargo can prefer to use fewer versions, but if std depends on an exact version and something else depends on another exact version, the build plan should still go through.

@Eh2406

Eh2406 commented Jul 24, 2019

That is an interesting thought, probably worth discussing a new RFC for that, but for now that is not what "private dependencies" does. So we will need to do something special for std.

@Ericson2314

Well for the MVP we don't. I personally think this is a minor tweak to that RFC so I'll comment in that issue, and see what people say.

bors added a commit to rust-lang/cargo that referenced this issue Sep 3, 2019
Basic standard library support.

This is not intended to be useful to anyone. If people want to try it, that's great, but do not rely on this. This is only for experimenting and setting up for future work.

This adds a flag `-Zbuild-std` to build the standard library with a project. The flag can also take optional comma-separated crate names, like `-Zbuild-std=core`. Default is `std,core,panic_unwind,compiler_builtins`.
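A minimal sketch of how the flag's optional value could map to a crate list, using the default stated above. This is not Cargo's actual parsing code; `build_std_crates` is a hypothetical name, and treating an empty value like the bare flag is an assumption.

```rust
/// Default crate list, as stated in the description of `-Zbuild-std`.
pub const DEFAULT_CRATES: [&str; 4] = ["std", "core", "panic_unwind", "compiler_builtins"];

/// `None` models the bare `-Zbuild-std`; `Some("core")` models
/// `-Zbuild-std=core`. An empty value falls back to the default (assumption).
pub fn build_std_crates(value: Option<&str>) -> Vec<String> {
    match value {
        Some(list) if !list.trim().is_empty() => list
            .split(',')
            .map(|name| name.trim())
            .filter(|name| !name.is_empty())
            .map(String::from)
            .collect(),
        _ => DEFAULT_CRATES.iter().map(|name| name.to_string()).collect(),
    }
}
```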

Closes rust-lang/wg-cargo-std-aware#10.

Note: I can probably break some of the refactoring into smaller PRs if necessary.

## Overview
The general concept here is to use two resolvers, and to combine everything in the Unit graph. There are a number of changes to support this:

- A synthetic workspace for the standard library is created to set up the patches and members correctly.
- Decouple `unit_dependencies` from `Context` to make it easier to manage.
- Add `features` to `Unit` to keep it unique and to remove the need to query a resolver.
- Add a `UnitDep` struct which encodes the edges between `Unit`s. This removes the need to query a resolver for `extern_crate_name` and `public`.
- Remove `Resolver` from `BuildContext` to avoid any confusion and to keep the complexity focused in `unit_dependencies`.
- Remove `Links` from `Context` since it used the resolver. Adjusted so that instead of checking links at runtime, they are all checked at once in the beginning. Note that it does not check links for the standard lib, but it should be safe? I think `compiler-rt` is the only `links`?
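The following is a heavily simplified sketch of the shape these changes describe: units carry their features, edges carry the extern crate name and `public` flag, and the std units produced by the second resolver are attached to the user's units in one graph. The types and `attach_std_roots` are hypothetical stand-ins for illustration, not Cargo's real `Unit` or `UnitDep` definitions, and the `public` value used here is a placeholder.

```rust
use std::collections::HashMap;

/// Simplified unit: including `features` keeps two builds of the same
/// package with different features distinct.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Unit {
    pub pkg: String,
    pub features: Vec<String>,
    pub is_std: bool,
}

/// Simplified edge: stores what would otherwise require a resolver query.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct UnitDep {
    pub unit: Unit,
    pub extern_crate_name: String,
    pub public: bool,
}

pub type UnitGraph = HashMap<Unit, Vec<UnitDep>>;

/// Adds an edge from every non-std unit to each std root, joining the two
/// separately resolved graphs into one.
pub fn attach_std_roots(graph: &mut UnitGraph, std_roots: &[Unit]) {
    for (unit, deps) in graph.iter_mut() {
        if unit.is_std {
            continue; // std units keep the edges from their own resolver
        }
        for root in std_roots {
            deps.push(UnitDep {
                unit: root.clone(),
                extern_crate_name: root.pkg.clone(),
                public: true, // placeholder value for the sketch
            });
        }
    }
}
```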

I currently went with a strategy of linking the standard library dependencies using `--extern` (instead of `--sysroot` or `-L`). This has some benefits but some significant drawbacks. See below for some questions.

## For future PRs
- Add Cargo.toml support. See rust-lang/wg-cargo-std-aware#5
- Source is not downloaded. It assumes you have run `rustup component add rust-src`. See rust-lang/wg-cargo-std-aware#11
- `cargo metadata` does not include any information about std. I don't know how this should work.
- `cargo clean` is not std-aware.
- `cargo fetch` does not fetch std dependencies.
- `cargo vendor` does not vendor std dependencies.
- `cargo pkgid` is not std-aware.
- `--target` is required on the command-line. This should default to host-as-target.
- `-p` is not std-aware.
- A synthetic `Cargo.toml` workspace is created which has to know about things like `rustc-std-workspace-core`. Perhaps rust-lang/rust should publish the source with this `Cargo.toml` already created?
- `compiler_builtins` uses default features (pure Rust implementation, etc.). See rust-lang/wg-cargo-std-aware#15
    - `compiler_builtins` may need to be built without debug assertions, see [this](https://github.com/rust-lang/rust/blob/8e917f48382c6afaf50568263b89d35fba5d98e4/src/bootstrap/bin/rustc.rs#L210-L214). Could maybe use profile overrides.
- Panic issues:
    - `panic_abort` is not yet supported, though it should probably be easy. It could maybe look at the profile to determine which panic implementation to use? This requires more hard-coding in Cargo to know about rustc implementation details.
    - [This](https://github.com/rust-lang/rust/blob/8e917f48382c6afaf50568263b89d35fba5d98e4/src/bootstrap/bin/rustc.rs#L186-L201) should probably be handled where `panic` is set for `panic_abort` and `compiler_builtins`. I would like to get a test case for it. This can maybe be done with profile overrides?
- Using two resolvers is quite messy and causes a lot of complications. It would be ideal if it could only use one, though that may not be possible for the foreseeable future. See rust-lang/wg-cargo-std-aware#12
- Features are hard-coded. See rust-lang/wg-cargo-std-aware#13
- Lots of various platform-specific support is not included (musl, wasi, windows-gnu, etc.).
- Default `backtrace` is used with C compiler. See rust-lang/wg-cargo-std-aware#16
- Sanitizers are not built. See rust-lang/wg-cargo-std-aware#17
- proc_macro has some hacky code to synthesize its dependencies. See rust-lang/wg-cargo-std-aware#18. This may not be necessary if this uses `--sysroot` instead.
- Profile overrides cause weird linker errors.
  That is:
  ```toml
  [profile.dev.overrides.std]
  opt-level = 2
  ```
  Using `[profile.dev.overrides."*"]` works. I tried fiddling with it, but couldn't figure it out.
  We may also want to consider altering the syntax for profile overrides. Having to repeat the same profile for `std` and `core` and `alloc` and everything else would not be ideal.
- ~~`Context::unit_deps` does not handle build overrides, see #7215.~~ FIXED

## Questions for this PR
- I went with the strategy of using `--extern` to link the standard lib. This seems to work, and I haven't found any problems, but it seems risky. It also forces Cargo to know about certain implicit dependencies like `compiler_builtins` and `panic_*`. The alternative is to create a sysroot and copy all the crates to that directory and pass `--sysroot`. However, this is complicated by pipelining, which would require special support to copy `.rmeta` files when they are generated. Let me know if you think I should use a different strategy. I'm on the fence here, and I think using `--sysroot` may be safer, but adds more complexity.
    - As an aside, if rustc ever tries to grab a crate from sysroot that was not passed in via `--extern`, then it results in duplicate lang items. For example, saying `extern crate proc_macro;` without specifying `proc_macro` as a dependency. We could prevent rustc from ever trying by passing `--sysroot=/nonexistent` to prevent it from trying. Or add an equivalent flag to rustc.
- How should this be tested? I added a janky integration test, but it has some drawbacks. It requires internet access. It is slow. Since it is slow, it reuses the same target directory for multiple tests which makes it awkward to work with.
    - What interesting things are there to test?
    - We may want to disable the test before merging if it seems too annoying to make it the default. It requires rust-src to be downloaded, takes several minutes to run, and is somewhat platform-dependent.
- How to test that it is actually linking the correct standard library? I did tests locally with a modified libcore, but I can't think of a good way to do that in the test suite.
- I did not add `__CARGO_DEFAULT_LIB_METADATA` to the hash. I had a hard time coming up with a test case where it would matter.
    - My only thought is that it is a problem because libstd includes a dylib, which prevents the hash from being added to the filename. It does cause recompiles when switching between compilers, for example, when it normally wouldn't.
    - Very dumb question: Why exactly does libstd include a dylib? This can cause issues (see rust-lang/rust#56443).
    - This should probably change, but I want to better understand it first.
- The `bin_nostd` test needs to link libc on linux, and I'm not sure I understand why. I'm concerned there is something wrong there. libstd does not do that AFAIK.
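To illustrate the `--extern` strategy from the first question, here is a minimal sketch that builds the rustc arguments for a set of std crates, with an optional `--sysroot=/nonexistent` as floated in the aside. `std_link_args` is a hypothetical helper, and any paths passed to it are placeholders; this is not how Cargo assembles its command lines.

```rust
/// Builds rustc arguments that link each std crate explicitly.
/// `crates` pairs an extern crate name with the path to its artifact.
/// When `hide_sysroot` is set, a nonexistent sysroot is passed so rustc
/// cannot silently pick up a second copy of a crate from the real one.
pub fn std_link_args(crates: &[(&str, &str)], hide_sysroot: bool) -> Vec<String> {
    let mut args = Vec::new();
    for (name, path) in crates {
        args.push("--extern".to_string());
        args.push(format!("{}={}", name, path));
    }
    if hide_sysroot {
        args.push("--sysroot=/nonexistent".to_string());
    }
    args
}
```

The list has to name implicit dependencies such as `compiler_builtins` and the `panic_*` crate as well, which is the hard-coding concern raised above.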
@ehuss ehuss added enhancement New feature or request and removed implementation Implementation exploration and tracking issues labels May 3, 2023
@ehuss
Contributor Author

ehuss commented May 3, 2023

I'm going to close since I don't expect any changes regarding this for the foreseeable future.

build-std currently uses the Cargo.lock from the rust repo to ensure the dependencies are locked, and uses a separate resolver to ensure that the dependencies are kept independent and private from the user's dependencies.

It might be nice if that wasn't needed, but I don't think there is going to be a different solution, such as something that allows duplicate semver-compatible dependencies. That isn't on our radar, and would be a whole initiative on its own.

Alternatively, I am skeptical that allowing semver-compatible updates of dependencies is feasible since those dependencies would need to go under a much greater level of scrutiny than they currently do (else a breaking change could break all cargo users instantly). Some dependencies like compiler_builtins could do a semver-major version bump for every release to avoid that, but that would only be a partial solution.

If other issues require a change in how that works, then any changes will need to be motivated and branched from there.

@ehuss ehuss closed this as completed May 3, 2023