-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vendor all rustbuild dependencies in this repo #37524
Conversation
r? @brson |
r? @aturon (rust_highfive has picked a reviewer for you, use r? to override) |
Note that the first commit is purely just adding |
Isn't it only the source tarballs which need to be self contained and vendor everything? If so, why does the git repo need to vendor everything? Can't rustbuild just acquire the external dependencies when creating the source tarball in the first place? |
Yes we only strictly need this source tarballs, but it's far easier to just check it into the repo (like LLVM) and not worry about that automation. |
LLVM uses git submodules. As far as I can tell, this just copies all the files. I don't know if it makes a difference but it does seem odd at a glance, could you elaborate? |
@rkruppe This is not using submodules because it is using cargo's vendoring support. So this is dogfooding the tools we are having our users use for similar purposes. |
@alexcrichton what is the process for updating dependencies after this patch? That's going to be something we do a lot so it needs to be easy. Where is it documented (the diff is truncated so maybe it's in here somewhere)? |
@brson hm yeah that's a good point about updating dependencies (or just restructuring the DAG) Right now we're always passing Alternatively we could change all the bots to always pass |
I don't know a definite way to detect when running on the bots, but we could add a flag to configure to indicate such. I don't know it's worth it for this though. A good workflow to me might be for bootstrap to somehow detect when the frozen build fails because it would change the lockfile and say "it looks like you are changing a dependency. run x.py generate-lockfiles" or something. |
This vendor directory seems to provide a convenient way to implement a license-check on third-party crates... |
In my opinion one of the great benefits of rustbuild was that Rust itself would be able to easily use external crates and not have to shove everything in-tree. This PR will undo a significant amount of that progress. Updating these dependencies will be a painfully manual process, and if it could be automated then there would be no reason for this PR as rustbuild would be able to automatically do it when creating a source tarball. |
@brson unfortunately I don't know of a great way to detect that Cargo failed because it would change the lock file, scraping the output tends to be harder than it's worth I figured we could change to a system like this and then if it becomes a problem we could look for a solution (likely adding a "I'm a bot" flag). @retep998 this PR is just a baby step towards making rustbuild more official. I wanted to reduce the chances this is blocked by a stray "zomg 3MB commit" comment by vendoring as little as possible. This is not a declaration that we'll never use dependencies from crates.io, as that's exactly the point of rustbuild. We'll get there in time, we're just not there yet. |
This is still making it easy to use external crates. It is still using cargo to do the build. This is the exact tooling we expect our users to use when they need to store code in-tree. The only difference between this arrangement and downloading from crates.io is that there is one extra step to do the vendoring when updating crates. At the moment we believe we must provide such trees for source distribution, and offhand I think it would be unnecessarily risky for our own builds to be that different from the builds we are giving to packagers, though we do have distcheck that could help ensure the source tarball works. Although I wonder how true it is that packagers expect the source to be vendored - they are already packaging cargo, and cargo does not vendor it's dependencies does it? |
Another approach we could take here is to have a tool that produces a vendored source tree on demand, but we don't use it by default. So in the cases where distro packagers do need it, they can do it themselves. Maybe we should consult some distro folks and see what their opinions are. |
(On the other hand, I kinda want to just do something so we can get rid of the makefiles, perfect it later; and having vendored source is the conservative option that distros won't complain about). |
Yeah I was kinda hoping we wouldn't block this on too many concerns. We're already "vendoring" hoedown, miniz, jemalloc, and llvm. While LLVM is normally a special case, all these other pieces I really hope we don't try to split out into separate packages as it's not worth it right now. |
You can download these files to local disk before compile/build starting. Don't pushing them into git repo, before you must fork & modify their source code. |
I do complain a little... but to the extent this stuff is used just at build time and not installed code, I care less. If there's a way to eventually override the vendoring to point to a distro directory registry instead, that would be even better. (We don't have any packaged crates in Fedora yet, but that's the plan.)
And in Fedora I've disabled jemalloc and use system llvm. The other two I just haven't got around to bothering with, since support for those as external libraries isn't already there. |
It does not, so in Fedora's cargo.src.rpm I have my own cargo-vendor'ed tarball as an extra source. But again, I eventually want that to point to a distro directory registry. Actually, last I checked there was no cargo source distribution at all, so even for that I'm just grabbing from the git tag. |
Here are my outstanding concerns:
|
@liigo I don't follow what you mean hear by "before you must fork and modify their source code". When would we be forking and modifying the vendored code? |
124336a
to
c911552
Compare
@brson ok I've implemented:
That way devs adding deps should work normally, it's just when you land an updated dep you'll have to run |
@est31 using submodules precludes standard usage of Cargo overrides, so no, unfortunately it can't be used. Yes tarballs should have dependencies in them, and this is the easiest way to do so. |
As formulated now nothing in this PR will immediately affect developers, nor distros, though distros will likely want to turn on I'm not convinced by this argument, primarily because it is a small amount of source code. Git is very good at managing source code and we've already churned through orders of magnitude more than this in Rust's history. It will accumulate over time though and we can never take it back. So I'm inclined to land this now to proceed with removal of the make build system. It doesn't commit us to anything beyond maintaining 20k lines of code in the git history, and can be re-considered later if needed. cc @rust-lang/compiler @rust-lang/libs this is a somewhat significant change to the Rust source tree that you might want to be aware of. The TL;DR: In order to make building Rust with cargo not hit the network (primarily for distros and our CI), this vendors all out-of-tree source in the src/vendor directory. Right now that only applies to rustbuild itself, but this strategy will extend to any out-of-tree components that become part of Rust in the future. The developer workflow change is: during CI, builds will be run against the vendored tree. If the build fails, the PR author will need to run The downside of this is that we will be storing the source code to deps in-tree, increasing the repo's size. |
One argument for this is that people might want to automate distro builds from git directly instead of using tarballs. It's probably fine in the short term. |
+1, but nit, can these be in |
At the very least, can there some policy be established to never ever modify the code in the vendor subdir? I've seen many C++ projects going down that path, and it has caused very bad issues every time, because updating now becomes something dangerous and scary. There should also be some automatic verification by travis or bors to make sure a PR that updates some dependency doesn't mess with its source. |
@est31 If possible, these should be hashed (in |
Fear not, @est31 : [steve@becoming lol (master)]$ cat Cargo.toml
[package]
name = "lol"
version = "0.1.0"
authors = ["Steve Klabnik <steve@steveklabnik.com>"]
[dependencies]
ref_slice = "1.1.0"
[steve@becoming lol (master)]$ cargo build
Compiling ref_slice v1.1.1
Compiling lol v0.1.0 (file:///home/steve/tmp/lol)
Finished debug [unoptimized + debuginfo] target(s) in 0.12 secs
[steve@becoming lol (master)]$ echo "// testing" >> vendor/ref_slice/src/lib.rs
[steve@becoming lol (master)]$ cargo clean
[steve@becoming lol (master)]$ cargo build
error: the listed checksum of `/home/steve/tmp/lol/vendor/ref_slice/src/lib.rs` has changed:
expected: 54152bc0580b0f05b37bc8a97263a36bcf43af25ec8d13c90d58dce67f919ed4
actual: e04c52abc3c719cdba6ed6a38ad10dba75ee276617fc8a43a325ccc1c5c03f40
directory sources are not intended to be edited, if modifications are required then it is recommended that [replace] is used with a forked copy of the source |
Yeah this was my initial inclination but jemalloc/llvm/hoedown/etc all stopped my as the only "convention" we have now is code goes under |
Thanks for clarifying @eddyb @steveklabnik ! |
@bors r+ Let's do it as-is. Maybe once we've got the other makefiles gone we can move all third-party code to a directory outside src/. |
📌 Commit 31a8638 has been approved by |
Vendor all rustbuild dependencies in this repo This commit vendors all crates.io dependencies into the rust-lang/rust repository using the `cargo-vendor` tool. This is done in an effort to make rustbuild distro-ready by ensuring that our source tarballs are self-contained units which don't need extraneous network downloads. A new `src/vendor` directory is created with all vendored crates, and Cargo, when using rustbuild, is configured to use this directory. Over time we can deduplicate this directory with the actual src tree (e.g. src/librustc_serialize, src/liblibc, src/libgetopts, ...). For now though that's left to a separate commit.
…hton Automate vendoring by invoking cargo-vendor when building src dist tarballs. This avoids rust-lang#39633 bringing the `src/vendor` checked into git by rust-lang#37524, past 200,000 lines of code. I believe the strategy of having rustbuild run `cargo vendor` during the `dist src` step is sound. However, the only way to be sure `cargo-vendor` exists is to run `cargo install --force cargo-vendor`, which will recompile it every time (not passing `--force` means you can't tell between "already exists" and "build error"). ~~This is quite suboptimal and I'd like to somehow do it in each `Dockerfile` that would need it.~~ * [ ] Cache `CARGO_HOME` (i.e. `~/.cargo`) between CI runs * `bin/cargo-vendor` and the actual caches are the relevant bits * [x] Do not build `cargo-vendor` all the time * ~~Maybe detect `~/.cargo/bin/cargo-vendor` already exists?~~ * ~~Could also try to build it in a `Dockerfile` but do we have `cargo`/`rustc` there?~~ * Final solution: check `cargo install --list` for a line starting with `cargo-vendor ` cc @rust-lang/tools
Automate vendoring by invoking cargo-vendor when building src dist tarballs. This avoids #39633 bringing the `src/vendor` checked into git by #37524, past 200,000 lines of code. I believe the strategy of having rustbuild run `cargo vendor` during the `dist src` step is sound. However, the only way to be sure `cargo-vendor` exists is to run `cargo install --force cargo-vendor`, which will recompile it every time (not passing `--force` means you can't tell between "already exists" and "build error"). ~~This is quite suboptimal and I'd like to somehow do it in each `Dockerfile` that would need it.~~ * [ ] Cache `CARGO_HOME` (i.e. `~/.cargo`) between CI runs * `bin/cargo-vendor` and the actual caches are the relevant bits * [x] Do not build `cargo-vendor` all the time * ~~Maybe detect `~/.cargo/bin/cargo-vendor` already exists?~~ * ~~Could also try to build it in a `Dockerfile` but do we have `cargo`/`rustc` there?~~ * Final solution: check `cargo install --list` for a line starting with `cargo-vendor ` cc @rust-lang/tools
Automate vendoring by invoking cargo-vendor when building src dist tarballs. This avoids #39633 bringing the `src/vendor` checked into git by #37524, past 200,000 lines of code. I believe the strategy of having rustbuild run `cargo vendor` during the `dist src` step is sound. However, the only way to be sure `cargo-vendor` exists is to run `cargo install --force cargo-vendor`, which will recompile it every time (not passing `--force` means you can't tell between "already exists" and "build error"). ~~This is quite suboptimal and I'd like to somehow do it in each `Dockerfile` that would need it.~~ * [ ] Cache `CARGO_HOME` (i.e. `~/.cargo`) between CI runs * `bin/cargo-vendor` and the actual caches are the relevant bits * [x] Do not build `cargo-vendor` all the time * ~~Maybe detect `~/.cargo/bin/cargo-vendor` already exists?~~ * ~~Could also try to build it in a `Dockerfile` but do we have `cargo`/`rustc` there?~~ * Final solution: check `cargo install --list` for a line starting with `cargo-vendor ` cc @rust-lang/tools
This commit vendors all crates.io dependencies into the rust-lang/rust repository using the
cargo-vendor
tool. This is done in an effort to make rustbuild distro-ready by ensuring that our source tarballs are self-contained units which don't need extraneous network downloads.A new
src/vendor
directory is created with all vendored crates, and Cargo, when using rustbuild, is configured to use this directory. Over time we can deduplicate this directory with the actual src tree (e.g. src/librustc_serialize, src/liblibc, src/libgetopts, ...). For now though that's left to a separate commit.