From 5205672210381dfe10072b72f8d147ddb3c83cd4 Mon Sep 17 00:00:00 2001 From: Charles Samborski Date: Fri, 16 Jun 2023 03:45:43 +0200 Subject: [PATCH] docs(faq): Recommend always committing `Cargo.lock` MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit This commit updates the FAQ to recommend to always commit `Cargo.lock` in all projects, both binaries and libraries. The FAQ previously discouraged `Cargo.lock` in libraries. The question of changing the recommendation to always commit the lock file was raised in https://github.com/rust-lang/cargo/issues/8728, as well as in a few Rust Internals threads in the previous months ([Feedback of cargo-upgrade](https://internals.rust-lang.org/t/feedback-on-cargo-upgrade-to-prepare-it-for-merging/17101/124?u=demurgos), [Cargo yank is a misfeature](https://internals.rust-lang.org/t/suggestion-cargo-yank-is-a-misfeature-and-should-be-deprecated-and-eventually-removed/18486/15)) and was a relatively popular proposition. `Cargo.lock` enables reproducible builds, I argue that this is a desirable property for _all_ projects. Reproducible builds are a safe and powerful default both for binaries and libraries. Wanting a non-reproducible build is less common, and still easy to achieve even if a lock file is committed. Applications have binaries compiled through `cargo build` and executed by `cargo run`. Libraries also have binaries, they're just executed through `cargo test` instead. In both cases, it is desirable to be able to work with the version control system when debugging some regression. A `Cargo.lock` enables workflows such as `git bisect` or reproducing a failed test across all environments (e.g. CI and dev machines). 1. A lockfile enables reproducible builds across commits. This means that you can browse the git history and reproduce your tests as they were at this time. This unlocks `git bisect` workflow and is extremely useful when working on transitive dependency issues. 2. A lockfile ensures that you can reproduce an older build even if it contains yanked dependencies or weakly constrained versions (semver requirement `*`, git repository without a revision, ...). 3. Even if a lock file is present, it is easy to refresh it / force dependency resolution again. On the other hand, if the lockfile is missing then it is very hard to retrieve the resolution at this time (or even impossible). Having a lock file is a safer default (no information loss). 4. When multiple developers contribute to the same project (even a library), a lock file ensures that all contributors test with the same dependencies. Without a lockfile, developers may see different behavior when testing. This also enables reproducing CI errors locally for example. 5. The usual objection to committing a lockfile is that it may take longer to detect regressions in transitive dependencies. In practice, this does not change much: refreshing the dependencies is always an explicit operation in Cargo anyway. The only time when it's done implicitly is when cloning a project for the first if it does not have a lock file yet. Checking transitive dependencies is better handled explicitly in CI or through tools such as Dependabot. This change to always commit lock files also aligns with prior art from other ecosystems. - [Yarn for Node.js](https://yarnpkg.com/getting-started/qa#which-files-should-be-gitignored) > Which files should be gitignored? > > [...] > > - yarn.lock should always be stored within your repository ([even if you develop a library](https://yarnpkg.com/getting-started/qa#should-lockfiles-be-committed-to-the-repository)). - [Poetry for Python](https://python-poetry.org/docs/basic-usage/#committing-your-poetrylock-file-to-version-control) > As a library developer > > A simple way to avoid such a scenario is to omit the `poetry.lock` file. However, by doing so, you sacrifice reproducibility and performance to a certain extent. Without a lockfile, it can be difficult to find the reason for failing tests, because in addition to obvious code changes an unnoticed library update might be the culprit. [...] > > If you do not want to give up the reproducibility and performance benefits, consider a regular refresh of `poetry.lock` to stay up-to-date and reduce the risk of sudden breakage for users. - [Bundler for Ruby](https://bundler.io/guides/faq.html#using-gemfiles-inside-gems) > Q: Should I commit my `Gemfile.lock` when writing a gem? > > A: Yes, you should commit it. The presence of a `Gemfile.lock` in a gem’s repository ensures that a fresh checkout of the repository uses the exact same set of dependencies every time. We believe this makes repositories more friendly towards new and existing contributors. Ideally, anyone should be able to clone the repo, run `bundle install`, and have passing tests. If you don’t check in your `Gemfile.lock`, new contributors can get different versions of your dependencies, and run into failing tests that they don’t know how to fix. Fixes https://github.com/rust-lang/cargo/issues/8728 --- src/doc/src/faq.md | 38 ++++++++++++++++++-------------------- 1 file changed, 18 insertions(+), 20 deletions(-) diff --git a/src/doc/src/faq.md b/src/doc/src/faq.md index e4a753413f64..7c9bac6e7ddb 100644 --- a/src/doc/src/faq.md +++ b/src/doc/src/faq.md @@ -102,7 +102,7 @@ issue][cargo-issues]. [cargo-issues]: https://github.com/rust-lang/cargo/issues -### Why do binaries have `Cargo.lock` in version control, but not libraries? +### Why should I always have `Cargo.lock` in version control? The purpose of a `Cargo.lock` lockfile is to describe the state of the world at the time of a successful build. Cargo uses the lockfile to provide @@ -110,25 +110,23 @@ deterministic builds on different times and different systems, by ensuring that the exact same dependencies and versions are used as when the `Cargo.lock` file was originally generated. -This property is most desirable from applications and packages which are at the -very end of the dependency chain (binaries). As a result, it is recommended that -all binaries check in their `Cargo.lock`. - -For libraries the situation is somewhat different. A library is not only used by -the library developers, but also any downstream consumers of the library. Users -dependent on the library will not inspect the library’s `Cargo.lock` (even if it -exists). This is precisely because a library should **not** be deterministically -recompiled for all users of the library. - -If a library ends up being used transitively by several dependencies, it’s -likely that just a single copy of the library is desired (based on semver -compatibility). If Cargo used all of the dependencies' `Cargo.lock` files, -then multiple copies of the library could be used, and perhaps even a version -conflict. - -In other words, libraries specify SemVer requirements for their dependencies but -cannot see the full picture. Only end products like binaries have a full -picture to decide what versions of dependencies should be used. +This property is desirable for all projects, whether they are intended to be used +as a library or from applications and packages which are at the very end of the +dependency chain (binaries). As a result, it is recommended that all projects +check in their `Cargo.lock`. + +Note that a `Cargo.lock` in a library is only useful when developing it, it is +ignored when using the library as a dependency. It is however very useful when +developing. It ensures reproducible tests when going through the version history. +It also allows advanced users to inspect for library's `Cargo.lock` when facing +compatibility issues with dependencies. In particular, it allows reproducibility +across different environments such as a Continuous Integration (CI) server or the +workstations of all contributors. + +It is recommended for all projects, and libraries in particular, to regularly +refresh their lockfile to detect compatibility issues with dependencies as soon +as possible. You can achieve it manually with `cargo upgrade` or with tools such +as _Dependabot_. ### Can libraries use `*` as a version for their dependencies?