Skip to content

Commit

Permalink
docs(faq): Recommend always committing Cargo.lock
Browse files Browse the repository at this point in the history
This commit updates the FAQ to recommend to always commit `Cargo.lock` in all projects, both binaries and libraries. The FAQ previously discouraged `Cargo.lock` in libraries.

The question of changing the recommendation to always commit the lock file was raised in #8728, as well as in a few Rust Internals threads in the previous months ([Feedback of cargo-upgrade](https://internals.rust-lang.org/t/feedback-on-cargo-upgrade-to-prepare-it-for-merging/17101/124?u=demurgos), [Cargo yank is a misfeature](https://internals.rust-lang.org/t/suggestion-cargo-yank-is-a-misfeature-and-should-be-deprecated-and-eventually-removed/18486/15)) and was a relatively popular proposition.

`Cargo.lock` enables reproducible builds, I argue that this is a desirable property for _all_ projects. Reproducible builds are a safe and powerful default both for binaries and libraries. Wanting a non-reproducible build is less common, and still easy to achieve even if a lock file is committed.

Applications have binaries compiled through `cargo build` and executed by `cargo run`. Libraries also have binaries, they're just executed through `cargo test` instead. In both cases, it is desirable to be able to work with the version control system when debugging some regression. A `Cargo.lock` enables workflows such as `git bisect` or reproducing a failed test across all environments (e.g. CI and dev machines).

1. A lockfile enables reproducible builds across commits. This means that you can browse the git history and reproduce your tests as they were at this time. This unlocks `git bisect` workflow and is extremely useful when working on transitive dependency issues.

2. A lockfile ensures that you can reproduce an older build even if it contains  yanked dependencies or weakly constrained versions (semver requirement `*`, git repository without a revision, ...).

3. Even if a lock file is present, it is easy to refresh it / force dependency resolution again. On the other hand, if the lockfile is missing then it is very hard to retrieve the resolution at this time (or even impossible). Having a lock file is a safer default (no information loss).

4. When multiple developers contribute to the same project (even a library), a lock file ensures that all contributors test with the same dependencies. Without a lockfile, developers may see different behavior when testing. This also enables reproducing CI errors locally for example.

5. The usual objection to committing a lockfile is that it may take longer to detect regressions in transitive dependencies. In practice, this does not change much: refreshing the dependencies is always an explicit operation in Cargo anyway. The only time when it's done implicitly is when cloning a project for the first if it does not have a lock file yet. Checking transitive dependencies is better handled explicitly in CI or through tools such as Dependabot. Catching problems with new dependencies should not happen when someone is trying to test an unrelated change. There is also a magnitude more builds in consumer projects than CI builds for a lib, so you should be ready to get messaged by your users anyway if you break something :)

This change to always commit lock files also aligns with prior art from other ecosystems.
- [Yarn for Node.js](https://yarnpkg.com/getting-started/qa#which-files-should-be-gitignored)
  > Which files should be gitignored?
  >
  > [...]
  >
  > - yarn.lock should always be stored within your repository ([even if you develop a library](https://yarnpkg.com/getting-started/qa#should-lockfiles-be-committed-to-the-repository)).
- [Poetry for Python](https://python-poetry.org/docs/basic-usage/#committing-your-poetrylock-file-to-version-control)
  > As a library developer
  >
  > A simple way to avoid such a scenario [testing with latest transitive dependencies] is to omit the `poetry.lock` file. However, by doing so, you sacrifice reproducibility and performance to a certain extent. Without a lockfile, it can be difficult to find the reason for failing tests, because in addition to obvious code changes an unnoticed library update might be the culprit. [...]
  >
  > If you do not want to give up the reproducibility and performance benefits, consider a regular refresh of `poetry.lock` to stay up-to-date and reduce the risk of sudden breakage for users.
- [Bundler for Ruby](https://bundler.io/guides/faq.html#using-gemfiles-inside-gems)
  > Q: Should I commit my `Gemfile.lock` when writing a gem?
  >
  > A: Yes, you should commit it. The presence of a `Gemfile.lock` in a gem’s repository ensures that a fresh checkout of the repository uses the exact same set of dependencies every time. We believe this makes repositories more friendly towards new and existing contributors. Ideally, anyone should be able to clone the repo, run `bundle install`, and have passing tests. If you don’t check in your `Gemfile.lock`, new contributors can get different versions of your dependencies, and run into failing tests that they don’t know how to fix.

Fixes #8728
  • Loading branch information
demurgos committed Jun 16, 2023
1 parent 0c14026 commit ccc6ac6
Show file tree
Hide file tree
Showing 2 changed files with 24 additions and 27 deletions.
40 changes: 20 additions & 20 deletions src/doc/src/faq.md
Original file line number Diff line number Diff line change
Expand Up @@ -102,33 +102,32 @@ issue][cargo-issues].

[cargo-issues]: https://github.com/rust-lang/cargo/issues

### Why do binaries have `Cargo.lock` in version control, but not libraries?
### Why should I always have `Cargo.lock` in version control?

The purpose of a `Cargo.lock` lockfile is to describe the state of the world at
the time of a successful build. Cargo uses the lockfile to provide
deterministic builds on different times and different systems, by ensuring that
the exact same dependencies and versions are used as when the `Cargo.lock` file
was originally generated.

This property is most desirable from applications and packages which are at the
very end of the dependency chain (binaries). As a result, it is recommended that
all binaries check in their `Cargo.lock`.

For libraries the situation is somewhat different. A library is not only used by
the library developers, but also any downstream consumers of the library. Users
dependent on the library will not inspect the library’s `Cargo.lock` (even if it
exists). This is precisely because a library should **not** be deterministically
recompiled for all users of the library.

If a library ends up being used transitively by several dependencies, it’s
likely that just a single copy of the library is desired (based on semver
compatibility). If Cargo used all of the dependencies' `Cargo.lock` files,
then multiple copies of the library could be used, and perhaps even a version
conflict.

In other words, libraries specify SemVer requirements for their dependencies but
cannot see the full picture. Only end products like binaries have a full
picture to decide what versions of dependencies should be used.
This property is desirable for all projects, whether they are intended to be used
as a library or from applications and packages which are at the very end of the
dependency chain (binaries). As a result, it is recommended that all projects
check in their `Cargo.lock`. This allows all users to clone a project and start
working on it risking unexpected dependency issues.

Note that a `Cargo.lock` in a library is only useful when developing it, it is
ignored when using the library as a dependency. It is however very useful when
developing. It ensures reproducible tests when going through the version history.
It also allows advanced users to inspect for library's `Cargo.lock` when facing
compatibility issues with dependencies. In particular, it allows reproducibility
across different environments such as a Continuous Integration (CI) server or the
workstations of all contributors.

It is recommended for all projects, and libraries in particular, to regularly
refresh their lockfile to detect compatibility issues with dependencies as soon
as possible. You can achieve it manually with [`cargo update`][cargo-update] or
with tools such as _Dependabot_.

### Can libraries use `*` as a version for their dependencies?

Expand Down Expand Up @@ -301,6 +300,7 @@ causes and provide diagnostic techniques to help you out there:
and `Cargo.toml` using a [custom merge tool].


[cargo-update]: ./commands/cargo-update.md
[links]: https://doc.rust-lang.org/cargo/reference/resolver.html#links
[conventions in place]: https://doc.rust-lang.org/cargo/reference/build-scripts.html#-sys-packages
[`direct-minimal-versions`]: https://doc.rust-lang.org/nightly/cargo/reference/unstable.html#direct-minimal-versions
Expand Down
11 changes: 4 additions & 7 deletions src/doc/src/guide/cargo-toml-vs-cargo-lock.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,14 +8,11 @@ about them, here’s a summary:
* `Cargo.lock` contains exact information about your dependencies. It is
maintained by Cargo and should not be manually edited.

If you’re building a non-end product, such as a rust library that other rust
[packages][def-package] will depend on, put `Cargo.lock` in your
`.gitignore`. If you’re building an end product, which are executable like
command-line tool or an application, or a system library with crate-type of
`staticlib` or `cdylib`, check `Cargo.lock` into `git`. If you're curious
Always check `Cargo.lock` into `git`. This advice applies to all projects:
libraries, command line tools, applications, etc. If you're curious
about why that is, see
["Why do binaries have `Cargo.lock` in version control, but not libraries?" in the
FAQ](../faq.md#why-do-binaries-have-cargolock-in-version-control-but-not-libraries).
["Why should I always have `Cargo.lock` in version control?" in the
FAQ](../faq.md#why-should-i-always-have-cargolock-in-version-control).

Let’s dig in a little bit more.

Expand Down

0 comments on commit ccc6ac6

Please sign in to comment.