Skip to content

Commit

Permalink
fix: switch ld with lld on linux, fixes OOM
Browse files Browse the repository at this point in the history
We've recently taken to limit parallelism during builds due to Out Of
Memory errors we got while running `cargo test` on 16-core machine,
or 8-core machines running with `-j16`.

We hit OOM due to two reasons:
1. Commit fae5aed introduced `llvm`
   dependencies to the blockifier, which notably increased compilation
   and linking time.
2. Before fae5aed, we already had a quite
memory-heavy linking stage, specifically ~7-8 `ld` instances running in
parallel, each with ~2-3GB of memory, which was just enough not to hit
OOM on most 16core machines.
This was primarily due to a combination of factors:
   2.1 the blockifier being a large lib that needs a 2-3GB `ld` process
   to link in isolation.
   2.2 many crates have the blockifier as a dependency (transitively).
   2.3 `ld` naively parallelizes linking tasks.
The combination of the above three issues caused a heavily parallelized
environment to create many heavy linking jobs that shared no
memory (blockifier linking was duplicated).

This PR switches to the rust linker from `ld` to `lld`, thus solving the
OOM issue without artificially bounding the amount of
parallelization.

`lld` handles parallelizing better and reduces memory stress
from ~10 2-3GB `ld` linker instances post fae5aed,
to ~2 1-2GB `lld` linker instances for the same tasks, which has a
minimal memory impact.

Moreover, `lld` is already the default linker on `rust` nightly, and
will eventually become the default on stable (see code comment for link
to tracking issue).
  • Loading branch information
Gilad Chase committed Oct 20, 2024
1 parent 16caad8 commit 6dc3bd4
Show file tree
Hide file tree
Showing 3 changed files with 10 additions and 8 deletions.
14 changes: 7 additions & 7 deletions .cargo/config.toml
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,10 @@ LLVM_SYS_181_PREFIX = "/usr/lib/llvm-18/"
MLIR_SYS_180_PREFIX = "/usr/lib/llvm-18/"
TABLEGEN_180_PREFIX = "/usr/lib/llvm-18/"

# Limit concurrency to prevent crash during "cargo build" due to excessive memory usage. Should be removed in the future. Slows down performance.
[build]
jobs = 8

# Limit concurrency to prevent crash during "cargo test" due to excessive memory usage. Should be removed in the future. Slows down performance.
[test]
jobs = 8
# Use `lld` for linking instead of `ld`, since we run out of memory while linking with `ld` on
# 16-cores linux machines, see:
# https://nnethercote.github.io/perf-book/build-configuration.html#linking.
# TODO: remove this once `rust` stabilizes `lld` as the default linker, currently only on nightly:
# https://github.com/rust-lang/rust/issues/39915#issuecomment-618726211
[target.x86_64-unknown-linux-gnu]
rustflags = ["-Clink-arg=-fuse-ld=lld"]
1 change: 1 addition & 0 deletions .github/workflows/committer_ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@ jobs:
- uses: actions/checkout@v4
with:
ref: ${{ github.base_ref }}
- uses: ./.github/actions/bootstrap

# Download the old benchmark inputs.
- id: auth
Expand Down
3 changes: 2 additions & 1 deletion scripts/dependencies.sh
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,8 @@ function setup_llvm_deps() {
libmlir-18-dev \
libpolly-18-dev \
libzstd-dev \
mlir-18-tools
mlir-18-tools \
lld
'
;;
*)
Expand Down

0 comments on commit 6dc3bd4

Please sign in to comment.