Skip to content

Commit

Permalink
fix: switch ld with lld on linux, fixes OOM (#1388)
Browse files Browse the repository at this point in the history
We've recently taken to limit parallelism during builds due to Out Of
Memory errors we got while running `cargo test` on 16-core machine,
or 8-core machines running with `-j16`.

We hit OOM due to two reasons:
1. Commit fae5aed introduced `llvm`
    dependencies to the blockifier, which notably increased compilation
    and linking time.
2. Before fae5aed, we already had a quite
memory-heavy linking stage, specifically ~7-8 `ld` instances running in
parallel, each with ~2-3GB of memory, which was just enough not to hit
OOM on most 16core machines.
This was primarily due to a combination of factors:
    2.1 the blockifier being a large lib that needs a 2-3GB `ld` process
    to link in isolation.
    2.2 many crates have the blockifier as a dependency (transitively).
    2.3 `ld` naively parallelizes linking tasks.
The combination of the above three issues caused a heavily parallelized
environment to create many heavy linking jobs that shared no
memory (blockifier linking was duplicated).

This PR switches to the rust linker from `ld` to `lld`, thus solving the
OOM issue without artificially bounding the amount of
parallelization.

`lld` handles parallelizing better and reduces memory stress
from ~10 2-3GB `ld` linker instances post fae5aed,
to ~2 1-2GB `lld` linker instances for the same tasks, which has a
minimal memory impact.

Moreover, `lld` is already the default linker on `rust` nightly, and
will eventually become the default on stable (see code comment for link
to tracking issue).

Co-Authored-By: Gilad Chase <[email protected]>
  • Loading branch information
2 people authored and guy-starkware committed Oct 22, 2024
1 parent a467a6c commit 0f9a9d7
Show file tree
Hide file tree
Showing 3 changed files with 10 additions and 8 deletions.
14 changes: 7 additions & 7 deletions .cargo/config.toml
Original file line number Diff line number Diff line change
Expand Up @@ -3,10 +3,10 @@ LLVM_SYS_181_PREFIX = "/usr/lib/llvm-18/"
MLIR_SYS_180_PREFIX = "/usr/lib/llvm-18/"
TABLEGEN_180_PREFIX = "/usr/lib/llvm-18/"

# Limit concurrency to prevent crash during "cargo build" due to excessive memory usage. Should be removed in the future. Slows down performance.
[build]
jobs = 8

# Limit concurrency to prevent crash during "cargo test" due to excessive memory usage. Should be removed in the future. Slows down performance.
[test]
jobs = 8
# Use `lld` for linking instead of `ld`, since we run out of memory while linking with `ld` on
# 16-cores linux machines, see:
# https://nnethercote.github.io/perf-book/build-configuration.html#linking.
# TODO: remove this once `rust` stabilizes `lld` as the default linker, currently only on nightly:
# https://github.com/rust-lang/rust/issues/39915#issuecomment-618726211
[target.x86_64-unknown-linux-gnu]
rustflags = ["-Clink-arg=-fuse-ld=lld"]
1 change: 1 addition & 0 deletions .github/workflows/committer_ci.yml
Original file line number Diff line number Diff line change
Expand Up @@ -87,6 +87,7 @@ jobs:
gcloud storage cp -r gs://committer-testing-artifacts/$NEW_BENCH_INPUT_FILES_PREFIX/* ./crates/committer_cli/test_inputs
# Benchmark the new code, splitting the benchmarks, and prepare the results for posting a comment.
- uses: ./.github/actions/bootstrap
- run: bash ./crates/committer_cli/benches/bench_split_and_prepare_post.sh benchmarks_list.txt bench_new.txt

- run: echo BENCHES_RESULT=$(cat bench_new.txt) >> $GITHUB_ENV
Expand Down
3 changes: 2 additions & 1 deletion scripts/dependencies.sh
Original file line number Diff line number Diff line change
Expand Up @@ -36,7 +36,8 @@ function setup_llvm_deps() {
libmlir-18-dev \
libpolly-18-dev \
libzstd-dev \
mlir-18-tools
mlir-18-tools \
lld
'
;;
*)
Expand Down

0 comments on commit 0f9a9d7

Please sign in to comment.