
Increase in memory used to compile rollup-base-public #7001

Open · aakoshh opened this issue Jan 9, 2025 · 5 comments
Labels: bug (Something isn't working)

aakoshh (Contributor) commented Jan 9, 2025

Aim

This is a follow-up for #6972, which added a new mem2reg pass. As a consequence, we saw a 100% increase in the memory used during the compilation of one of the protocol circuits in aztec-packages.

Expected Behavior

Didn't expect a significant increase in memory usage.

Bug

#6972 (comment)

To Reproduce

See how CI does it.

Workaround

None

Workaround Description

No response

Additional Context

No response

Project Impact

None

Blocker Context

No response

Nargo Version

nargo version = 1.0.0-beta.1 noirc version = 1.0.0-beta.1+bb8dd5ce43f0d89e393bd49f8415008826903652 (git version hash: 13b5871, is dirty: false)

NoirJS Version

No response

Proving Backend Tooling & Version

No response

Would you like to submit a PR for this Issue?

None

Support Needs

No response

TomAFrench (Member) commented:

I think a contributing factor to this is how early we perform the inlining pass. Looking at the ordering of passes:

.run_pass(Ssa::remove_unreachable_functions, "Removing Unreachable Functions")
.run_pass(Ssa::defunctionalize, "Defunctionalization")
.run_pass(Ssa::remove_paired_rc, "Removing Paired rc_inc & rc_decs")
.run_pass(|ssa| ssa.inline_functions(options.inliner_aggressiveness), "Inlining (1st)")
// Run mem2reg with the CFG separated into blocks
.run_pass(Ssa::mem2reg, "Mem2Reg (1st)")
.run_pass(Ssa::simplify_cfg, "Simplifying (1st)")
.run_pass(Ssa::as_slice_optimization, "`as_slice` optimization")
.run_pass(Ssa::remove_unreachable_functions, "Removing Unreachable Functions")
.try_run_pass(
    Ssa::evaluate_static_assert_and_assert_constant,
    "`static_assert` and `assert_constant`",
)?
.run_pass(Ssa::loop_invariant_code_motion, "Loop Invariant Code Motion")
.try_run_pass(
    |ssa| ssa.unroll_loops_iteratively(options.max_bytecode_increase_percent),
    "Unrolling",
)?
.run_pass(Ssa::simplify_cfg, "Simplifying (2nd)")
.run_pass(Ssa::mem2reg, "Mem2Reg (2nd)")
.run_pass(Ssa::flatten_cfg, "Flattening")
.run_pass(Ssa::remove_bit_shifts, "Removing Bit Shifts")
// Run mem2reg once more with the flattened CFG to catch any remaining loads/stores
.run_pass(Ssa::mem2reg, "Mem2Reg (3rd)")
// Run the inlining pass again to handle functions with `InlineType::NoPredicates`.
// Before flattening is run, we treat functions marked with the `InlineType::NoPredicates` as an entry point.
// This pass must come immediately following `mem2reg` as the succeeding passes
// may create an SSA which inlining fails to handle.
.run_pass(
    |ssa| ssa.inline_functions_with_no_predicates(options.inliner_aggressiveness),
    "Inlining (2nd)",
)
.run_pass(Ssa::remove_if_else, "Remove IfElse")
.run_pass(Ssa::fold_constants, "Constant Folding")
.run_pass(Ssa::remove_enable_side_effects, "EnableSideEffectsIf removal")
.run_pass(Ssa::fold_constants_using_constraints, "Constraint Folding")
.run_pass(Ssa::dead_instruction_elimination, "Dead Instruction Elimination (1st)")
.run_pass(Ssa::simplify_cfg, "Simplifying:")
.run_pass(Ssa::array_set_optimization, "Array Set Optimizations")

You can see that we pretty much immediately inline all of the functions into the entry point function. This means that if we've got a function which is used in multiple places, we end up running all of the later passes over N copies of it, rather than fully simplifying the function on its own before we inline it.

Note that we can't just run all of the various passes on every function before inlining. At the least we'd need to tolerate loop unrolling failing, since loop bounds may come from function arguments, and some thought would also need to go into flattening (maybe?).
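
To make the N-copies cost concrete, here is a toy model (plain Rust, not the Noir compiler API; Function, inline_into and simplify are made-up stand-ins) comparing how much work a later simplification pass has to do when we inline before simplifying versus simplifying the callee once before inlining:

// Toy model: inlining duplicates the callee body into every call site, so a
// pass that runs after inlining visits the callee once per call site.

#[derive(Clone)]
struct Function {
    body: Vec<String>, // stand-in for SSA instructions
}

fn inline_into(entry: &mut Function, callee: &Function, call_sites: usize) {
    // Inlining copies the callee body once per call site.
    for _ in 0..call_sites {
        entry.body.extend(callee.body.iter().cloned());
    }
}

fn simplify(f: &mut Function) -> usize {
    // Pretend simplification halves the function and report how many
    // instructions it had to visit.
    let visited = f.body.len();
    let keep = f.body.len() / 2;
    f.body.truncate(keep);
    visited
}

fn main() {
    let call_sites = 10;
    let callee = Function { body: vec!["inst".to_string(); 1_000] };

    // Inline first, simplify later: the simplifier sees the callee body
    // once per call site (10,000 instructions here).
    let mut entry_a = Function { body: Vec::new() };
    inline_into(&mut entry_a, &callee, call_sites);
    println!("inline-then-simplify visited {} instructions", simplify(&mut entry_a));

    // Simplify the callee once, then inline the already-smaller body
    // (5,000 instructions visited in the entry point).
    let mut callee_b = callee.clone();
    simplify(&mut callee_b);
    let mut entry_b = Function { body: Vec::new() };
    inline_into(&mut entry_b, &callee_b, call_sites);
    println!("simplify-then-inline visited {} instructions", simplify(&mut entry_b));
}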

TomAFrench (Member) commented:

[Image: memory flamegraph showing that the blocks field of the PerFunctionContext within mem2reg is holding all the memory.]

aakoshh (Contributor, Author) commented Jan 13, 2025

This shows the heaviest stack trace in terms of memory allocations:
[Image: heaviest allocation stack trace]

Like in the flamegraph above, it is inside Block::unify, and it points at the im::OrdMap as the culprit, which is the data structure backing all of the fields in Block.
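
For illustration, here is a minimal, hypothetical sketch of that shape (the field names and the unify logic are stand-ins, not the actual mem2reg code): each block carries persistent im::OrdMap state, clones are cheap thanks to structural sharing, but keeping a Block per basic block keeps every version of those maps alive, and unifying predecessor states allocates yet more map nodes:

use im::OrdMap;

type ValueId = u32;

#[derive(Clone, Default)]
struct Block {
    // Stand-in for "known value behind each reference".
    references: OrdMap<ValueId, ValueId>,
    // Stand-in for alias tracking; the profiling below narrows the cost to this.
    aliases: OrdMap<ValueId, ValueId>,
}

impl Block {
    // Roughly the shape of a unify: merge the state of two predecessors,
    // keeping entries from both sides.
    fn unify(self, other: Block) -> Block {
        Block {
            references: self.references.union(other.references),
            aliases: self.aliases.union(other.aliases),
        }
    }
}

fn main() {
    // Simulate many blocks, each derived from its predecessor's state.
    let mut unified = Block::default();
    let mut per_block_state = Vec::new();
    for b in 0..1_000u32 {
        let mut block = unified.clone();
        for i in 0..10u32 {
            block.references.insert(b * 10 + i, i);
            block.aliases.insert(b * 10 + i, b);
        }
        unified = block.clone().unify(unified);
        per_block_state.push(block); // analogous to holding a Block per basic block
    }
    println!("entries kept in the unified state: {}", unified.references.len());
}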

jfecher (Contributor) commented Jan 13, 2025

The various maps in Block have been a known memory issue in the past. They were changed from hashmaps to OrdMaps since those used less memory in some earlier tests. I think further improvement will require more than just a container change, e.g. sacrificing optimizations by arbitrarily removing known values in a Block after some limit, or rewriting mem2reg more thoroughly to use a different algorithm.

Edit: perhaps an easier change would be to add a check to drop any blocks we don't need any more (blocks whose successors are all finished as well).
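
A minimal sketch of that "drop finished blocks" idea, assuming a per-function map of cached block states; the PerFunctionState type and finish method here are hypothetical, loosely analogous to the PerFunctionContext blocks field mentioned above:

use std::collections::{BTreeMap, BTreeSet};

type BlockId = u32;

struct PerFunctionState<B> {
    blocks: BTreeMap<BlockId, B>,                // cached end-of-block state
    successors: BTreeMap<BlockId, Vec<BlockId>>, // CFG edges
    finished: BTreeSet<BlockId>,                 // blocks already processed
}

impl<B> PerFunctionState<B> {
    // Mark a block finished, then drop any block whose successors are all
    // finished too, since nothing will query its cached state again.
    fn finish(&mut self, block: BlockId) {
        self.finished.insert(block);
        let droppable: Vec<BlockId> = self
            .blocks
            .keys()
            .copied()
            .filter(|b| {
                self.finished.contains(b)
                    && self
                        .successors
                        .get(b)
                        .map_or(true, |succs| succs.iter().all(|s| self.finished.contains(s)))
            })
            .collect();
        for b in droppable {
            self.blocks.remove(&b); // frees that block's maps early
        }
    }
}

fn main() {
    let mut state = PerFunctionState {
        blocks: BTreeMap::from([(0, "entry state"), (1, "then state"), (2, "exit state")]),
        successors: BTreeMap::from([(0, vec![1]), (1, vec![2]), (2, vec![])]),
        finished: BTreeSet::new(),
    };
    for b in [0, 1, 2] {
        state.finish(b);
    }
    assert!(state.blocks.is_empty());
}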

aakoshh (Contributor, Author) commented Jan 13, 2025

With a bit of refactoring, I could at least narrow it down to the maintenance of aliases, rather than to the other fields that use OrdMap:
[Image: allocation profile pointing at alias maintenance]
