
Increase in memory used to compile rollup-base-public #7001

Open · aakoshh opened this issue Jan 9, 2025 · 5 comments
Labels: bug (Something isn't working)

aakoshh (Contributor) commented Jan 9, 2025

Aim

This is a follow-up for #6972, which added a new mem2reg pass. As a consequence, we saw a 100% increase in the memory used during the compilation of one of the protocol circuits in aztec-packages.

Expected Behavior

Didn't expect a significant increase in memory usage.

Bug

#6972 (comment)

To Reproduce

See how CI does it.

Workaround

None

Workaround Description

No response

Additional Context

No response

Project Impact

None

Blocker Context

No response

Nargo Version

nargo version = 1.0.0-beta.1 noirc version = 1.0.0-beta.1+bb8dd5ce43f0d89e393bd49f8415008826903652 (git version hash: 13b5871, is dirty: false)

NoirJS Version

No response

Proving Backend Tooling & Version

No response

Would you like to submit a PR for this Issue?

None

Support Needs

No response

TomAFrench (Member) commented:

I think a contributing factor to this is how early we perform the inlining pass. Looking at the ordering of passes:

.run_pass(Ssa::remove_unreachable_functions, "Removing Unreachable Functions")
.run_pass(Ssa::defunctionalize, "Defunctionalization")
.run_pass(Ssa::remove_paired_rc, "Removing Paired rc_inc & rc_decs")
.run_pass(|ssa| ssa.inline_functions(options.inliner_aggressiveness), "Inlining (1st)")
// Run mem2reg with the CFG separated into blocks
.run_pass(Ssa::mem2reg, "Mem2Reg (1st)")
.run_pass(Ssa::simplify_cfg, "Simplifying (1st)")
.run_pass(Ssa::as_slice_optimization, "`as_slice` optimization")
.run_pass(Ssa::remove_unreachable_functions, "Removing Unreachable Functions")
.try_run_pass(
    Ssa::evaluate_static_assert_and_assert_constant,
    "`static_assert` and `assert_constant`",
)?
.run_pass(Ssa::loop_invariant_code_motion, "Loop Invariant Code Motion")
.try_run_pass(
    |ssa| ssa.unroll_loops_iteratively(options.max_bytecode_increase_percent),
    "Unrolling",
)?
.run_pass(Ssa::simplify_cfg, "Simplifying (2nd)")
.run_pass(Ssa::mem2reg, "Mem2Reg (2nd)")
.run_pass(Ssa::flatten_cfg, "Flattening")
.run_pass(Ssa::remove_bit_shifts, "Removing Bit Shifts")
// Run mem2reg once more with the flattened CFG to catch any remaining loads/stores
.run_pass(Ssa::mem2reg, "Mem2Reg (3rd)")
// Run the inlining pass again to handle functions with `InlineType::NoPredicates`.
// Before flattening is run, we treat functions marked with the `InlineType::NoPredicates` as an entry point.
// This pass must come immediately following `mem2reg` as the succeeding passes
// may create an SSA which inlining fails to handle.
.run_pass(
    |ssa| ssa.inline_functions_with_no_predicates(options.inliner_aggressiveness),
    "Inlining (2nd)",
)
.run_pass(Ssa::remove_if_else, "Remove IfElse")
.run_pass(Ssa::fold_constants, "Constant Folding")
.run_pass(Ssa::remove_enable_side_effects, "EnableSideEffectsIf removal")
.run_pass(Ssa::fold_constants_using_constraints, "Constraint Folding")
.run_pass(Ssa::dead_instruction_elimination, "Dead Instruction Elimination (1st)")
.run_pass(Ssa::simplify_cfg, "Simplifying:")
.run_pass(Ssa::array_set_optimization, "Array Set Optimizations")

You can see that we pretty much immediately inline all of the functions into the entry point function. This means that if we've got a function which is used in multiple places, we end up running all of the later passes over N copies of it, rather than fully simplifying the function on its own before we inline it.

Note that we can't just run all of the various passes on every function before inlining. At the least we'd need to tolerate loop unrolling failing, since loop bounds may come from function arguments, and some thought would also need to go into flattening (maybe?).
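
To make the N-copies cost concrete, here is a toy model (plain Rust, not the Noir compiler API; Function, inline_into and simplify are made-up stand-ins) comparing how much work a later simplification pass has to do when we inline before simplifying versus simplifying the callee once before inlining:

// Toy model: inlining duplicates the callee body into every call site, so a
// pass that runs after inlining visits the callee once per call site.

#[derive(Clone)]
struct Function {
    body: Vec<String>, // stand-in for SSA instructions
}

fn inline_into(entry: &mut Function, callee: &Function, call_sites: usize) {
    // Inlining copies the callee body once per call site.
    for _ in 0..call_sites {
        entry.body.extend(callee.body.iter().cloned());
    }
}

fn simplify(f: &mut Function) -> usize {
    // Pretend simplification halves the function and report how many
    // instructions it had to visit.
    let visited = f.body.len();
    let keep = f.body.len() / 2;
    f.body.truncate(keep);
    visited
}

fn main() {
    let call_sites = 10;
    let callee = Function { body: vec!["inst".to_string(); 1_000] };

    // Inline first, simplify later: the simplifier sees the callee body
    // once per call site (10,000 instructions here).
    let mut entry_a = Function { body: Vec::new() };
    inline_into(&mut entry_a, &callee, call_sites);
    println!("inline-then-simplify visited {} instructions", simplify(&mut entry_a));

    // Simplify the callee once, then inline the already-smaller body
    // (5,000 instructions visited in the entry point).
    let mut callee_b = callee.clone();
    simplify(&mut callee_b);
    let mut entry_b = Function { body: Vec::new() };
    inline_into(&mut entry_b, &callee_b, call_sites);
    println!("simplify-then-inline visited {} instructions", simplify(&mut entry_b));
}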

TomAFrench (Member) commented:

[Image: memory flamegraph showing that the blocks field of the PerFunctionContext within mem2reg is holding all the memory.]

aakoshh (Contributor, Author) commented Jan 13, 2025

This shows the heaviest stack trace in terms of memory allocations:
[Image: heaviest allocation stack trace]

Like in the flamegraph above, it is inside Block::unify, and it points at the im::OrdMap as the culprit, which is the data structure backing all of the fields in Block.
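
For illustration, here is a minimal, hypothetical sketch of that shape (the field names and the unify logic are stand-ins, not the actual mem2reg code): each block carries persistent im::OrdMap state, clones are cheap thanks to structural sharing, but keeping a Block per basic block keeps every version of those maps alive, and unifying predecessor states allocates yet more map nodes:

use im::OrdMap;

type ValueId = u32;

#[derive(Clone, Default)]
struct Block {
    // Stand-in for "known value behind each reference".
    references: OrdMap<ValueId, ValueId>,
    // Stand-in for alias tracking; the profiling below narrows the cost to this.
    aliases: OrdMap<ValueId, ValueId>,
}

impl Block {
    // Roughly the shape of a unify: merge the state of two predecessors,
    // keeping entries from both sides.
    fn unify(self, other: Block) -> Block {
        Block {
            references: self.references.union(other.references),
            aliases: self.aliases.union(other.aliases),
        }
    }
}

fn main() {
    // Simulate many blocks, each derived from its predecessor's state.
    let mut unified = Block::default();
    let mut per_block_state = Vec::new();
    for b in 0..1_000u32 {
        let mut block = unified.clone();
        for i in 0..10u32 {
            block.references.insert(b * 10 + i, i);
            block.aliases.insert(b * 10 + i, b);
        }
        unified = block.clone().unify(unified);
        per_block_state.push(block); // analogous to holding a Block per basic block
    }
    println!("entries kept in the unified state: {}", unified.references.len());
}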

jfecher (Contributor) commented Jan 13, 2025

The various maps in Block have been a known memory issue in the past. They were changed from hashmaps to OrdMaps since those used less memory in some earlier tests. I think further improvement will require more than just a container change, e.g. sacrificing optimizations by arbitrarily removing known values in a Block after some limit, or rewriting mem2reg more thoroughly to use a different algorithm.

Edit: perhaps an easier change would be to add a check to drop any blocks we don't need any more (blocks whose successors are all finished as well).
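
A minimal sketch of that "drop finished blocks" idea, assuming a per-function map of cached block states; the PerFunctionState type and finish method here are hypothetical, loosely analogous to the PerFunctionContext blocks field mentioned above:

use std::collections::{BTreeMap, BTreeSet};

type BlockId = u32;

struct PerFunctionState<B> {
    blocks: BTreeMap<BlockId, B>,                // cached end-of-block state
    successors: BTreeMap<BlockId, Vec<BlockId>>, // CFG edges
    finished: BTreeSet<BlockId>,                 // blocks already processed
}

impl<B> PerFunctionState<B> {
    // Mark a block finished, then drop any block whose successors are all
    // finished too, since nothing will query its cached state again.
    fn finish(&mut self, block: BlockId) {
        self.finished.insert(block);
        let droppable: Vec<BlockId> = self
            .blocks
            .keys()
            .copied()
            .filter(|b| {
                self.finished.contains(b)
                    && self
                        .successors
                        .get(b)
                        .map_or(true, |succs| succs.iter().all(|s| self.finished.contains(s)))
            })
            .collect();
        for b in droppable {
            self.blocks.remove(&b); // frees that block's maps early
        }
    }
}

fn main() {
    let mut state = PerFunctionState {
        blocks: BTreeMap::from([(0, "entry state"), (1, "then state"), (2, "exit state")]),
        successors: BTreeMap::from([(0, vec![1]), (1, vec![2]), (2, vec![])]),
        finished: BTreeSet::new(),
    };
    for b in [0, 1, 2] {
        state.finish(b);
    }
    assert!(state.blocks.is_empty());
}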

aakoshh (Contributor, Author) commented Jan 13, 2025

With a bit of refactoring, I could at least narrow it down to the maintenance of aliases, rather than to the other fields that use OrdMap:
[Image: allocation profile pointing at alias maintenance]
