Noir not optimizing out increments to mutable variable in Brillig #6971
This looks like a known downside of the current mem2reg pass in handling loops. See the early comments on #4535 as an example. The short version is that, because of the back edges in the control flow graph, we're unsure whether the reference may be mutated at that point, so we can't optimize it.
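To make the failure mode concrete, here is a rough Rust sketch of the kind of merge such a pass performs when a block has more than one predecessor (the names and types are illustrative, not the compiler's actual data structures):

```rust
// Illustrative only: a per-reference lattice value and the merge used
// when control flow joins, e.g. at a loop header.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum RefValue {
    Unknown,
    Known(u64), // id of the last value stored to the reference
}

fn unify(a: RefValue, b: RefValue) -> RefValue {
    // Keep the value only if every predecessor agrees on it; otherwise we
    // must conservatively forget it.
    if a == b { a } else { RefValue::Unknown }
}

fn main() {
    let before_loop = RefValue::Known(42); // store seen before entering the loop
    let from_back_edge = RefValue::Unknown; // loop body not analyzed yet on a single forward scan
    assert_eq!(unify(before_loop, from_back_edge), RefValue::Unknown);
}
```

At a loop header one predecessor is the back edge, whose value hasn't been computed yet on a single forward scan, so the merged result degrades to Unknown and the later load can't be folded.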
Couldn't the previously set value be tracked, so that if we have to load a value to add to it, we'd already know it?
Some other things I'm finding. If we don't use the first loop (it's commented out below):

```
global MAX: u32 = 2;
global MIN: u32 = 1;

fn main(input: [u8; MIN]) -> pub [u8; MAX] {
    let mut output = [0; MAX];
    let mut offset = 0;
    // for i in 0..MIN {
    //     output[offset] = input[i];
    // }
    offset += MIN;
    let size = MAX - offset;
    for _ in 0..size {
        output[offset] = 0;
    }
    output
}
```

then it's correctly optimized.

Also, if we print `offset`:

```
global MAX: u32 = 2;
global MIN: u32 = 1;

fn main(input: [u8; MIN]) -> pub [u8; MAX] {
    let mut output = [0; MAX];
    let mut offset = 0;
    for i in 0..MIN {
        output[offset] = input[i];
    }
    offset += MIN;
    println(offset);
    output
}
```

then the SSA has the
I'll keep writing down the things I find; I'm trying to see if I can improve this. For this program:

```
global MAX: u32 = 2;
global MIN: u32 = 1;

fn main(input: [u8; MIN]) -> pub [u8; MAX] {
    let mut output = [0; MAX];
    let mut offset: u32 = 0;
    for i in 0..MIN {
        output[offset] = input[i];
    }
    offset += MIN;
    let size = MAX - offset;
    for i in 0..size {
        output[offset + i] = 0;
    }
    output
}
```

at one point (with
We can see that these:
were replaced by this:
However, a bit below there's still this:
I don't know why that load wasn't replaced with the constant "1". Maybe it's because
@jfecher I was wondering if mem2reg could be optimized in the following way:
For example, given this SSA:
when computing the value of

Given this code, though:

When computing the value of

I actually started to code this to see if it works, but
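One way to read this idea, sketched in Rust with hypothetical block/value types rather than the real SSA structures: when computing a load, walk back through single-predecessor blocks looking for the most recent store to that address, and give up as soon as the chain branches or a back edge is hit.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins for SSA blocks, values and addresses.
type BlockId = usize;
type ValueId = usize;
type Address = usize;

struct Block {
    predecessors: Vec<BlockId>,
    // Last value stored to each address inside this block, if any.
    last_store: HashMap<Address, ValueId>,
}

fn known_value_at_entry(
    blocks: &HashMap<BlockId, Block>,
    mut block: BlockId,
    address: Address,
) -> Option<ValueId> {
    let mut visited = HashSet::new();
    loop {
        if !visited.insert(block) {
            return None; // back edge: we went around a loop, give up
        }
        let b = &blocks[&block];
        // More than one predecessor: the store could be hidden behind
        // either path, so be conservative.
        if b.predecessors.len() != 1 {
            return None;
        }
        let pred = b.predecessors[0];
        if let Some(value) = blocks[&pred].last_store.get(&address) {
            return Some(*value); // found the most recent store
        }
        block = pred; // keep walking up the single-predecessor chain
    }
}

fn main() {
    // b0 stores value 7 to address 0; b1's only predecessor is b0.
    let mut blocks = HashMap::new();
    blocks.insert(0, Block { predecessors: vec![], last_store: HashMap::from([(0, 7)]) });
    blocks.insert(1, Block { predecessors: vec![0], last_store: HashMap::new() });
    assert_eq!(known_value_at_entry(&blocks, 1, 0), Some(7));
}
```

The branch and back-edge bail-outs correspond to the cases discussed in the reply below.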
@asterite it depends on your "(and its predecessors also don't have stores to that address)" qualification. If by that you mean only its direct predecessors, then it could be broken by adding more blocks so that the store is more hidden. If the pass doesn't find the store, it'll assume the other value, which ignores the store (but shouldn't). If by that you mean recursively checking all predecessors (for every reference?), then I think it almost works but gets caught in the weeds. Where it gets caught is: how do you know the values for each reference in

You mentioned doing a scanning pass beforehand, but that would just move the same question to that scanning pass. As long as you're only doing one scanning pass you'll inherently have this issue, I think.

My current thinking is that we'll probably have to move to something more similar to other existing mem2reg passes and actually do multiple scans through the function. Those passes work by scanning the entire function multiple times until the results become stable (stop changing after each scan). In our case we'd expect loops to initially start with Unknown values, but after the second pass they would become Known.

For our current algorithm, though, I think this new way would have performance implications. After the second pass the value becomes known... but what if there is just another loop afterward? Its starting value would be known, so it is the same situation as we started with, and we'd need another pass for each subsequent loop. For a program with possibly many loops we'd be iterating at least once per loop if we really wanted to run this until the results were stable. One option from here would be to statically limit this to a maximum of, say, 5 iterations (arbitrary) and accept that the generated code could be optimized more.

Edit: If we don't go with a new mem2reg algorithm, then to solve this with (mostly) our current algorithm I think we'll need some way to let mem2reg know "v1 or its aliases do not change in this loop (set of blocks)" ahead of time, so that we can use that in the unify check. If we're finding starting values and are between
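A minimal sketch, in Rust with placeholder types, of the multi-pass approach described above (rescan the function until the results stop changing, with an arbitrary cap on the number of passes):

```rust
// Placeholder types; not the real pass. The point is the control flow:
// repeat full scans until stable, but never more than MAX_PASSES times.
const MAX_PASSES: usize = 5; // arbitrary limit, as suggested above

#[derive(Clone, PartialEq, Eq, Default)]
struct AnalysisResults {
    // Per-block, per-reference known values would live here.
    summary: Vec<Option<u64>>,
}

fn scan_function(previous: &AnalysisResults) -> AnalysisResults {
    // Stand-in for one full scan over the function's blocks, seeding loop
    // headers with the previous pass's results instead of Unknown.
    previous.clone()
}

fn run_to_fixed_point(initial: AnalysisResults) -> AnalysisResults {
    let mut current = initial;
    for _ in 0..MAX_PASSES {
        let next = scan_function(&current);
        if next == current {
            break; // results are stable, we're done
        }
        current = next;
    }
    // If the cap was hit, `current` is still sound, just less precise.
    current
}

fn main() {
    let result = run_to_fixed_point(AnalysisResults::default());
    assert!(result.summary.is_empty());
}
```

Capping the pass count trades the per-loop extra iterations described above for possibly leaving some loads unoptimized.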
Aim
We got this reduction with @aakoshh while debugging an increase in opcodes in some program.
Consider this code:
The final SSA, when compiled with `--force-brillig`, is this:

Now, if we change `offset += MIN` to `offset = MIN`, we get this SSA:
Given that tracking an `offset` value, mutating it, and using it to modify arrays is a very common operation, this could be a good optimization to try... or first understand why the compiler isn't already optimizing this.

Expected Behavior
The compiler should produce the second final SSA in both cases.
Bug
The compiler isn't optimizing the program as much as it could.
Relatedly, if the code is this:
then the program is correctly optimized... so there might be more things going on here.
To Reproduce
Workaround
None
Workaround Description
No response
Additional Context
No response
Project Impact
None
Blocker Context
No response
Nargo Version
No response
NoirJS Version
No response
Proving Backend Tooling & Version
No response
Would you like to submit a PR for this Issue?
None
Support Needs
No response