-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor alias analysis (WIP) #819
base: scala-2
Are you sure you want to change the base?
Conversation
…targetsOn. Assert that the latter never produces invalid selections.
… understanding. More renaming.
// NOTE(gsps): I believe this check is redundant except for the cases where we try to | ||
// select class fields on expressions whose type is an abstract type def. | ||
// We have a test case (extraction/valid/TypeMembers2.scala) in which we expand an | ||
// abstract base class' method to a dispatcher, and such type defs appear. Simply omitting | ||
// the effect by returning the empty set in such cases actually seems an odd choice to me, | ||
// though I can't see how to exploit it for unsoundness. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This rather sounds like a bug to me. I'm not sure I understand why we get the None
case here, though. Shouldn't asClassType
resolve the type def to a real class?
@romac: I think you wrote this code, why are we using an Option
in rec? Shouldn't we just crash if we find a class selector on a non-class type (after abstract type resolution)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's a bit complicated, I spent a couple of hours grokking all this yesterday. Basically, the abstract method in the test case gets expanded to something like
// Abstract method c on M
def c(thiss: M, t: thiss.T): Unit = {
if (thiss.isInstanceOf[F]) c_(thiss.asInstanceOf[F], t)
else ...
}
// Concrete method c on F
def c_(thiss: F, t_: thiss.T): Unit = { t_.x += 2 }
At the call site of c_
we try to relate some effects on t_
(the parameter of c_
) to t
(the argument being passed here). This involves taking the effect on t_.x
from c_
and "rebasing" it on t
. Since we aren't flow-sensitive, the type of t
in c
is still an abstract thiss.T
, though, so the sanity check fails here.
Note that initially I thought none of this might be a problem and we could just call computeTargets
anyways, but it turns out that we reject it later (with a MalformedStainlessCode exception, I believe). I forgot where exactly where it fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure I understand, at the call site in test
do we see a call to c
or c_
? I would expect the Scala compiler to know that we're calling c_
here since the receiver type is concrete.
If it's a call to c_
, then we shouldn't need any flow analysis to determine that thiss.T
has a concrete override since thiss
should have type F
. What am I missing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I wasn't very clear here, I was just talking about the snippet I had posted. You can completely ignore test
, the issue also occurs without it.
It's a call to c_
, but a synthetic one that we introduced when expanding the abstract c
to something that model virtual dispatch. So there's no chance for scalac
to infer anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahhh, I see.
I wonder if we should be injecting a cast on t
in the synthetic dispatch then. Something like t.asInstanceOf[thiss.asInstanceOf[F].T]
? I assume this would allow type parameter resolution.
Is the type check even valid without that cast?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't we just inline the non-recursive calls until we see a t.asInstanceOf[thiss.asInstanceOf[F].T]
receiver?
Actually, in which situations are we calling targetsOn
? I would have thought that it wouldn't be invoked at all without the test
method in the original benchmark. Which effect are we trying to resolve when it's absent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We call it only in the situation mentioned in the second comment above (see the part of comment that goes "At the call site of c_ we try to ..."). There's no inlining happening there (this is in computeExprEffects
rather than computeTargets
) -- but perhaps you are suggesting to add inlining here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you talking about this line?
If I understand the code correctly, we enter that case when we've found a concrete effect that occurs in the body of the callee (in this case, c_
has an effect on t_.x
). We then try to resolve this effect on the provided argument (i.e. thiss
).
It seems to me that if we find an effect for the function parameter, we should correctly encode all the information we need to resolve it on the provided argument. This might include a path condition and possibly some casts if necessary. I thought we had support for this anyway? Don't we need those to support effects on this
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, we do (minus the casts): that's essentially Target
. The existing code mostly projects these down to Effect
s, which drop the path condition, however. As far as I can tell, the current situation is that whenever we deal with a function invocation we take these as an approximation of the function's effects, i.e., we disregard the path conditions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I think the rationale is that updating the aliased variable is fine even when the path condition doesn't hold (the update will just be a noop). This is probably necessary to deal with recursion. There is some support for inferring a minimal path condition while updating ADT fields when we might be given a different constructor instance: see mapApplication.
That code is fairly horrible, though... and I'm not sure it will translate well to type defs where the instance-of needs to be applied to a different argument (thiss
instead of t
).
@@ -336,6 +367,9 @@ trait EffectsAnalyzer extends oo.CachingPhase { | |||
And(Not(cnd).setPos(cnd), e.setPos(cnd)).setPos(cnd) | |||
} getOrElse(Not(cnd).setPos(cnd)) | |||
|
|||
// FIXME: This seems wrong: why are we ignoring t.condition? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@romac: bug?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it should likely be conjoined with cnd
.
// This function provides an under-approximation in some cases; in particular, it might return | ||
// an empty set for certain "non-local" constructs like function invocations. These seem to be | ||
// handled specially elsewhere. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In Regis' approach to aliasing, function invocations never return an aliased value (it is always fresh). We relaxed this to allow aliasing in non-recursive functions, but left is disabled in recursive ones.
I thought we used to check this constraint in the EffectsChecker
, but I can't seem to find the corresponding code. @romac any idea where it lives?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose you're referring to this? The recursive case will fall through to return Set.empty
. It's a bit hard to reason about whether we always deal with the empty set correctly, since computeTargets
(i.e. formerly getEffects
) gets used in a few places and there are also the more and less strict versions computeExactTargets
and computeKnownTargets
. The strict version (which turns an empty set into an exception) is used with array updates, mutable map updates and field assignments in AntiAliasing. Does that cover what you had in mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea is that an empty set means nothing will be mutated. Since we assume that function invocation results are always fresh (i.e. contain no aliased values), we can safely return an empty set of targets since mutating a function invocation result won't affect any state. I do agree that computeExactTargets
and computeKnownTargets
are quite confusing, it would be worth cleaning those up. I'm not sure we actually need them.
What I meant about the EffectsChecker
constraint is that although we assume here that function results are always fresh, we don't seem to actually enforce this anywhere. I seem to recall we had this check at some point but I can't find it anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, that's a very useful remark.
Looking at a concrete case where the empty set case is differentiated from the non-empty case, I'm still a bit puzzled though. That particular piece of code translates Let
s during AntiAliasing. If we know the exact effects of the let-binding's value, we can apparently conclude that that value won't change any further -- that is, we translate it as a Let and simply rewrite all occurrences of the binding by the (translated) value expression.
On the other hand, if we find the set of effects to be empty (or, implicitly, if an error occurred), we translate the binding as a LetVar and allow it to be mutated later on. That's somewhat surprising to me. What kind of mutation could be applied to that binding that would be detected by computeTargets
, but wouldn't already have been encoded in the (translated) value expression of the binding?
(I realize I might be asking a very implementation-specific question here, so don't feel obliged to do any digging, if it's not obvious to you either.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the idea is that an empty set means nothing will be mutated.
Without qualification, that's not really true, though, because function invocations might absolutely mutate their arguments, but we return the empty set for non-trivial (e.g. recursive) function invocations.
I think what's really meant by "effect" in this phase is "observable effects on the environment". That is,
a) we ignore effects on things that can't be referred to from the current scope (through some series of selections), and
b) we also ignore effects on things that are in scope, but have been consumed by some operations (chiefly, function invocations -- not sure if there are others). (Through the lens of a linear type system, these things arguably weren't in scope any longer, making this a special case of a).)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at a concrete case where the empty set case is differentiated from the non-empty case, I'm still a bit puzzled though.
For let-bindings with mutable type, we're dealing with the following two cases:
- We can compute the exact alias on the rhs of the let-binding. In that case, we can support this simple form of aliasing by replacing all occurrences of the bound variable with the set of computed targets.
- The rhs of the let-binding is fresh, which is (possibly wrongly) captured by
getKnownEffects(lhs).isEmpty
. In this case, we can replace the let-binding by a let-var and rewrite effects on parts of the binding to apply to the let-var itself (e.g.x.a.b = 1
will be rewritten tox = A(B(1))
).
It seems to me that the case where we couldn't compute exact aliasing information and the rhs is not fresh isn't handled here. Maybe that case was ruled out in one of the previous checks? I agree it would make sense to improve the computeTargets
function to have a clearer API. I'm not sure why ADT(...)
is considered malformed (when the path head doesn't match some field) but not Array(...)
. Ideally, the computeTargets
function would correctly capture the aliasing/freshness/unsupported state of its input.
I think what's really meant by "effect" in this phase is "observable effects on the environment".
Ah, I wasn't clear with what I meant by "nothing will be mutated". I meant that given some expression 'E[t]' where 'E[.]' applies some mutation on 't', if computeTargets(t)
is empty, then we can ignore the result of 't' when computing global and local mutations. The observable effects should be equivalent to 'E[deep_copy(t)]'.
You can't ignore 'E[.]' or 't' when computing the actual effects, but you can ignore the effect of 'E[.]' on 't'.
(This might be the same thing you're saying :) getting a good mental image of effects is always a bit difficult.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Sorry for the wall of text. Been appending to this since yesterday night.)
The rhs of the let-binding is fresh, which is (possibly wrongly) captured by getKnownEffects(lhs).isEmpty.
We also have isExpressionFresh
in EffectChecker
which seems to be another way of determining fresh-ness. Not sure if that one would be more appropriate.
I agree it would make sense to improve the computeTargets function to have a clearer API. (...)
I've been toying with some different representations for the result of computeTargets
for the last two days. The way I now think about computeTargets(expr)
is that it should compute a mapping effAt : Expr => Eff
from path conditions to Eff
, where Eff
can be one of three things:
- "no effect" (when there is in fact no effect under that path condition)
- "unobservable effect" (when there is an effect, but it occurs on a fresh value)
- "observable effect on receiver#path" (basically an instance of what is called
Effect
now).
The way it is actually implemented right now only keeps track of the latter (cumulatively represented as a Set[Target]
), and conflates everything else into Set.empty
(or raises an exception, if it encounters unsupported code).
I'm not completely sure whether it's worth explicitly representing the first two cases. But looking through the various cases where we combine targets, I noted that the behavior of computeTargets
is slightly dubious:
- For
If
s we seem to end up with the empty set, if either of the then or the else branch yields the empty set. - For
Let
s we end up with the empty set, if either of the binding's value or the body yields the empty set. ForLet
s, this immediately leads to an exception, though.
So, to summarize the current situation: Even without computeKnownTargets
(which turns exceptions into the empty set), computeTargets
might return the empty set as some sort of "unknown" sentinel value, and definitely does not just indicate "fresh" values.
I think the only reason why this doesn't break lots of tests is that we usually go through computeExprEffects
, which actually handles these cases (e.g., If
s) conservatively (taking the union of all effects). Also, in AntiAliasing
, we only use targets in two other ways: computeExactTargets
to find out which targets are affected by a field assignment, array-, or map-update. This usage is safe, since it actually treats empty sets as an error. computeKnownTargets
is only used when rewriting Let
bindings, as we discussed earlier. So the addendum to your above explanation is that in case 2.) the value might in fact not be fresh.
Speaking of computeExprEffects
, there's also some strange behavior:
- For
Let
s we seem to randomly pick an effect from the binding's value when extending the environment, because(effect(e, env): Set[Effect]).map(vd.toVariable -> _)
will create key value pairs with the same key, so only the last one will survive.
Now based on these insights, I managed to find some new unsound examples:
- In the first example, function
f
below, the behavior ofcomputeTargets
onIf
lets us create bindings that are ostensibly effect-free. We can incorrectly assert thatf
is pure (andAntiAliasing
will not rewrite the function to return an updatedb
). - In the other two examples, functions
g
andh
below, lead to crashes duringImperativeCodeElimination
, because we try to Assign to bindings that are Lets, not LetVars. The underlying issue has to do with both the handling ofLet
inAntiAliasing
and with the bug incomputeExprEffects
mentioned above, I believe.
import stainless.annotation.pure
case class Box(var length: Int)
object Foo {
def makeBox(): Box = Box(0)
// Passes and omits effect on b
// (Notice that this function is in fact *not* pure!)
@pure def f(b: Box): Unit = {
val c = // => Set.empty!
if (true) b // Set(Target(b, ...))
else makeBox() // Set.empty
c.length = 1 // Just affects c locally, and leaves b unchanged
}
def mutate(b: Box): Unit = { b.length = 123; }
// Crash in ImperativeCodeElimination and "duplicates" effect on both b1 and b2
def g(b1: Box, b2: Box): Unit = {
val c = if (true) b1 else b2
mutate(c)
// The above call leads AntiAliasing to produce nonsensical conditions for mutating b1 and b2:
// > var res: (Unit, Box) = mutate(if (true) b1 else b2)
// > (if (res._2.isInstanceOf[Box]) {
// > b1 = @unchecked Box(@unchecked res._2.asInstanceOf[Box].length).asInstanceOf[Box]
// > } else {
// > ()
// > })
// > (if (res._2.isInstanceOf[Box]) {
// > b2 = @unchecked Box(@unchecked res._2.asInstanceOf[Box].length).asInstanceOf[Box]
// > } else {
// > ()
// > })
}
// Crash in ImperativeCodeElimination and omits effect on b1/b2
def h(b1: Box, b2: Box): Unit = {
val c = if (true) b1 else b2
val d = c // Interacts with Let handling `computeExprEffects`
// We try to assign to val c (and nothing else!)
mutate(d)
}
}
I had forgotten about this PR, I guess it has many useful changes that could be included? |
No description provided.