[WIP] Stop materializing placeholder liveness at all points #157
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
To align the code with the datalog rules, and clean up our
lazinesslove for simplicity, we'd have to remove the workaround where we materialized liveness data for the placeholder origins to be live at all points, where we historically had more complex rules in our soufflé prototypes, involving;
for every join ensuring an origin's liveness.Doing that with datafrog is painful and verbose of course, but the point of opening this draft PR is for discussion during the next sprint. And see whether the cost is worth the gain.
While the
LocationInsensitive
andNaive
uses are mostly straightforward cases of switching fromorigin_live_on_entry(Origin1, TargetNode)
predicates to(origin_live_on_entry(Origin1, TargetNode); placeholder(Origin1, _))
predicates by expanding the whole rule for each of the alternatives, theOpt
variant contains more subtle cases (in addition to these simple but verbose ones).Since the variant is trying to limit the subset TC to origins dying along an edge, detecting when liveness ends is required, and relies on many instances of
!origin_live_on_entry(Origin1, TargetNode)
antijoins. In some rules, there seems to be nothing required for these antijoins to support the now-missing materialization: some rules which were not applicable to placeholders (whether because of fact generation, or earlier-rules filtering them out) when they were present inorigin_live_on_entry
, still aren't.For example, Opt rule 14:
No loans are issued in placeholders, this relation is the same whether they're present in
origin_live_on_entry
or not.For others, like Opt rule 3 which uses both polarities, it doesn't seem to be that simple:
If
origin2
is a placeholder, the antijoin would have previously eliminated it fromlive_to_dying_regions
, but now it seems we'd need to add another predicate like so:otherwise
origin2
would be considereddying
here !And the translation rules would be:
origin_live_on_entry
had an implicit union with a join to theplaceholder
relation, which now needs to be made explicit in the rules with an alternative predicate. That mechanically expands to the combinations of all alternatives (some rules like Opt rule 10 have 2 such alternative predicates, and therefore 4 datafrog expansions, for a whopping 62 line piece of code) (these expansions are done in this PR, in all 3 variants, and were all done using leapjoins with either a "filter" or "extend" leaper, depending on the WF-ness of the resulting leapjoin. Note that this is operationally very similar sinceplaceholder_origin
is a single-value relation, there's only a single unit tupleextend_with
leapers can find for the join; veryfilter_with
-like)origin_live_on_entry
had an implicit antijoin with theplaceholder
relation, which now needs to be made explicit in the rules, with a new antijoin predicate. That mechanically expands to either a new leaper when leapjoins are used, or another intermediate relation when regular joins are used (neither of which is done in this PR yet).The rustc UI test suite seems to behave similarly with and without this PR, so there may not be code which can exercise these subtleties, but at least we can talk about the theoretical aspects.
Does this seem correct to you all ? What do you think of the complexity this adds, versus the (hard to quantify, but I haven't really tried to do so for reals yet) gains ? (Maybe a macro based approach would be acceptable, or maybe more work in datapond to handle leapjoins and the most complicated rules could hide all of this under a proc macro veneer of simplicity 🌠 ).
For easier review/analysis, there's a commit for each variant (and from there one can go see the complete resulting file to have an idea of the damage):
LocationInsensitive
d8cfac9Naive
26a3966Opt
213c996Let's discuss !