Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Erratic stack overflow in infer.rs 'live_vars' #633

Closed
acl-cqc opened this issue Oct 30, 2023 · 5 comments · Fixed by #638
Closed

Erratic stack overflow in infer.rs 'live_vars' #633

acl-cqc opened this issue Oct 30, 2023 · 5 comments · Fixed by #638
Assignees
Labels
bug Something isn't working

Comments

@acl-cqc
Copy link
Contributor

acl-cqc commented Oct 30, 2023

  • See branch tests/infer_stack_overflow for the test (not very minimal, sorry - I haven't tried to reduce)
  • Commit 6d4b446 precisely identifies the place where the HashSet's iteration order determines the outcome, and makes the test panic every time (without affecting any other test - obviously this hasn't come up before)
  • The problem seems to be that merge_equal_metas has found a set of equal variables where one also has a Plus constraint on another. This results in the merged variable effectively having a plus constraint on itself!
  • I also tried the fix of fix(Inference): Work harder in variable instantiation #591 (see branch tests/infer_stack_overflow2) , but although that PR adds code to deal with cycles, it doesn't fix the bug. I think the problem is that live_var (where the infinite-recursion happens) is called from results before we get to instantiating variables. This suggests that we may need to deal with cycles of Plus nodes (as fix(Inference): Work harder in variable instantiation #591 does) earlier, i.e. in main_loop rather than instantiate_variables ??
@acl-cqc acl-cqc added the bug Something isn't working label Oct 30, 2023
@croyzor
Copy link
Contributor

croyzor commented Oct 31, 2023

😱

@acl-cqc
Copy link
Contributor Author

acl-cqc commented Nov 1, 2023

Ok, so a sufficient condition for creating a Meta that has a Plus on itself (the source of the infinite loop) appears to be to have nodes A,B,C; and edges A->B, A->C, B->C.

  • Clearly if B's extension_reqs are not already in A's output-extensions set, then a lift node will be needed on the edge A -> C
  • However, if B's extension_reqs are present in A's output-extensions set, no lift node is really required: you wouldn't need one if you specified all the input_extensions yourself. But if you don't do this, this might hang the algorithm. (I've not established what conditions there might be beyond a Plus constraint from a meta to itself)....

In this case the node A was a Const used in two places. "Fixes" to the test include:

  • Adding a Lift on the second use makes the test pass, see 09c07b6.
  • Adding a Lift on the first use lead to a similar infinite-loop but more difficult to spot (a cycle of Plus nodes of length > 1) :-( :-(
  • Adding a Lift node on both uses solved the above problem but meant we couldn't infer a set of input extensions to the Const node (!), so the test still fails.
  • An alternative fix (39e8740) was to fix the input-extensions of the Const (a unary-unit-sum!) to include the extension id (collections)! Which says that a solution does exist but that the inference algorithm is hanging rather than finding it.

@acl-cqc
Copy link
Contributor Author

acl-cqc commented Nov 1, 2023

So, some easy steps for this particular case where node A is a Const:

  • For Const nodes: default to no input-extensions, as per refactor: NodeType constructors, adding new_auto #635
  • Remove the Equals constraint for Static edges. (A single line deletion.) Whilst this is not removing the need for lift nodes, it solves a needlessly-painful common case (i.e. using constants and calling functions).

For the general case, we'll need to work on the algorithm...

  • I note live_vars are only used in fn results. So rather than a live_var function that recurses (perhaps infinitely :-( ), we could calculate the set of live_vars (by a scan through all Plus constraints, using a queue and a "seen" hashset) at the beginning of fn results. This would avoid all such infinite loops :-).
    • This could be tricky if we really need to know which "variable" makes a particular meta live. However AFAICS there is only one variable (the input-extensions to the whole hugr)??? So we could make that clear by changing variables: HashSet to variables: Option....
  • But I have to ask how much this two-stage algorithm with variables is helping us; it adds a lot of complication. Could we not require the Hugr input extensions to be fixed (or constrain them to empty-set if open)?

@acl-cqc
Copy link
Contributor Author

acl-cqc commented Nov 1, 2023

EDIT: Remainder moved to #639
This relates quite strongly to how we deal with cycles of Plus constraints, so I'll dump some thoughts here.

  • I suspect we should deal with cycles much earlier, i.e. in main_loop - either in merge_equal_metas or in parallel with it, rather than in instantiate_variables as in PR fix(Inference): Work harder in variable instantiation #591. That is, a cycle of Plus constraints, means all nodes on the cycle must be equal, so can be merged.
    • However, we must record the ExtensionId's added by the Plus constraints somehow....
    • How about not (just?) solutions but lower_bounds - a cycle of Plus constraints sets a lower bound, the merged meta must include the lower-bound extensions. (Maybe in addition to solutions? or maybe we also have upper_bounds when known - if there is a single concrete solution, then lower=upper. Or maybe we have another variant Constraint::IncludesSet - note we could remove Constraint::Plus if we then also had Constraint::IncludesMeta).
    • Note that when all Meta's connected by Equals constraints are merged, and all cycles of Plus constraints are merged, we have a tree of plus constraints. The leaves can therefore only be Metas with concrete solutions, or "variables". Every node will have a solution (if it doesn't depend upon a variable) or a lower-bound (if it does - the lower-bound is the union of Plus(ExtensionId,...) on any path to a variable). If there is only one variable (the input-extensions to the whole Hugr), then the solutions are just calculated by taking union of each lower-bound with the value of said variable....

I think the big advantage of this is it avoids the need for the lift node? In some cases? I.e. it'll fix a lower-bound for the outputs of A just because they must be equal to themselves union B.extension_reqs ?

@acl-cqc
Copy link
Contributor Author

acl-cqc commented Nov 1, 2023

EDIT: Remainder moved to #639

I think the big advantage of this is it avoids the need for the lift node? In some cases? I.e. it'll fix a lower-bound for the outputs of A just because they must be equal to themselves union B.extension_reqs ?

Of course no - if we require that input extensions at the target of every (static/value/control) edge equal output extensions at the source, we would still need lift nodes. But validation is different from inference, and it would be easy to change validation to check merely that extensions at the source of each edge were a subset of extensions at the target; perhaps the above then gives a procedure for inference?

github-merge-queue bot pushed a commit that referenced this issue Nov 7, 2023
closes #633

* Add test case. Reduced this to a 2-node DFG (plus Input and Output).
That this hasn't been showing up repeatedly in other tests - I put down
to #388: most custom ops have empty `extension_reqs` so do not generate
Plus constraints.
* Run (50%-failing) test 10 times -> failure rate ~~ 1 in 2^10
* Fix calculation of `live_vars` and `live_metas` by one-off traversal
of constraints+solutions in `fn results()`.

---------

Co-authored-by: Craig Roy <[email protected]>
github-merge-queue bot pushed a commit that referenced this issue Aug 16, 2024
Bumps [pytest-cov](https://github.com/pytest-dev/pytest-cov) from 4.1.0
to 5.0.0.
<details>
<summary>Changelog</summary>
<p><em>Sourced from <a
href="https://github.com/pytest-dev/pytest-cov/blob/master/CHANGELOG.rst">pytest-cov's
changelog</a>.</em></p>
<blockquote>
<h2>5.0.0 (2024-03-24)</h2>
<ul>
<li>Removed support for xdist rsync (now deprecated).
Contributed by Matthias Reichenbach in
<code>[#623](pytest-dev/pytest-cov#623)
&lt;https://github.com/pytest-dev/pytest-cov/pull/623&gt;</code>_.</li>
<li>Switched docs theme to Furo.</li>
<li>Various legacy Python cleanup and CI improvements.
Contributed by Christian Clauss and Hugo van Kemenade in
<code>[#630](pytest-dev/pytest-cov#630)
&lt;https://github.com/pytest-dev/pytest-cov/pull/630&gt;</code><em>,
<code>[#631](pytest-dev/pytest-cov#631)
&lt;https://github.com/pytest-dev/pytest-cov/pull/631&gt;</code></em>,
<code>[#632](pytest-dev/pytest-cov#632)
&lt;https://github.com/pytest-dev/pytest-cov/pull/632&gt;</code>_ and
<code>[#633](pytest-dev/pytest-cov#633)
&lt;https://github.com/pytest-dev/pytest-cov/pull/633&gt;</code>_.</li>
<li>Added a <code>pyproject.toml</code> example in the docs.
Contributed by Dawn James in
<code>[#626](pytest-dev/pytest-cov#626)
&lt;https://github.com/pytest-dev/pytest-cov/pull/626&gt;</code>_.</li>
<li>Modernized project's pre-commit hooks to use ruff. Initial POC
contributed by
Christian Clauss in
<code>[#584](pytest-dev/pytest-cov#584)
&lt;https://github.com/pytest-dev/pytest-cov/pull/584&gt;</code>_.</li>
</ul>
</blockquote>
</details>
<details>
<summary>Commits</summary>
<ul>
<li><a
href="https://github.com/pytest-dev/pytest-cov/commit/5295ce01c84262cec88f31255e9ac538718f3047"><code>5295ce0</code></a>
Bump version: 4.1.0 → 5.0.0</li>
<li><a
href="https://github.com/pytest-dev/pytest-cov/commit/1181b067972bf94569f8011f3b18f271690f9ab1"><code>1181b06</code></a>
Update changelog.</li>
<li><a
href="https://github.com/pytest-dev/pytest-cov/commit/9757222e2e044361e70125ebdd96e5eb87395983"><code>9757222</code></a>
Fix a minor grammar error (<a
href="https://redirect.github.com/pytest-dev/pytest-cov/issues/636">#636</a>)</li>
<li><a
href="https://github.com/pytest-dev/pytest-cov/commit/9f5cd81a0dbe3fe41681efdbef516c08988fe8ff"><code>9f5cd81</code></a>
Cleanup releasing instructions. Closes <a
href="https://redirect.github.com/pytest-dev/pytest-cov/issues/616">#616</a>.</li>
<li><a
href="https://github.com/pytest-dev/pytest-cov/commit/93b5047ec5050d63c10a6fe16a09b671a7a03df8"><code>93b5047</code></a>
Add test for pyproject.toml loading without explicit --cov-config. Ref
<a
href="https://redirect.github.com/pytest-dev/pytest-cov/issues/508">#508</a>.</li>
<li><a
href="https://github.com/pytest-dev/pytest-cov/commit/ff50860d7c67b920503745d92a3f0944cf41f982"><code>ff50860</code></a>
docs: add config instructions for pyproject.toml.</li>
<li><a
href="https://github.com/pytest-dev/pytest-cov/commit/4a5a4b5fa4b1c63ddcab5cbc1813798c9b6f1d36"><code>4a5a4b5</code></a>
Keep GitHub Actions up to date with GitHub's Dependabot</li>
<li><a
href="https://github.com/pytest-dev/pytest-cov/commit/1d7f55963d5138f41c452a946f7cca7e0b6ee8b2"><code>1d7f559</code></a>
Fix or remove URLs that are causing docs tests to fail</li>
<li><a
href="https://github.com/pytest-dev/pytest-cov/commit/6a5af8e85b8242ac815f33e26adf9068f5f0ebc3"><code>6a5af8e</code></a>
Update changelog.</li>
<li><a
href="https://github.com/pytest-dev/pytest-cov/commit/d9fe8dfed15023d3410dd299c5092e755b8981c2"><code>d9fe8df</code></a>
Switch to furo. Closes <a
href="https://redirect.github.com/pytest-dev/pytest-cov/issues/618">#618</a>.</li>
<li>Additional commits viewable in <a
href="https://github.com/pytest-dev/pytest-cov/compare/v4.1.0...v5.0.0">compare
view</a></li>
</ul>
</details>
<br />


[![Dependabot compatibility
score](https://dependabot-badges.githubapp.com/badges/compatibility_score?dependency-name=pytest-cov&package-manager=pip&previous-version=4.1.0&new-version=5.0.0)](https://docs.github.com/en/github/managing-security-vulnerabilities/about-dependabot-security-updates#about-compatibility-scores)

Dependabot will resolve any conflicts with this PR as long as you don't
alter it yourself. You can also trigger a rebase manually by commenting
`@dependabot rebase`.

[//]: # (dependabot-automerge-start)
[//]: # (dependabot-automerge-end)

---

<details>
<summary>Dependabot commands and options</summary>
<br />

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits
that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after
your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge
and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating
it. You can achieve the same result by closing it manually
- `@dependabot show <dependency name> ignore conditions` will show all
of the ignore conditions of the specified dependency
- `@dependabot ignore this major version` will close this PR and stop
Dependabot creating any more for this major version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop
Dependabot creating any more for this minor version (unless you reopen
the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop
Dependabot creating any more for this dependency (unless you reopen the
PR or upgrade to it yourself)


</details>

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants