
Limit chalk by time instead of iteration count #16981

Conversation

stepantubanov

@rustbot rustbot added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 31, 2024
@lnicola
Member

lnicola commented Mar 31, 2024

I don't understand how this works. Do 1000 steps take longer than one second?

@stepantubanov
Author

> I don't understand how this works. Do 1000 steps take longer than one second?

In these cases, yes: it gets exponentially slower with each step. Does one second sound like a reasonable limit?
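For illustration, a wall-clock limit on a solver loop, as proposed in this PR, might look like the following minimal sketch. All names here (`solve_with_time_limit`, `TIME_BUDGET`) are illustrative, not chalk's actual API:

```rust
use std::time::{Duration, Instant};

// Illustrative only: a solver loop bounded by wall-clock time rather than a
// fixed step count. A step may take arbitrarily long, so we check elapsed
// time instead of counting iterations.
const TIME_BUDGET: Duration = Duration::from_secs(1);

fn solve_with_time_limit(goal_steps: u32) -> Result<u32, &'static str> {
    let start = Instant::now();
    let mut steps_taken = 0;
    for _ in 0..goal_steps {
        // One primitive "step" of solving (placeholder work).
        steps_taken += 1;
        if start.elapsed() > TIME_BUDGET {
            return Err("time budget exhausted");
        }
    }
    Ok(steps_taken)
}
```

The trade-off discussed below is that this makes the outcome depend on machine speed and load, i.e. it is non-deterministic.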

@flodiebold
Member

This would make trait solving non-deterministic though. I'm not sure that's a good idea.

@Veykril
Member

Veykril commented Mar 31, 2024

> This would make trait solving non-deterministic though. I'm not sure that's a good idea.

I am actually curious whether this would cause problems? I guess what could happen is that a goal stays unsolved due to a timeout even though it could be solved after a change? Though in that case it feels like the query should have been invalidated anyway, meaning that shouldn't happen.

Also note that the new trait solver is likely to have timed fuel as well, instead of the counter we have right now (learned that this week from a discussion with lcnr).

@Veykril
Member

Veykril commented Mar 31, 2024

Reading up on #4315 (an issue we had due to nondeterminism), I don't think this would be a problem here, as the trait solve query is not LRU cached (which I believe is the main issue with respect to nondeterminism).

@Veykril
Member

Veykril commented Apr 3, 2024

@matklad maybe you have some insight on whether this kind of non-determinism could cause problems, since you were there back when the problem appeared for proc-macros (I believe it shouldn't, as this is not an LRU-evictable query, as is the case for macro expansions).

@matklad
Member

matklad commented Apr 3, 2024

> Also note that the new trait solver is likely to have timed fuel as well, instead of the counter we have right now (learned that this week from a discussion with lcnr).

This is probably the most consequential thing in this issue. Is there any write-up for why it is the case? It seems that deterministic fuel should be fairly easy to add. In the limit, one can measure CPU instructions or some such, but there's bound to be some sort of primitive "step" which can be counted.

And I'd say that, over the long run, deterministic fuel is much more attractive. While it is true that time-based limiting should work with the current strategy of never forgetting the queries, it closes off the possibility of adjusting that strategy later.

Salsa as an abstract API does not specify when or why it recomputes the queries. And we can imagine wanting to experiment with this down the line. What if we do LRU auto-magically for everything? When we implement on disk persistence, what if some queries turn out to be faster to recompute than to deserialise from disk? What if, when running concurrently, salsa would race two identical queries rather than block?

So, irrespective of the current implications of the specific change, I'd be very much in favor of preserving valuable deterministic properties.
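The deterministic fuel argued for above could be as simple as a counter decremented at every primitive step; the key property is that the same input exhausts it at exactly the same point on every run, regardless of machine speed. A minimal sketch, with hypothetical names:

```rust
// Hypothetical sketch of deterministic fuel: exhaustion depends only on the
// number of steps taken, never on wall-clock time.
struct Fuel {
    remaining: u64,
}

impl Fuel {
    fn new(budget: u64) -> Self {
        Fuel { remaining: budget }
    }

    // Called once per primitive solver step.
    fn tick(&mut self) -> Result<(), &'static str> {
        if self.remaining == 0 {
            return Err("fuel exhausted");
        }
        self.remaining -= 1;
        Ok(())
    }
}

// Run a fixed-size workload and report the step at which fuel ran out.
fn steps_until_exhausted(budget: u64, work_items: u64) -> u64 {
    let mut fuel = Fuel::new(budget);
    for step in 0..work_items {
        if fuel.tick().is_err() {
            return step;
        }
    }
    work_items
}
```

Because the cut-off point is a pure function of the input, two runs on wildly different machines report the same result, which is exactly the reproducibility property the comment above is arguing for.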


There's a separate issue that non-determinism is pain in the back in day-to-day development. If the user submits an issue where rust-analyzer sometimes colors the code red, is it due to inherent timing non-determinism, or do we have some missing vfs update somewhere? If you have built-in non-determinism, then any such issue becomes doubly mysterious.

@Veykril
Member

Veykril commented Apr 3, 2024

> Is there any write-up for why it is the case?

I don't remember the why. @lcnr, can you briefly reiterate why you'd prefer a time-based fuel for the new trait solver as opposed to a step / unit-of-work count?

@lcnr
Contributor

lcnr commented Apr 4, 2024

> In these cases, yes: it gets exponentially slower with each step. Does one second sound like a reasonable limit?

This is the reason why I'd expect a time-based fuel to work better. The number of steps taken by the trait solver is a bad estimate for its runtime. While we could spend some effort getting a better estimate (considering the size of goals/returned constraints as well), I expect that having a fuel which both prevents hangs and correctly checks nearly all desired goals is pretty much impossible. We have the same issue when trying to prevent hangs in rustc itself, where I've given up on that goal for now.

My perspective from RustNation, and my experience using r-a by default while writing tests for the solver, was that hangs in r-a are quite a lot worse than hangs in rustc, as r-a runs by default and in the background.

@matklad
Member

matklad commented Apr 4, 2024

Yeah, it is true that having fuel is critical for rust-analyzer, both because the consequences are much more dire (one can’t ^C r-a easily) and because it sees more weird, incomplete, broken code.

> I expect that having a fuel which both prevents hangs and correctly checks nearly all desired goals is pretty much impossible.

I’d say it is trivial :-) Here’s a mathematician’s solution: compile the solver to WebAssembly and use “number of retired wasm instructions” as fuel. This is going to be a within-an-order-of-magnitude proxy for runtime, but fully deterministic.

Of course, this proves only that a solution exists, not that it is practical. But it seems unlikely that no more practical equivalent exists. Something like “inside the solver, decrement a counter on every function call and inside every loop” should have an equivalent effect?

Furthermore, I’d imagine that you want the cancellation to be cooperative, as I don’t think non-cooperative cancellation is even possible in Rust. So you’d have to call a “check canceled flag” function once in a while. And you must call it frequently enough (e.g. inside every loop), otherwise a lot of time might be spent waiting for the cancellation to get through. So it seems that you want to call “check canceled” in exactly the same places you’d want to decrement the fuel in? That is, implementing a snappy cooperative cancel gives you fuel for free.
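The observation that cancellation checkpoints and fuel decrements want to live in the same places could be sketched like this. All names here are hypothetical, not rust-analyzer's actual cancellation machinery:

```rust
use std::sync::atomic::{AtomicBool, Ordering};

// Sketch: the cooperative cancellation check and the deterministic fuel
// decrement share one checkpoint function, so a snappy cancel gives you
// fuel for free. Hypothetical names, not rust-analyzer's real API.
struct Checkpoint<'a> {
    canceled: &'a AtomicBool,
    fuel: u64,
}

impl<'a> Checkpoint<'a> {
    // Call this inside every loop and on every recursive call.
    fn check(&mut self) -> Result<(), &'static str> {
        if self.canceled.load(Ordering::Relaxed) {
            return Err("canceled");
        }
        if self.fuel == 0 {
            return Err("fuel exhausted");
        }
        self.fuel -= 1;
        Ok(())
    }
}
```

Either exit path unwinds the solver cooperatively; the cancellation flag is checked first so a user edit always wins over remaining fuel.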


This reminds me of macros. Originally, like rustc, we used macro depth as fuel. But that doesn’t work, because macros can be exponentially wide as well.

And the fuel solution doesn’t work there, as macros are recursive, and we want to memoize intermediate steps. And if you add “fuel left” to a memoization key, you break the memoization.

I think what we ended up doing there is a combination of fuel for “depth” and a constant limit on the size of any single level of macro expansion.
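The combination described above, a recursion-depth fuel plus a cap on the size of any single expansion level, might look roughly like the following. The constants and names are illustrative, not rust-analyzer's real limits:

```rust
// Illustrative two-dimensional limit for macro expansion: bound the depth
// of recursion AND the width of any single level, since either alone can
// be blown up exponentially. Constants are made up for the sketch.
const MAX_DEPTH: u32 = 128;
const MAX_LEVEL_SIZE: usize = 4096; // e.g. tokens produced at one level

fn check_expansion_limits(depth: u32, level_size: usize) -> Result<(), &'static str> {
    if depth > MAX_DEPTH {
        return Err("macro recursion too deep");
    }
    if level_size > MAX_LEVEL_SIZE {
        return Err("single expansion level too large");
    }
    Ok(())
}
```

Neither check depends on wall-clock time, so the result stays deterministic and safe to use inside memoized queries.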

IIRC that’s still not precise enough (you can write code where every level gets exactly worst-case wide, which is going to be a quadratic slowdown), but I believe it works well enough in practice: bad cases are usually deep NAND wide.

But it’s curious that time-based cancellation doesn’t automatically fix the recursive memoized case either: imagine you have some code that blows up exponentially and gets canceled. Well, the user is going to continue to type, and we are going to continue to retry this query. And although we do have a, say, 100ms limit for the entire thing, every time we retry, we complete another 100ms worth of sub-queries!

🤔 I wonder if fuel should be a generalized salsa feature (cc @nikomatsakis)? Kill a query if it is too deep in the query graph or if its dependency set is too wide? Sort of conservative approximation for “cycle detection”.

@nikomatsakis
Contributor

> 🤔 I wonder if fuel should be a generalized salsa feature (cc @nikomatsakis)? Kill a query if it is too deep in the query graph or if its dependency set is too wide? Sort of conservative approximation for “cycle detection”.

Interesting thought. It wouldn't help here because chalk isn't salsa-integrated, although that's long been an ambition of mine (though the current new-trait-solver design didn't go that route, but rather continued to have special-case dependency tracking).

@Veykril
Member

Veykril commented Apr 18, 2024

Closing this, as it seems undesirable to have. It would be good to check whether we could add another dimension of limit here (like goal size or similar?), though that would require changes to chalk.

@Veykril Veykril closed this Apr 18, 2024