`speculative_execution_is_fired` often fails in CI. We debugged it with @wprzytula . The problem is that the test assumes each execution will be sent to a different node, which is no longer true in all cases after #944.
Speculative execution uses a load balancing `Plan`, which wraps calls to `LBP::pick` and `LBP::fallback`. If some target in `fallback` is the same as the one returned from `pick`, it is filtered out by `Plan`.
Before #944, `Plan` filtered out nodes, not targets (node + shard pairs). Combined with the guarantee that `fallback` doesn't return the same node twice, this guaranteed that the nodes in a `Plan` are unique.
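The difference between the two filtering semantics can be sketched as follows. This is a hypothetical, simplified model, not the real driver types: a "target" is modeled as a `(node id, shard)` pair, and the two functions contrast post-#944 target-level filtering with pre-#944 node-level filtering.

```rust
// Simplified stand-in for a (node id, shard) target.
type Target = (u32, u32);

// Post-#944 semantics: drop only exact (node, shard) duplicates of `picked`,
// so the same node may reappear with a different shard.
fn plan_by_target(picked: Target, fallback: Vec<Target>) -> Vec<Target> {
    let mut plan = vec![picked];
    plan.extend(fallback.into_iter().filter(|t| *t != picked));
    plan
}

// Pre-#944 semantics: drop any fallback entry on the same node as `picked`.
fn plan_by_node(picked: Target, fallback: Vec<Target>) -> Vec<Target> {
    let mut plan = vec![picked];
    plan.extend(fallback.into_iter().filter(|t| t.0 != picked.0));
    plan
}

fn main() {
    let picked = (0, 0); // node X, shard 0
    let fallback = vec![(0, 1), (1, 0)]; // node X again, but on shard 1

    // Target-based filtering keeps node X twice...
    assert_eq!(
        plan_by_target(picked, fallback.clone()),
        vec![(0, 0), (0, 1), (1, 0)]
    );
    // ...while node-based filtering keeps each node once.
    assert_eq!(plan_by_node(picked, fallback), vec![(0, 0), (1, 0)]);
}
```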
Now those guarantees are no longer provided:
For unprepared queries the shards returned from the LBP are random, so it's possible for `pick` to return node X with shard 0 and for node X to appear in `fallback` with shard 1. This won't be filtered out by `Plan`, as those are different targets.
For prepared statements it's possible for the same node to appear in `fallback` multiple times. One case is with tablets, and that's intended. But it's possible even without tablets: the first elements of `fallback` are replicas, followed by the other nodes in some order, with random shards. Those later entries won't be filtered out, because they are different targets than the one returned from `pick`.
Solution after a long discussion with @wprzytula:

For unprepared queries: ignore the problem. The first node in the `Plan` may appear again later with a different shard, but that's just one repetition, no more, so it shouldn't be a problem: those are unprepared queries, and the user probably doesn't care that much about the performance of such a query.
An alternative solution for unprepared queries that we didn't discuss is to always return `None` from `pick`. Then `fallback` can take care of the uniqueness of nodes. WDYT @wprzytula ?
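A minimal sketch of that alternative, using the same hypothetical `(node id, shard)` model rather than the real `LoadBalancingPolicy` trait: `pick` always declines, so the whole plan comes from `fallback`, which can deduplicate nodes itself since it produces the full sequence (at the cost of allocating a seen-set).

```rust
use std::collections::HashSet;

// Simplified stand-in for a (node id, shard) target.
type Target = (u32, u32);

// `pick` always declines for unprepared queries...
fn pick(_targets: &[Target]) -> Option<Target> {
    None
}

// ...so `fallback` owns the whole plan and can yield each node at most once.
fn fallback(targets: Vec<Target>) -> Vec<Target> {
    let mut seen_nodes = HashSet::new();
    targets
        .into_iter()
        .filter(|(node, _shard)| seen_nodes.insert(*node))
        .collect()
}

fn main() {
    let targets = vec![(0, 0), (0, 1), (1, 2), (1, 0), (2, 3)];
    assert_eq!(pick(&targets), None);
    // Nodes 0 and 1 each survive only once, with their first shard.
    assert_eq!(fallback(targets), vec![(0, 0), (1, 2), (2, 3)]);
}
```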
For prepared statements: don't eliminate repetitions in the first part of the plan (the one that contains replicas). For the random part of the plan, skip the nodes that already appeared in the first part. This way there will be no repeated nodes in the `Plan`, except when two replicas happen to be on the same node.
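That proposal can be sketched like this (a hypothetical helper over the same simplified `(node id, shard)` model, not driver code): the replica prefix is kept untouched, so two replicas on one node may both stay, while the random tail skips any node already seen among the replicas.

```rust
use std::collections::HashSet;

// Simplified stand-in for a (node id, shard) target.
type Target = (u32, u32);

// Keep the replica prefix as-is; filter the random tail by replica nodes.
fn prepared_plan(replicas: Vec<Target>, random_tail: Vec<Target>) -> Vec<Target> {
    let replica_nodes: HashSet<u32> = replicas.iter().map(|(node, _)| *node).collect();
    let mut plan = replicas;
    plan.extend(
        random_tail
            .into_iter()
            .filter(|(node, _)| !replica_nodes.contains(node)),
    );
    plan
}

fn main() {
    // Replicas come first; the tail walks the nodes with random shards.
    let replicas = vec![(0, 1), (1, 2)];
    let tail = vec![(0, 3), (2, 0), (1, 1), (3, 2)];
    // Replica nodes 0 and 1 are skipped in the tail.
    assert_eq!(
        prepared_plan(replicas, tail),
        vec![(0, 1), (1, 2), (2, 0), (3, 2)]
    );
}
```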
* Alternative solution for unprepared queries that we didn't discuss is to always return `None` from `pick`. Then `fallback` can take care of the uniqueness of nodes. WDYT @wprzytula ?
Now that I think of it, I'm really starting to like the idea. It will require some allocations in the unprepared case, but it will provide consistent behavior in exchange.