scylla::transport::topology ... Could not fetch metadata ... Protocol Error: system.peers or system.local has invalid column type #1044
Comments
Dup of #1021 ? |
Probably yes. Testing 2024.2 we hit it a lot of times -> ~20k times in 1 hour. |
I'm reopening this one since it's a bit different from just a warning
This is a multi-DC case, and we get into situations where the stress command fails straight away because of this condition. We are trying to move more cases into tools using the Rust driver, and this kind of thing is holding us back. Regardless of the issue with system.peers, the fact that it doesn't fall back to any node is definitely different from other drivers (at least from the Java behavior we know in c-s). |
This seems to be caused by a strict rule in the default load balancing policy: if datacenter failover is not permitted (and it may be forbidden for various reasons, such as a local consistency being used, or failover being manually turned off), the query will never be routed to any node that does not belong to the preferred datacenter. As the initial contact points have their DC unspecified, they are rejected by the load balancer, and hence the query plan is empty. |
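For illustration, a minimal sketch of how such a policy is typically wired up with the Rust driver; the datacenter name, contact point, and the decision to forbid failover here are assumptions for the example, not taken from the reported setup:

```rust
use scylla::transport::load_balancing::DefaultPolicy;
use scylla::transport::ExecutionProfile;
use scylla::SessionBuilder;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Prefer "dc1" and (in this example) forbid routing to other datacenters.
    // With failover disabled, nodes whose DC is still unknown (e.g. contact
    // points before the first successful metadata fetch) are never picked,
    // which can leave the query plan empty.
    let policy = DefaultPolicy::builder()
        .prefer_datacenter("dc1".to_string())
        .permit_dc_failover(false)
        .build();

    let profile = ExecutionProfile::builder()
        .load_balancing_policy(policy)
        .build();

    let _session = SessionBuilder::new()
        .known_node("127.0.0.1:9042")
        .default_execution_profile_handle(profile.into_handle())
        .build()
        .await?;
    Ok(())
}
```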
This issue indeed highlights an important problem: if the initial metadata fetch fails for any reason, then due to strict datacenter requirements the default load balancer may render the driver inoperable. |
How is that an issue? If user requested specific DC, and disabled failover, and we are unable to contact this DC, then I don't see what else should the driver do but fail. |
Note that we may be able to contact the DC and issue queries, but due to some reasons the initial metadata fetch might fail. Then we would never learn which contact points belong to the preferred DC, and thus we would end up with empty query plans. |
If by "contact DC" you mean having network connectivity to DC, then sure, driver may have that, but it's not helpful. |
What we could do is:
|
Users would need to use this, and I suspect they won't. It also introduces more cases (what if the DC fetched from the schema is different from the provided one?).
I like the idea, but note that it would not help with this issue - because fetching peers failed. |
sorry, but the user specifically enabled failover: policy_builder = policy_builder.prefer_datacenter(dc.to_owned()).permit_dc_failover(true); see scylladb/latte@52db0cd#diff-d6346fd7d17270b1282142aeeda9c4bc2b7d8fd0f37b24a1c871a9257f0ed0aaR67 |
also I think the missing part is that within a few seconds the data would be filled in by Scylla, i.e. on the next refresh. Other drivers don't fail; they keep reading the peers table until correct data appears there. I think it was already agreed that the Rust driver is a bit too trigger-happy about failing on missing data in the peers table compared to java/go/python. |
OK, but failover can also be disabled by using local consistency. See the code of the default load balancer:

```rust
fn is_datacenter_failover_possible(&self, routing_info: &ProcessedRoutingInfo) -> bool {
    self.preferences.datacenter().is_some()
        && self.permit_dc_failover
        && !routing_info.local_consistency
}
```

whereas local consistency is defined as:

```rust
let local_consistency = matches!(
    (query.consistency, query.serial_consistency),
    (Consistency::LocalQuorum, _)
        | (Consistency::LocalOne, _)
        | (_, Some(SerialConsistency::LocalSerial))
);
```
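As a concrete illustration of that rule, here is a minimal sketch (the CQL text and table name are assumptions) of a statement that would be classified as having local consistency, and therefore would never fall back to a remote DC, even if the policy was built with permit_dc_failover(true):

```rust
use scylla::query::Query;
use scylla::statement::Consistency;

fn main() {
    // Even with permit_dc_failover(true) on the policy, a statement using a
    // LOCAL_* consistency makes is_datacenter_failover_possible() return false,
    // so the default policy will not route it outside the preferred datacenter.
    let mut query = Query::new("SELECT * FROM ks.t WHERE pk = ?");
    query.set_consistency(Consistency::LocalQuorum);
}
```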
I'm pretty sure that Rust Driver will try to re-read schema too.
Wdym by fail? Until correct schema is read, queries will fail if DC is specified and failover disabled, that is true, and imo correct behavior. But the driver should keep trying to re-read the schema, and I think it does (doesn't it?) |
the user didn't issue a single CQL statement yet at this point, it's just connecting? why does it need to specify which consistency it is going to use at this point? maybe I'm going to use both on the same connection? I'm confused, and not sure what exactly we should do in the latte code to enable the DC failover? |
again, in this case we don't have queries failing; we are failing to establish the connection at all. The latte command didn't get to run a single CQL command, it failed and bailed out. |
see that log |
This is false. Looking at the logs, we clearly see that the driver issued a query and failed. Moreover, the log shows that the query was executed with a local consistency, which explains why the driver rejected the initial contact points as coordinators for the query: it could not determine that they belong to the local datacenter, which is required to allow queries with a local consistency specified. |
Use non-local consistency. E.g. Quorum instead of LocalQuorum. |
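For example, a hedged sketch of setting a non-local default consistency on an execution profile rather than per statement; how the profile is then attached to the session or to individual statements is left out and would depend on the application:

```rust
use scylla::statement::Consistency;
use scylla::transport::ExecutionProfile;

fn main() {
    // A non-local default consistency for all statements run through this
    // profile; individual statements can still override it.
    let profile = ExecutionProfile::builder()
        .consistency(Consistency::Quorum)
        .build();
    let _handle = profile.into_handle();
}
```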
but we do want to use LocalQuorum and still have a fallback in case of errors. And as a user it's really not clear, especially when I explicitly state that I do want the fallback. |
What is the point of using LocalQuorum and expecting the driver to perform datacenter failover? Local consistencies only make sense when a node from the local datacenter becomes the coordinator. Else, an error seems to be the most reasonable option. |
again, we run longevity tests, which we expect to run for multiple hours, and we do want to be using LocalQuorum, since that's how customers use it when they have multi-DC setups. We don't mind that in case of failure of a DC it would fall back to the other DCs; conversely, we don't want to be stopped or blocked because of an issue with Scylla or the driver. If there are configurations and safeguards, as a user I want to be able to configure them. Telling a user that in this case there's a consistency mode they can't use is not an answer we could give a paying customer. |
We definitely can use a non-local consistency. Moreover, we can even completely remove it in our fork if we need to. @wprzytula thanks for spotting the problematic point. @fruch it is very easy to fix, it will be part of the next latte-fork tag. |
Well, if a customer sets a particular consistency mode, we are obliged to either satisfy it or return an error. |
Maybe the error handling here could be improved? |
Definitely. The user should not see an error like that. |
As I mentioned in the private message, Scylla documentation does not impose these requirements, and does not define LOCAL this way.
This is a strictly server-side definition and does not state any driver-side requirement - because consistency level is something to be handled server-side, not driver-side. Not sure who to ask here. @nyh ? @avikivity ? |
Or maybe @kbr-scylla or @piodul ? |
Also, the docs of the Java 4.x driver explain the failover option quite well. I think we need to fix the Scylla regression, because it can cause issues for some of the drivers. In the current situation, a user would be forced to implement some retries on the application side, or stop using LOCAL_*, which might not necessarily match their needs. |
A fix that prevents metadata refresh failure on an invalid peer entry has been merged. The next release will contain the fix. |
I think it is both true that:
The driver should probably not send user queries until it learns topology (location in DC and rack) information, if those queries are configured to reach only the "local" DC (whatever the user specified as the "local" DC in the driver config). Up to that point, queries should probably be queued up; see the sketch below. And it doesn't matter what CL those queries are using. If the user specifies that the driver must only contact DC X, then all queries, even those using CL=quorum or CL=one or whatever, should go only to DC X. |
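As a rough, hypothetical illustration of that queueing idea (this is not driver API; it assumes a recent tokio with the sync and macros features):

```rust
use tokio::sync::watch;

#[tokio::main]
async fn main() {
    // `false` = topology (DC/rack per node) not learned yet.
    let (topology_learned_tx, topology_learned_rx) = watch::channel(false);

    // Simulated metadata fetcher: flips the flag once node locations are known.
    tokio::spawn(async move {
        // ... fetch system.local / system.peers here ...
        let _ = topology_learned_tx.send(true);
    });

    // Query path: a DC-restricted query waits here instead of failing with an
    // empty plan while node datacenters are still unknown.
    let mut rx = topology_learned_rx;
    rx.wait_for(|learned| *learned)
        .await
        .expect("topology watcher closed");
    // ... now build the query plan restricted to the preferred DC ...
}
```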
@kbr-scylla I think the main question here is slightly different. The argument for forcing CL=LOCAL* requests to always go to the local DC is that we can achieve consistency (as defined by https://opensource.docs.scylladb.com/stable/cql/consistency-calculator.html for read CL and write CL = QUORUM; I'm not sure what the exact semantics are, that reads will see all previous writes, I think?) with regard to all clients connected to this DC. If we route the request outside of this DC then this guarantee no longer holds. But there are counterarguments:
|
Is "local" and "preferred" the same thing? You're using a bunch of terminology that I'm not sure I understand. So let's assume there is a "local" DC and there is a "remote" DC configured by the driver. The user should be able to configure all their queries e.g. to some chosen keyspace, with CL=LOCAL_QUORUM, to go to the "remote" DC, right? Then the consistency guarantees will be provided, and everything should work. But it sounds like with this restriction you have, this is impossible: the user's desire to send only to "remote" DC conflicts with enforcing all LOCAL_* queries to the "local" DC. |
Ok, let me clear that up (you probably know most of this, but just to be sure we are on the same page). The question is: if a query has CL=LOCAL*, should the "dc failover" setting be ignored and treated as DISABLED?
|
So the dilemma is between giving the user a choice and taking it away. My vote is to give them the choice. If dc failover is enabled, the user chose failover, so do it as they requested. BTW, what does the java driver do? |
I think no other driver forces dc failover to be disabled for CL=LOCAL* queries. |
scylla rust driver version: 0.13.0
scylla version: 2024.2.0~rc1-20240715.9bbb8c1483d7
Using the latte stress tool we get the following errors. This was not observed with previous versions of Scylla.
CI job: https://jenkins.scylladb.com/job/enterprise-2024.2/job/longevity/job/longevity-gce-custom-d1-worklod2-hybrid-raid-test/2/
Argus: enterprise-2024.2/longevity/longevity-gce-custom-d1-worklod2-hybrid-raid-test#2