bug(dyn-filter): left side changes after state cleaning with right watermark #17711
Comments
To Reproduce
First, change Then, run:
create table t (ts timestamp, foo int);
insert into t values (now() - interval '2 min', 111), (now() - interval '1 min', 222);
create materialized view mv as select * from t where ts >= now() - interval '1 minute';
After the MV creation, there should be nothing in Now, update the left table:
update t set foo = 123 where foo = 111;
This is absolutely valid because |
I got the first part, which is that within a single epoch, if the ordering of watermark happens like:
Then the state for A is inconsistent. Didn't quite get the second part:
If there is no watermark, doesn't that just mean we never encounter the inconsistency case? |
From my understanding,
|
Yes, so as I said in the issue description:
If we adopt the approach of ignoring changes below the watermark, we will face the problem that the (state table) watermark is not persisted. |
There is a watermark on the right column, but on recovery it does not seem to be guaranteed that the first right watermark will arrive before the left changes. |
But if it arrives after the left changes, there won't be any inconsistency, right? The left changes will be processed first, and the watermark will then simply not clean them. |
But the left changes may actually be below the watermark produced last time, which the dynamic filter doesn't know yet because of the recovery. |
The watermark is persisted in Hummock's metadata. See #15344. If the storage ( |
Can the CN obtain accurate and timely watermark information? If it is trustworthy, I suggest obtaining the current watermark when the state table is created, and it should be able to automatically accept and filter out all operations below the watermark. 🤔 |
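A minimal sketch of the filtering idea suggested above, assuming the watermark is available as an `Option<i64>` over an integer timestamp. `Change` and `filter_below_watermark` are hypothetical names for illustration, not part of the actual state table API. The `None` case models the window after recovery in which no watermark is known yet, so nothing can be filtered.

```rust
/// A left-side change, reduced to the timestamp column it is keyed on.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq)]
enum Change {
    Insert(i64),
    Delete(i64),
}

impl Change {
    /// Timestamp of the row this change touches.
    fn ts(self) -> i64 {
        match self {
            Change::Insert(ts) | Change::Delete(ts) => ts,
        }
    }
}

/// Keeps only the changes that should reach the state table.
/// With `Some(wm)`, changes strictly below `wm` are dropped, because the
/// rows they refer to may already have been cleaned.
/// With `None` (no watermark known yet), every change passes through.
fn filter_below_watermark(changes: &[Change], watermark: Option<i64>) -> Vec<Change> {
    changes
        .iter()
        .copied()
        .filter(|c| match watermark {
            Some(wm) => c.ts() >= wm,
            None => true,
        })
        .collect()
}
```

If the watermark could be read when the table is created, the `None` branch would only be hit before the very first watermark, which is the point of the suggestion.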
Raised another issue to discuss whether we should provide watermark information on state store level: #17741. |
In the DynamicFilter executor, we use watermarks on the RHS to clean the state of the left table, by calling `left_table.update_watermark(rhs_watermark)`. However, if the `update_watermark` did take effect, later left changes will cause inconsistent table operations (e.g. double delete) on the left table.

An intuitive way to resolve this is to check whether the changed rows on the left side are below the watermark, and just ignore those that are. But another fact is that the state cleaning watermark is not persisted, so after recovery we will temporarily have a `None` watermark.

I guess for the sake of simplicity, we may just use a state table with `inconsistent_op` for the left table of DynamicFilter.

(This is blocking #17694.)
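The double-delete problem described here can be reproduced in a toy model. The sketch below is not the real state table: `ToyStateTable`, `Consistency` and `replay` are hypothetical names, timestamps are plain integers, and the strict mode returns an error where the real implementation may react differently. It only shows why a delete that arrives after watermark cleaning contradicts the table contents, and why a lenient table sidesteps that.

```rust
use std::collections::BTreeMap;

/// How the toy table reacts to an operation that contradicts its contents.
#[derive(Clone, Copy, PartialEq, Debug)]
enum Consistency {
    Strict,  // report an error, e.g. when deleting a missing key
    Lenient, // tolerate it, in the spirit of an "inconsistent op" table
}

/// Hypothetical stand-in for the left state table, keyed by timestamp.
struct ToyStateTable {
    rows: BTreeMap<i64, i32>,
    mode: Consistency,
}

impl ToyStateTable {
    fn new(mode: Consistency) -> Self {
        Self { rows: BTreeMap::new(), mode }
    }

    fn insert(&mut self, ts: i64, foo: i32) {
        self.rows.insert(ts, foo);
    }

    /// State cleaning: drop every row whose key is below the watermark.
    fn update_watermark(&mut self, watermark: i64) {
        // `split_off` returns the entries with key >= watermark.
        self.rows = self.rows.split_off(&watermark);
    }

    /// Delete a row; in strict mode a missing key is an error.
    fn delete(&mut self, ts: i64) -> Result<(), String> {
        match (self.rows.remove(&ts), self.mode) {
            (None, Consistency::Strict) => Err(format!("delete of missing key {}", ts)),
            _ => Ok(()),
        }
    }
}

/// Replays the scenario: two rows, a cleaning watermark, then an update
/// (delete followed by insert) of the row that was already cleaned.
fn replay(mode: Consistency) -> Result<(), String> {
    let mut left = ToyStateTable::new(mode);
    left.insert(100, 111); // older row
    left.insert(160, 222); // newer row
    left.update_watermark(150); // RHS watermark cleans the row at ts = 100
    left.delete(100)?; // the upstream update arrives as a delete...
    left.insert(100, 123); // ...followed by an insert
    Ok(())
}
```

Calling `replay(Consistency::Strict)` fails on the delete of the cleaned row, while `replay(Consistency::Lenient)` completes. The lenient run re-inserts a row below the watermark, so the trade-off is tolerated inconsistency rather than a fix of the underlying ordering problem.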