Discussion: `union` causes unreasonably long key #14314
Comments
I guess it is not correct SQL semantics
+1 for this.
A case:
dev=> create table t1(v int, k int primary key);
CREATE_TABLE
dev=> create table t2(v int, k int primary key);
CREATE_TABLE
dev=> insert into t1 values (1,1);
INSERT 0 1
dev=> insert into t2 values (2,1);
INSERT 0 1
dev=> select distinct on (k) * from (select * from t1 union all select * from t2);
v | k
---+---
2 | 1
(1 row)
dev=> select * from t1 union select * from t2;
v | k
---+---
1 | 1
2 | 1
(2 rows)
Then, let's try out
? Note that the aggregation is not only limited to
So in general we don't want jsonb in the stream key?
Yes. It's mostly a mistake and will lead to a very long stream key.
#7981 🤔
Problems
We have observed several times that incorrect usage of `union` leads to an extraordinarily long stream key, usually including a `jsonb` column. For example,
RisingWave will optimize this query into
Note that the `StreamExchange` and `StreamAppendOnlyDedup` (or `StreamGroupTopN` for non-append-only cases) operate on the full set of columns, so the cost is high.
Even worse, the long stream key will be used by downstream MVs, which leads to even higher cost.
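To make the cost concrete, here is a minimal Python sketch (not RisingWave code) that models the state of a dedup operator as the set of distinct keys it has to remember. The row layout, the JSON `payload` column, and the helper name `dedup_state_size` are assumptions made for illustration only. It compares keying on every output column, which is what `union` deduplication needs, against a short hypothetical key made of a source tag and an id.

```python
import json


def dedup_state_size(rows, key_of):
    """Total bytes of the distinct keys a dedup operator would need to keep."""
    seen = {json.dumps(key_of(row)) for row in rows}
    return sum(len(key.encode("utf-8")) for key in seen)


# Hypothetical rows: (source table, id, jsonb-like payload stored as text).
rows = [
    ("t1", i, json.dumps({"user": i, "tags": ["a", "b", "c"], "note": "x" * 200}))
    for i in range(100)
] + [
    ("t2", i, json.dumps({"user": i, "tags": ["d"], "note": "y" * 200}))
    for i in range(100)
]

# `union` deduplicates on all output columns, so the key is (id, payload).
full_row = dedup_state_size(rows, lambda r: (r[1], r[2]))

# Assumed short key for comparison: only the source tag and the id.
short = dedup_state_size(rows, lambda r: (r[0], r[1]))

print(f"full-row key bytes: {full_row}, short key bytes: {short}")
```

In this toy model the full-row key carries the whole payload for every distinct row, and any downstream operator that inherits the key pays for it again.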
Solution?
More than one approach could mitigate this problem.
1. `union` by default (can be worked around by setting a session variable)
2. `JSONB` column by default (ditto)

Personally I prefer 2.
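The second approach, read here as rejecting a `JSONB` column in the stream key unless a session variable allows it, could be sketched as a planner-side check like the one below. This is a hypothetical illustration in Python, not RisingWave's implementation: the `Column` type, `StreamKeyError`, `check_stream_key`, and the `allow_jsonb_in_stream_key` flag (standing in for the session variable) are all invented names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Column:
    name: str
    data_type: str  # e.g. "int", "varchar", "jsonb"


class StreamKeyError(ValueError):
    """Raised when a stream key contains a disallowed column type."""


def check_stream_key(stream_key, allow_jsonb_in_stream_key=False):
    """Return the key unchanged, or raise if it contains a jsonb column.

    `allow_jsonb_in_stream_key` is a hypothetical stand-in for the
    session variable mentioned in the proposal.
    """
    offending = [c.name for c in stream_key if c.data_type.lower() == "jsonb"]
    if offending and not allow_jsonb_in_stream_key:
        raise StreamKeyError(
            "jsonb column(s) in stream key: " + ", ".join(offending)
        )
    return list(stream_key)


# A key on a plain integer column passes the check.
ok_key = check_stream_key([Column("k", "int")])

# A key that includes a jsonb column is rejected by default.
try:
    check_stream_key([Column("k", "int"), Column("payload", "jsonb")])
    rejected = False
except StreamKeyError:
    rejected = True

# With the override set, the same key is accepted.
overridden = check_stream_key(
    [Column("k", "int"), Column("payload", "jsonb")],
    allow_jsonb_in_stream_key=True,
)
```

Failing at plan time keeps the long key from silently propagating to downstream MVs, while the override preserves existing queries that rely on it.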