If one side of a join is an N-to-1 mapping (such as joining a dimension table on its PK), then neither the join key nor the stream key from that side needs to be included in the output stream key.
For example:
Assuming
PK of orders = [orders.id]
PK of customers = [customers.id]
Then for this query
select * from orders left join customers on orders.customer_id = customers.id
Because orders.customer_id = customers.id is an N-to-1 mapping (from left to right), the Join's (output) stream key can be [orders.id], instead of [orders.id, customers.id] or [orders.id, customers.id, orders.customer_id].
Thanks to @st1page for the correction, I was wrong. The join key must be included in the stream key under all circumstances.
Take a fact table to dimension table join as an example:
select * from orders left join customers on orders.customer_id = customers.id
If orders.customer_id is updated, the update generates a pair of U- and U+, which then becomes a separate - and +. These can be sent to 2 actors of the next fragment, say, a Materialize.
Only by including customer_id in the stream key can we prevent the + from arriving at Materialize before the -, which would cause a sanity check panic.
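To make the failure mode concrete, here is a minimal toy sketch in Python. It is not RisingWave code: `ToyMaterialize` and `replay` are hypothetical names, and the sanity check is assumed to reject an insert of an existing key or a delete of a missing key. It replays the `-`/`+` pair from the customer_id update in the reordered sequence (`+` first) under two candidate stream keys.

```python
class ToyMaterialize:
    """Hypothetical keyed store with a sanity check (a toy stand-in, not the real operator)."""

    def __init__(self, key_cols):
        self.key_cols = key_cols
        self.rows = {}

    def _key(self, row):
        # The stream key of a row is the tuple of its key column values.
        return tuple(row[c] for c in self.key_cols)

    def apply(self, op, row):
        key = self._key(row)
        if op == "+":
            if key in self.rows:
                raise RuntimeError(f"sanity check failed: insert of existing key {key}")
            self.rows[key] = row
        elif op == "-":
            if key not in self.rows:
                raise RuntimeError(f"sanity check failed: delete of missing key {key}")
            del self.rows[key]
        else:
            raise ValueError(f"unknown op {op!r}")


def replay(key_cols):
    """Apply an update of orders.customer_id (10 -> 20) with the + arriving before the -."""
    mv = ToyMaterialize(key_cols)
    old = {"orders.id": 1, "orders.customer_id": 10}
    new = {"orders.id": 1, "orders.customer_id": 20}
    mv.apply("+", old)   # initial state
    mv.apply("+", new)   # reordered: the + overtakes the -
    mv.apply("-", old)
    return mv.rows


# With the join key in the stream key, the two rows have distinct keys,
# so the reordering is harmless.
print(replay(["orders.id", "orders.customer_id"]))

# With only [orders.id], the early + collides with the old row.
try:
    replay(["orders.id"])
except RuntimeError as e:
    print(e)
```

In this toy model, the `-` and `+` carry different keys once customer_id is part of the stream key, so their relative order no longer matters to the check.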