fix(batch): fix hash join process some remaining chunk builder #15912
I suppose I need to cherry-pick this, because it is a follow-up to the previous fix: #15853
LGTM.
Can you please add a test case somewhere? The problematic SQL in the fuzz test is:
WITH with_0 AS (SELECT 'g60H9TgFkY' AS col_0 FROM (hop(auction, auction.date_time, INTERVAL '30', INTERVAL '540') AS hop_1 FULL JOIN hop(alltypes2, alltypes2.c11, INTERVAL '93', INTERVAL '1302') AS hop_2 ON hop_1.expires = hop_2.c11) FULL JOIN tumble(bid, bid.date_time, INTERVAL '12') AS tumble_3 ON hop_1.seller = tumble_3.price AND hop_2.c1 WHERE EXISTS (SELECT tumble_4.name AS col_0, hop_5.id AS col_1, (lower(max(DISTINCT 'wZCO4fLGwT' ORDER BY 'wZCO4fLGwT' NULLS FIRST))) AS col_2 FROM tumble(person, person.date_time, INTERVAL '18') AS tumble_4 LEFT JOIN hop(person, person.date_time, INTERVAL '33', INTERVAL '198') AS hop_5 ON tumble_4.credit_card = hop_5.name GROUP BY GROUPING SETS ((hop_5.date_time, hop_5.credit_card, tumble_4.city, tumble_4.state, tumble_4.date_time, hop_5.id), (hop_5.date_time, tumble_4.name, tumble_4.state, hop_5.extra), (tumble_4.name, hop_5.id, tumble_4.credit_card, tumble_4.email_address, tumble_4.state), (tumble_4.id, tumble_4.date_time, hop_5.state, tumble_4.email_address, hop_5.name), (hop_5.email_address, tumble_4.date_time, tumble_4.city, hop_5.id)) HAVING true) GROUP BY hop_2.c5, tumble_3.extra, tumble_3.bidder, hop_2.c13, hop_2.c15, hop_1.id, tumble_3.date_time, hop_2.c1, hop_2.c4, hop_1.expires, hop_1.item_name, hop_1.extra HAVING ((FLOAT '0') IS NOT NULL) LIMIT 7) SELECT tumble_6.c8 AS col_0, tumble_6.c11 AS col_1, (INTERVAL '-56') AS col_2 FROM with_0, tumble(alltypes1, alltypes1.c11, INTERVAL '13') AS tumble_6 WHERE (tumble_6.c11 = tumble_6.c11) GROUP BY tumble_6.c13, tumble_6.c8, tumble_6.c1, tumble_6.c11, tumble_6.c3 HAVING tumble_6.c1 ORDER BY tumble_6.c13 ASC NULLS FIRST
See https://buildkite.com/risingwavelabs/main-cron/builds/2154#018e7577-bd40-4e4f-ac86-a5ff0c4d0010 / fuzz test / logs / risedev-logs.tgz for full logs.
Yes, I have reproduced this bug in my local environment and confirmed that it is fixed. But I think the original test case is a little big for an e2e test.
Do we need to modify the left semi-join and full outer join as well? I remember that the previous PR, https://github.com/risingwavelabs/risingwave/pull/15853/files, modified 3 places.
Sounds like you have a more minimal reproducing case? That would be much better for a test case.
No, I just used the original case from the fuzz test 🥵
The three places in the previous PR are
The SQL fuzz test fails because of a different issue: a stack overflow in the frontend.
… (#15923) Co-authored-by: stonepage <[email protected]>
… (#15922) Co-authored-by: stonepage <[email protected]>
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
fix #15899
The process_xxx function is only needed for the chunk_builder, not for the remaining_chunk_builder.
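To illustrate the statement above, here is a toy sketch in Rust. It is not RisingWave's actual code: Chunk, ChunkBuilder, process and join_output are hypothetical stand-ins, and the sketch assumes that remaining_chunk_builder only re-batches rows that have already gone through process_xxx. Under that assumption, running the processing step again on its output changes the result.

```rust
/// Hypothetical stand-in for a data chunk: just a batch of rows.
type Chunk = Vec<i64>;

/// Toy chunk builder: buffers rows and emits a chunk once `capacity` rows are buffered.
struct ChunkBuilder {
    capacity: usize,
    buf: Vec<i64>,
}

impl ChunkBuilder {
    fn new(capacity: usize) -> Self {
        Self { capacity, buf: Vec::new() }
    }

    /// Appends one row; returns a full chunk when the buffer reaches capacity.
    fn append(&mut self, row: i64) -> Option<Chunk> {
        self.buf.push(row);
        if self.buf.len() == self.capacity {
            Some(std::mem::take(&mut self.buf))
        } else {
            None
        }
    }

    /// Flushes whatever is still buffered, if anything.
    fn consume(&mut self) -> Option<Chunk> {
        if self.buf.is_empty() {
            None
        } else {
            Some(std::mem::take(&mut self.buf))
        }
    }
}

/// Hypothetical `process_xxx`: keeps even rows and halves them.
/// It is deliberately not idempotent, so applying it twice is observable.
fn process(chunk: Chunk) -> Chunk {
    chunk.into_iter().filter(|r| r % 2 == 0).map(|r| r / 2).collect()
}

/// Runs the toy pipeline. With `reprocess_remaining = true` it mimics the bug:
/// chunks coming out of `remaining_chunk_builder` are processed a second time.
fn join_output(rows: &[i64], reprocess_remaining: bool) -> Vec<Chunk> {
    let mut chunk_builder = ChunkBuilder::new(4);
    let mut remaining_chunk_builder = ChunkBuilder::new(3);

    // Chunks from `chunk_builder` are raw, so they must be processed once.
    let mut raw_chunks = Vec::new();
    for &row in rows {
        if let Some(chunk) = chunk_builder.append(row) {
            raw_chunks.push(chunk);
        }
    }
    raw_chunks.extend(chunk_builder.consume());

    let mut out = Vec::new();
    for raw in raw_chunks {
        // Processed rows are re-batched by `remaining_chunk_builder`.
        for row in process(raw) {
            if let Some(full) = remaining_chunk_builder.append(row) {
                out.push(if reprocess_remaining { process(full) } else { full });
            }
        }
    }
    // The final flush holds already-processed rows as well.
    if let Some(rest) = remaining_chunk_builder.consume() {
        out.push(if reprocess_remaining { process(rest) } else { rest });
    }
    out
}
```

For the rows 1 through 8, join_output(&rows, false) emits the processed rows re-batched as [1, 2, 3] and [4], while join_output(&rows, true) filters and halves them a second time and emits [1] and [2].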
Checklist
./risedev check (or alias, ./risedev c)
Documentation
Release note
If this PR includes changes that directly affect users or other significant modifications relevant to the community, kindly draft a release note to provide a concise summary of these changes. Please prioritize highlighting the impact these changes will have on users.