Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: support expression in limit clause #19834

Merged
merged 15 commits into from
Dec 27, 2024
Merged

feat: support expression in limit clause #19834

merged 15 commits into from
Dec 27, 2024

Conversation

lmatz
Copy link
Contributor

@lmatz lmatz commented Dec 17, 2024

I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.

What's changed and what's your intention?

fix #19830

required by SQLMesh's CI test

The fix is to evaluate the expression during binding stage so that we either accept the result (it is a number >=0 after casting to bigint) or reject it (all the other cases)

Checklist

  • I have written necessary rustdoc comments.
  • I have added necessary unit tests and integration tests.
  • I have added test labels as necessary.
  • I have added fuzzing tests or opened an issue to track them.
  • My PR contains breaking changes.
  • My PR changes performance-critical code, so I will run (micro) benchmarks and present the results.
  • My PR contains critical fixes that are necessary to be merged into the latest release.

Documentation

  • My PR needs documentation updates.
Release note

@@ -655,7 +655,7 @@ impl fmt::Display for OrderByExpr {
#[cfg_attr(feature = "serde", derive(Serialize, Deserialize))]
pub struct Fetch {
pub with_ties: bool,
pub quantity: Option<String>,
pub quantity: Option<Expr>,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not as simple as it looks like. A parsing conflict can happen:

https://github.com/postgres/postgres/blob/REL_17_2/src/backend/parser/gram.y#L13233

* Allowing full expressions without parentheses causes various parsing
* problems with the trailing ROW/ROWS key words.  SQL spec only calls for
* <simple value specification>, which is either a literal or a parameter (but
* an <SQL parameter reference> could be an identifier, bringing up conflicts
* with ROW/ROWS). We solve this by leveraging the presence of ONLY (see above)
* to determine whether the expression is missing rather than trying to make it
* optional in this rule.

* c_expr covers almost all the spec-required cases (and more), but it doesn't
* cover signed numeric literals, which are allowed by the spec. So we include
* those here explicitly. We need FCONST as well as ICONST because values that
* don't fit in the platform's "long", but do fit in bigint, should still be
* accepted here. (This is possible in 64-bit Windows as well as all 32-bit
* builds.)

For example,

select 1 offset 1 row fetch first row only;
                                  ^

This is valid in PostgreSQL. When we see this row keyword, it should be part of the fetch first [n] row syntax where n is omitted, but it may be mistakenly treated as part of the row(field0, field1) expression because we will attempt to parse an expression (instead of a numerical constant) after first.

What is the actual parsing behavior when this commit sees this SQL?

Copy link
Contributor Author

@lmatz lmatz Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about let me revert the changes for "fetch" but keep the changes for "limit",
the CI test required by SQLMesh integration needs limit simple expression only.

let fetch still stick to a literal number

I will open a new issue for the reasoning posted above

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or we force parentheses around the expression in Fetch just like PG

SELECT id
FROM generate_series(1, 10) AS id
ORDER BY id
OFFSET 5 ROWS
FETCH NEXT (1+4) ROWS ONLY;

id 
----
  6
  7
  8
  9
 10
(5 rows)

SELECT id
FROM generate_series(1, 10) AS id
ORDER BY id
OFFSET 5 ROWS
FETCH NEXT 1+4 ROWS ONLY;

psql:commands.sql:11: ERROR:  syntax error at or near "+"
LINE 5: FETCH NEXT 1+4 ROWS ONLY;

Copy link
Contributor

@xiangjinwu xiangjinwu Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or we force parentheses around the expression in Fetch just like PG

PostgreSQL is more complex than this, but yes parenthesized expr is a subset of what PostgreSQL allows (c_expr as defined in the linked gram.y above, while limit allows a_expr. A includes C includes parenthesized expr.).

src/frontend/src/binder/query.rs Outdated Show resolved Hide resolved
src/frontend/src/binder/query.rs Outdated Show resolved Hide resolved
e2e_test/batch/order/test_limit.slt.part Show resolved Hide resolved
@graphite-app graphite-app bot requested a review from a team December 27, 2024 09:22
@lmatz lmatz added this pull request to the merge queue Dec 27, 2024
Merged via the queue into main with commit eca573d Dec 27, 2024
28 of 29 checks passed
@lmatz lmatz deleted the lz/limit_offset branch December 27, 2024 12:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Need to support expression evaluation after LIMIT
2 participants