Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable having clause to refer to an alias #7506

Open
lmatz opened this issue Jan 20, 2023 · 8 comments
Open

Enable having clause to refer to an alias #7506

lmatz opened this issue Jan 20, 2023 · 8 comments
Labels
type/enhancement Improvements to existing implementation.

Comments

@lmatz
Copy link
Contributor

lmatz commented Jan 20, 2023

Is your feature request related to a problem? Please describe.

SQL standard disallow alias in having clause and PG unable to do so, but it doesn’t hurt I suppose?

MySQL can do so.

Describe the solution you'd like

No response

Describe alternatives you've considered

No response

Additional context

No response

@lmatz lmatz added the type/enhancement Improvements to existing implementation. label Jan 20, 2023
@lmatz lmatz self-assigned this Jan 20, 2023
@github-actions github-actions bot added this to the release-0.1.16 milestone Jan 20, 2023
@lmatz lmatz removed their assignment Jan 20, 2023
@xxchan
Copy link
Member

xxchan commented Jan 20, 2023

MySQL can do so

Just found sqlite & duckdb also supports it.

it doesn’t hurt I suppose

But I still don't think it's worth diverging from Postgres. It's a road of no return. 🤪

@CAJan93
Copy link
Contributor

CAJan93 commented Jan 24, 2023

SELECT YEAR(orderdate), COUNT(*) as Amount
FROM Sales.Orders
GROUP BY YEAR(orderdate) 
HAVING Amount > 1; 

Quick example, just in case I am not the only SQL noob out there 😁

@lmatz
Copy link
Contributor Author

lmatz commented Jan 25, 2023

Just feel that the ultimate goal is letting people use RW smoothly,
and being compatible with PG(PG's syntax works in RW) is a proxy goal because if DuckDB suddenly becomes the most popular DB and PG becomes the least popular, it makes sense for us to be compatible with DuckDB.
And providing something extra does not prevent PG users from using their favorite syntax.

Not urgent, let's wait and see if there are more opinions on this, not just about this specific alias case but the general attitude towards adding more syntax(I suppose most cases do not conflict with PG) into RW.

@xxchan
Copy link
Member

xxchan commented Jan 25, 2023

My main concern is that adding new things may break other things in unexpected ways, especially for SQL, a complex and not well-designed language.

Anyway, such topic might unavoidably involve a lot of bikeshedding, and I agree we should decide the general attitude before talking about specific sugars.

@xiangjinwu
Copy link
Contributor

the ultimate goal is letting people use RW smoothly, and being compatible with PG is a proxy goal

agree

concern is that adding new things may break other things

also agree

Now to elaborate on this specific topic:

  1. We should allow aliases (and output ordinals) to be used in group by clause first, which is also supported by PostgreSQL. feat: support group by output column ordinal or name #4681

Using the example @CAJan93 provided above:

SELECT EXTRACT(YEAR FROM orderdate) as y, COUNT(*) as Amount
FROM Sales.Orders
GROUP BY y; -- cannot be expr in PostgreSQL. Invalid example: y + 1 

SELECT EXTRACT(YEAR FROM orderdate), COUNT(*) as Amount
FROM Sales.Orders
GROUP BY 1; -- this is not literal expr 1, but references the 1st output column
  1. As for the having clause, we also need to think about the rules carefully:
  • We cannot restrict it to a name at top level. It has to be part of a (boolean) expr, as in the example HAVING Amount > 1.
  • Output ordinals should not be supported. They should be treated as literal expressions.
  • When an input column and output column shares the same name, the ___ one takes priority.
  • When the name is used inside an agg call, it cannot be an alias. Invalid example: HAVING SUM(Amount) > 1
  • ... (Maybe more rules. Need to reference MySQL/DuckDB/SQLite.)

@github-actions
Copy link
Contributor

github-actions bot commented Apr 8, 2023

This issue has been open for 60 days with no activity. Could you please update the status? Feel free to continue discussion or close as not planned.

@xiangjinwu
Copy link
Contributor

Update: Just confirming the 2 points above are still valid:

@xiangjinwu
Copy link
Contributor

xiangjinwu commented Nov 27, 2023

One clear reason NOT to support using aliases in WHERE:

In #13653, we observe the following 2 queries have different semantics:

test=# create table t (id int, cat varchar, rule varchar, at timestamptz);
test=# insert into t values (1, 'foo', 'A', '2023-11-23T12:00:42Z'), (2, 'foo', 'B', '2023-11-23T12:01:15Z');

test=# select rule, lag(rule) over (partition by cat order by at) from t where rule = 'B';
 rule | lag 
------+-----
 B    | 
(1 row)

test=# select * from (select rule, lag(rule) over (partition by cat order by at) as prev_rule from t) as with_prev where rule = 'B';
 rule | prev_rule 
------+-----------
 B    | A
(1 row)

The following query would be misleading if alias was allowed in WHERE. It looks like WHERE filtering happens after OverWindow.

select rule as curr_rule, lag(rule) over (partition by cat order by at) from t where curr_rule = 'B';

Additional notes:

  • This issue was originally about HAVING. But DuckDB also allows usage in WHERE.
  • The same question about evaluation order applies to GROUP BY as well. Not sure if similar case with misleading semantic can be constructed. The concern about evaluation order was the reason of late support in GROUP BY (2023-06-14) compared to ORDER BY (2022-03-15).
  • lag is not allowed in WHERE.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/enhancement Improvements to existing implementation.
Projects
None yet
Development

No branches or pull requests

4 participants