Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat(SQL): Ordered-Set Aggregate Functions #10038

Closed
3 of 8 tasks
st1page opened this issue May 29, 2023 · 3 comments
Closed
3 of 8 tasks

Feat(SQL): Ordered-Set Aggregate Functions #10038

st1page opened this issue May 29, 2023 · 3 comments
Assignees
Milestone

Comments

@st1page
Copy link
Contributor

st1page commented May 29, 2023

Tracking

refers

https://www.postgresql.org/docs/9.4/sql-expressions.html#SYNTAX-AGGREGATES
https://www.postgresql.org/docs/9.4/functions-aggregate.html

Placing ORDER BY within the aggregate's regular argument list, as described so far, is used when ordering the input rows for a "normal" aggregate for which ordering is optional. There is a subclass of aggregate functions called ordered-set aggregates for which an order_by_clause is required, usually because the aggregate's computation is only sensible in terms of a specific ordering of its input rows. Typical examples of ordered-set aggregates include rank and percentile calculations. For an ordered-set aggregate, the order_by_clause is written inside WITHIN GROUP (...), as shown in the final syntax alternative above. The expressions in the order_by_clause are evaluated once per input row just like normal aggregate arguments, sorted as per the order_by_clause's requirements, and fed to the aggregate function as input arguments. (This is unlike the case for a non-WITHIN GROUP order_by_clause, which is not treated as argument(s) to the aggregate function.) The argument expressions preceding WITHIN GROUP, if any, are called direct arguments to distinguish them from the aggregated arguments listed in the order_by_clause. Unlike normal aggregate arguments, direct arguments are evaluated only once per aggregate call, not once per input row. This means that they can contain variables only if those variables are grouped by GROUP BY; this restriction is the same as if the direct arguments were not inside an aggregate expression at all. Direct arguments are typically used for things like percentile fractions, which only make sense as a single value per aggregation calculation. The direct argument list can be empty; in this case, write just () not (*). (PostgreSQL will actually accept either spelling, but only the first way conforms to the SQL standard.)An example of an ordered-set aggregate call is:

SELECT percentile_cont(0.5) WITHIN GROUP (ORDER BY income) FROM households;
 percentile_cont
-----------------
           50489

which obtains the 50th percentile, or median, value of the income column from table households. Here, 0.5 is a direct argument; it would make no sense for the percentile fraction to be a value varying across rows.

@st1page
Copy link
Contributor Author

st1page commented Dec 5, 2023

@TennyZhuang I heard that you want to use another algorithm to support 2-phase percentile?

@TennyZhuang
Copy link
Contributor

I just provide some information and don’t do enough investigation, it should be decided by @yuhao-su

@st1page st1page assigned TennyZhuang and unassigned st1page Jan 9, 2024
@st1page st1page modified the milestones: release-1.6, release-1.7 Jan 9, 2024
Copy link
Contributor

github-actions bot commented Aug 1, 2024

This issue has been open for 60 days with no activity.

If you think it is still relevant today, and needs to be done in the near future, you can comment to update the status, or just manually remove the no-issue-activity label.

You can also confidently close this issue as not planned to keep our backlog clean.
Don't worry if you think the issue is still valuable to continue in the future.
It's searchable and can be reopened when it's time. 😄

@st1page st1page self-assigned this Aug 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants