-
-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Custom media types + conversion for Content-Type
#2826
Comments
@wolfgangwalther Any feedback on this one? Doing it will involve a big refactor. So I'm wondering if you see anything wrong before I start. |
I did not have the time, yet, to follow along what exactly has been implemented for Assuming the above is correct, then:
The answer to those questions in the second paragraph is: Casts. Those functions are effectively implementing a cast, so we should create proper casts for them and detect them via casts. Then they can be located anywhere, don't conflict with RPCs, etc. I would suggest to do the function interface first, and then extend later on. |
Yes (for aggregates too (#2825 (comment))) and yes.
Agree, that sounds good. |
Oh, thinking more about the above. Wouldn't the interface for media type handlers be simpler too if it's defined via CASTs? So the example on https://postgrest.org/en/stable/references/api/media_type_handlers.html#handlers-for-tables-views would be: create or replace function twkb_handler_transition (state bytea, next lines)
returns bytea as $$
select state || st_astwkb(next.geom);
$$ language sql;
create or replace aggregate twkb_agg (lines) (
initcond = ''
, stype = bytea
, sfunc = twkb_handler_transition
);
CREATE OR REPLACE FUNCTION twkb(lines) RETURNS "application/vnd.twkb" AS $$
select twkb_agg($1);
$$ LANGUAGE SQL IMMUTABLE;
CREATE CAST (lines AS "application/vnd.twkb") WITH FUNCTION twkb(lines) AS IMPLICIT; That way the aggregate is an internal detail and the media type is not at all involved in its definition. I believe it's much more clear despite having to define one more function too. @wolfgangwalther WDYT? This would be a breaking change on current media type handlers, but not for functions only for tables (which doesn't seem to be used that much yet). |
Damn, the above makes sense but the current query doesn't work: WITH pgrst_source AS ( SELECT "test"."lines".* FROM "test"."lines")
SELECT
null::bigint AS total_result_set,
pg_catalog.count(_postgrest_t) AS page_total,
twkb(_postgrest_t::"test"."lines") AS body,
nullif(current_setting('response.headers', true), '') AS response_headers,
nullif(current_setting('response.status', true), '') AS response_status,
'' AS response_inserted
FROM ( SELECT * FROM pgrst_source ) _postgrest_t;
ERROR: column "_postgrest_t.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 5: twkb(_postgrest_t::"test"."lines") AS body, And an aggregate cannot be used as cast function: CREATE CAST (lines AS "application/vnd.twkb") WITH FUNCTION twkb_agg(lines) AS IMPLICIT;
ERROR: cast function must be a normal function |
The only way I see to make the CAST abstraction work is to scan the body of the |
Maybe that's not too bad, we can apply the function body directly like: WITH pgrst_source AS ( SELECT "test"."lines".* FROM "test"."lines")
SELECT
null::bigint AS total_result_set,
pg_catalog.count(_postgrest_t) AS page_total,
(select twkb_agg(_postgrest_t::"test"."lines")) AS body,
nullif(current_setting('response.headers', true), '') AS response_headers,
nullif(current_setting('response.status', true), '') AS response_status,
'' AS response_inserted
FROM ( SELECT * FROM pgrst_source ) _postgrest_t; That also lets apply functions on top of the aggregation, which is not possible with the current interface. This makes it possible to do The any handler interface would have to change though, since the aggregate final function would no longer be used for setting the response headers. |
The "correct" way to do this, would be to use -- note the argument type is now an array
CREATE OR REPLACE FUNCTION twkb(lines[]) RETURNS "application/vnd.twkb" AS $$
-- whatever you need to do to the array
$$ LANGUAGE SQL IMMUTABLE;
CREATE CAST (lines[] AS "application/vnd.twkb") WITH FUNCTION twkb(lines[]) AS IMPLICIT; WITH pgrst_source AS ( SELECT "test"."lines".* FROM "test"."lines")
SELECT
null::bigint AS total_result_set,
pg_catalog.count(_postgrest_t) AS page_total,
twkb(array_agg(_postgrest_t::"test"."lines"))) AS body,
nullif(current_setting('response.headers', true), '') AS response_headers,
nullif(current_setting('response.status', true), '') AS response_status,
'' AS response_inserted
FROM ( SELECT * FROM pgrst_source ) _postgrest_t; That means the user can either provide custom aggregates, probably mostly useful when implemented in C for performance (i.e. pass through the full response once instead of twice with the cast). Or they can provide the cast, in which case we still do the aggregation for them beforehand, but in a type-preserving way via |
How would that work? I'm guessing like: CREATE OR REPLACE FUNCTION twkb(lines[]) RETURNS "application/vnd.twkb" AS $$
select twkb_agg(x) from unnest($1) x;
$$ LANGUAGE SQL IMMUTABLE; Wouldn't it be wasteful to aggregate the array ( AFAICT, operations over arrays on pg always require unnesting, there's no
Hm, ideally handling both content-type and accept would use the same interface with CASTs. Otherwise this complicates the docs and the implementation. Currently it doesn't look good how we mix the media type with the aggregate definition. |
What makes you think that unnest/agg is less efficient than a map function if that was implemented? I have no idea whether it's inefficient, but unnest/agg just seems to be idiomatic SQL code. So certainly not bad. Also note that this is a special case for a cast function written in SQL. For example the function given in #3160 (comment) would be much more efficient with array_agg. Currently it uses a poor-man's aggregation via json/jsonb, but that's just because that's the only way to do it right now. An aggregation via array_agg would be much better - and then the python code doesn't care how it receives the data.
Agreed. But the example in #2826 (comment) is insufficient: CREATE FUNCTION convert_projects("application/json") RETURNS projects AS $$
-- ...
$$ LANGUAGE SQL; This would only ever allow a single project to be returned. What if the json document contained multiple projects? I'm quite sure that a cast can't return a This is exactly the opposite of So basically:
Very symmetric. |
Problem
Currently we allow handling raw input for
application/json
,text/xml
,text/plain
,application/octet-stream
on functions(ref).However it's inflexible, no other custom media types can be handled. It also doesn't work for tables/views.
Solution
Use a domain-based media type as a function unnamed parameter to do custom conversion. Unlike ref this can work for both functions and tables/views.
Functions interface
Tables/views interface
e.g. the first function will be used when doing:
This means we'll use the unnamed parameter + return type to apply a conversion function for a particular table/view.
Inside the function a
json_to_recordset
can be used (this is what we do internally). With this interface the user can override our default behavior.Thanks to overloaded functions the same function name can be used to handle different media types.
Notes
pgrst.accept
setting to RPC #1582 and Data Representations #2523. (long threads) .Related
The text was updated successfully, but these errors were encountered: