Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug: pg_sequence not found when write data with polars #15934

Closed
jetjinser opened this issue Mar 26, 2024 · 1 comment
Closed

bug: pg_sequence not found when write data with polars #15934

jetjinser opened this issue Mar 26, 2024 · 1 comment
Labels
type/bug Something isn't working
Milestone

Comments

@jetjinser
Copy link
Contributor

jetjinser commented Mar 26, 2024

Describe the bug

RisingWave does not have the pg_sequence view. When polars using method Dataframe.write_database to writes data and specifies if_table_exists as replace, the generated sql will use this view, which leads to the error.

Error message/log

2024-03-27T03:33:06.617702+08:00 ERROR        rw-main handle_query{mode="simple query" session_id=1 sql=SELECT pg_catalog.pg_attribute.attname AS name, pg_catalog.format_type(pg_catalog.pg_attribute.atttypid, pg_catalog.pg_attribute.atttypmod) AS format_type, (SELECT pg_catalog.pg_get_expr(pg_catalog.pg_attrdef.adbin, pg_catalog.pg_attrdef.adrelid) AS pg_get_expr_1
FROM pg_catalog.pg_attrdef
WHERE pg_catalog.pg_attrdef.adrelid = pg_catalog.pg_attribute.attrelid AND pg_catalog.pg_attrdef.adnum = pg_catalog.pg_attribute.attnum AND pg_catalog.pg_attribute.atthasdef) AS "default", pg_catalog.pg_attribute.attnotnull AS not_null, pg_catalog.pg_class.relname AS table_name, pg_catalog.pg_description.description AS comment, pg_catalog.pg_attribute.attgenerated AS generated, (SELECT json_build_object('always', pg_catalog.pg_attribute.attidentity = 'a', 'start', pg_catalog.pg_sequence.seqstart, 'increment', pg_catalog.pg_sequence.seqincrement, 'minvalue', pg_catalog.pg_sequence.seqmin, 'maxvalue', pg_catalog.pg_sequence.seqmax, 'cache', pg_catalog.pg_sequence.seqcache, 'cycle', pg_catalog.pg_sequence.seqcycle) AS json_build_object_1
FROM pg_catalog.pg_sequence
WHERE pg_catalog.pg_attribute.attidentity != '' AND pg_catalog.pg_sequence.seqrelid = CAST(CAST(pg_catalog.pg_get_serial_sequence(CAST(CAST(pg_catalog.pg_attribute.attrelid AS REGCLASS) AS TEXT), pg_catalog.pg_attribute.attname) AS REGCLASS) AS OID)) AS identity_options
FROM pg_catalog.pg_class LEFT OUTER JOIN pg_catalog.pg_attribute ON pg_catalog.pg_class.oid = pg_catalog.pg_attribute.attrelid AND pg_catalog.pg_attribute.attnum > 0 AND NOT pg_catalog.pg_attribute.attisdropped LEFT OUTER JOIN pg_catalog.pg_description ON pg_catalog.pg_description.objoid = pg_catalog.pg_attribute.attrelid AND pg_catalog.pg_description.objsubid = pg_catalog.pg_attribute.attnum JOIN pg_catalog.pg_namespace ON pg_catalog.pg_namespace.oid = pg_catalog.pg_class.relnamespace
WHERE pg_catalog.pg_class.relkind = ANY (ARRAY['r', 'p', 'f', 'v', 'm']) AND pg_catalog.pg_table_is_visible(pg_catalog.pg_class.oid) AND pg_catalog.pg_namespace.nspname != 'pg_catalog' AND pg_catalog.pg_class.relname IN ('taxi_zones') ORDER BY pg_catalog.pg_class.relname, pg_catalog.pg_attribute.attnum}: pgwire::pg_protocol: error when process message error=Failed to run the query: Failed to bind expression: (SELECT json_build_object('always', pg_catalog.pg_attribute.attidentity = 'a', 'start', pg_catalog.pg_sequence.seqstart, 'increment', pg_catalog.pg_sequence.seqincrement, 'minvalue', pg_catalog.pg_sequence.seqmin, 'maxvalue', pg_catalog.pg_sequence.seqmax, 'cache', pg_catalog.pg_sequence.seqcache, 'cycle', pg_catalog.pg_sequence.seqcycle) AS json_build_object_1 FROM pg_catalog.pg_sequence WHERE pg_catalog.pg_attribute.attidentity <> '' AND pg_catalog.pg_sequence.seqrelid = CAST(CAST(pg_catalog.pg_get_serial_sequence(CAST(CAST(pg_catalog.pg_attribute.attrelid AS REGCLASS) AS TEXT), pg_catalog.pg_attribute.attname) AS REGCLASS) AS OID)): Feature is not yet implemented: pg_catalog.pg_sequence is not supported, please use `SHOW` commands for now.
`SHOW TABLES`,
`SHOW MATERIALIZED VIEWS`,
`DESCRIBE <table>`,
`SHOW COLUMNS FROM [table]`

Tracking issue: https://github.com/risingwavelabs/risingwave/issues/1695

To Reproduce

Run the following code twice.

import polars as pl

def send_csv_records(
    endpoint: str,
    tbl_name: str,
    conn_string: str,
    write_disposition: str = "replace",
    engine: str = "sqlalchemy",
):
    df = pl.read_csv(endpoint).rename(
        mapping={
            "LocationID": "location_id",
            "Borough": "borough",
            "Zone": "zone",
            "service_zone": "service_zone",
        }
    )

    return df.write_database(
        table_name=tbl_name,
        connection=conn_string,
        if_table_exists=write_disposition,  # type: ignore
        engine=engine,  # type: ignore
    )

if __name__ == "__main__":
    rw_host = "localhost"
    rw_port = 4566
    rw_db = "dev"
    rw_user = "root"

    num_records = send_csv_records(
        # endpoint="https://s3.amazonaws.com/nyc-tlc/misc/taxi+_zone_lookup.csv",
        endpoint="./taxi+_zone_lookup.csv",
        tbl_name="taxi_zones",
        conn_string=f"postgresql://{rw_user}@{rw_host}:{rw_port}/{rw_db}",
    )

    print(num_records)

The original code comes from: https://github.com/iobruno/data-engineering-zoomcamp/blob/80f2e1a2ff881cb5dfa174144c5ef9ae7c9e8f09/module6-stream-processing/risingwave/seed.py

Expected behavior

No response

How did you deploy RisingWave?

Build from source.

The version of RisingWave

e3f3fb8f89bbb4879855297f65965fdcc4f8eb9d

Additional context

No response

@jetjinser jetjinser added the type/bug Something isn't working label Mar 26, 2024
@github-actions github-actions bot added this to the release-1.8 milestone Mar 26, 2024
@jetjinser jetjinser changed the title pg_sequence not found when write data to RisingWave with polars bug: pg_sequence not found when write data with polars Mar 26, 2024
@jetjinser
Copy link
Contributor Author

Closed because risingwavelabs/sqlalchemy-risingwave#29 was merged and the problem was solved.

To get polars working, pip install sqlalchemy-risingwave and then connect using conn_string=f"risingwave://{rw_user}@{rw_host}:{rw_port}/{rw_db}".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant