Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bug(duckdb): StructColumns are compiled to untyped ROW, which aren't CAST correctly #10417

Closed
1 task done
NickCrews opened this issue Nov 2, 2024 · 4 comments
Closed
1 task done
Labels
bug Incorrect behavior inside of ibis
Milestone

Comments

@NickCrews
Copy link
Contributor

NickCrews commented Nov 2, 2024

What happened?

import ibis
import sqlglot

sqlglot.__version__  # '25.20.2'

s1 = ibis.struct({"i": 1, "s": "foo"})
s2 = ibis.struct({"i": ibis.literal(1), "s": ibis.literal("foo")})
e1 = s1.cast("struct<s: string, i: int32>").name("x")
e2 = s2.cast("struct<s: string, i: int32>").name("x")

e1.execute()  # Works
s1.op()  # Literal
print(ibis.to_sql(e1))
# SELECT CAST(CAST(ROW(1, 'foo') AS STRUCT("i" TINYINT, "s" TEXT)) AS STRUCT("s" TEXT, "i" INT)) AS "x"

e2.execute()  # ConversionException: Conversion Error: Could not convert string 'foo' to INT32
s1.op()  # StructColumn
print(ibis.to_sql(e2))
# SELECT CAST(ROW(1, 'foo') AS STRUCT("s" TEXT, "i" INT)) AS "x"

I think the solution would be to add the explicit CAST to the StructColumn compilation, to match how the Literal is compiled? It obviously isn't always needed, based on how all tests are currently passing, but it appears to be context-dependent on where the val is getting used, and I think I would like to just remove any possibility of bugs, even if it sometimes more verbose than needed.

It also could be considered a bug in sqlglot, because in the sqlglot IR, there is enough info to see what the correct thing to do is:

Cast(
    this=Struct(
        expressions=[
            PropertyEQ(
                this=Identifier(this="i", quoted=True),
                expression=Literal(this=1, is_string=False),
            ),
            PropertyEQ(
                this=Identifier(this="i", quoted=True),
                expression=Literal(this="foo", is_string=True),
            ),
        ]
    ),
    to=DataType(
        this=Type.STRUCT,
        expressions=[
            ColumnDef(
                this=Identifier(this="s", quoted=True), kind=DataType(this=Type.VARCHAR)
            ),
            ColumnDef(
                this=Identifier(this="i", quoted=True), kind=DataType(this=Type.INT)
            ),
        ],
        nested=True,
    ),
    copy=False,
)

eg is sqlglot's implementation for compiling Cast was really clever, they would see this reordering and fix it.

What version of ibis are you using?

main

What backend(s) are you using, if any?

duckdb, IDK if it is present in others.

Relevant log output

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@NickCrews NickCrews added the bug Incorrect behavior inside of ibis label Nov 2, 2024
@cpcloud
Copy link
Member

cpcloud commented Nov 2, 2024

With any sqlglot related issues, please show your sqlglot version.

@NickCrews
Copy link
Contributor Author

sqlglot.__version__=="25.20.2". Just added to the post above as well

@NickCrews
Copy link
Contributor Author

I filed an issue with sqlglot, I think the issue should be fixed there, curious what they will say though.

@cpcloud
Copy link
Member

cpcloud commented Nov 11, 2024

Looks like this was fixed upstream, so I'm going to close it out here.

@cpcloud cpcloud closed this as completed Nov 11, 2024
@github-project-automation github-project-automation bot moved this from backlog to done in Ibis planning and roadmap Nov 11, 2024
@cpcloud cpcloud added this to the 10.0 milestone Nov 11, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Incorrect behavior inside of ibis
Projects
Status: done
Development

No branches or pull requests

2 participants