Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return column structure in case of empty resultset #257

Open
zerafachris opened this issue Oct 23, 2023 · 3 comments
Open

Return column structure in case of empty resultset #257

zerafachris opened this issue Oct 23, 2023 · 3 comments
Labels
enhancement New feature or request

Comments

@zerafachris
Copy link

zerafachris commented Oct 23, 2023

Describe the bug

Return column structure in case of empty resultset

Steps to reproduce

  1. Use the following query "SELECT 1 as 'column1' LIMIT 0"

Expected behaviour

Have a new setting, which will change behaviour in case of an empty result set being returned.

In case of pandas, driver.query_df(ch_query) should return pd.DataFrame({'column1' : [] })

This was resolved in the previous driver here ClickHouse/ClickHouse#33522

@zerafachris zerafachris added the bug Something isn't working label Oct 23, 2023
@genzgd genzgd added enhancement New feature or request and removed bug Something isn't working labels Oct 23, 2023
@genzgd
Copy link
Collaborator

genzgd commented Oct 23, 2023

There is no easy way to do this using the HTTP interface, as ClickHouse doesn't return the column name or types in Native Format when the result set is empty. The driver would have to use a different format (such as JSONEachRow) for the query in order to be able to build the column structure. Note that the native interface used in clickhouse-driver works differently and does return an "empty block" with the column structure even for empty queries, so the implementation is more straightforward in that case.

If you know ahead of time that the query result will be empty (for example, with your LIMIT 0 example), maybe there could be something like a "structure only" query that would use the JSONEachRow format over HTTP.

@zerafachris
Copy link
Author

zerafachris commented Oct 23, 2023

Hi @genzgd , Thanks for the reply.
Using query_arrow is am able to return the columns, whilst using 'query_df' the column set is empty. Maybe we can use a partial implementation from pyarrow to get the column lists?

I have tagged the part of code which is returning the empty dataframe and would require an update here #119

@genzgd
Copy link
Collaborator

genzgd commented Oct 23, 2023

Eventually it may make sense to switch the Pandas code to use Arrow instead of the ClickHouse Native format, in particular for Pandas 2.x, which is now backed by Arrow, but that's a pretty significant change. In the meantime I've opened a feature request ClickHouse/ClickHouse#55939 that would make this possible without major code changes, but I don't know if and when the ClickHouse team will get to it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants