Array support #138

Open
jontxu opened this issue Dec 8, 2022 · 0 comments


jontxu commented Dec 8, 2022

I have been working on adding array support to VOLLT lately and have partially succeeded:

  • What works: queries using array columns, and UDFs that accept arrays as parameters and return values.
  • What doesn't work: array element access (myarray[1] or myarray[1:4]) in ADQL queries.

Some implementation notes:

  1. Of the DBMSs supported by VOLLT, only PostgreSQL and H2 (used in testing) support array columns, although with different syntax and functionality.
  2. The names of array types differ between DBMSs: a real array is REAL ARRAY in H2 and _real in PostgreSQL. The java.sql.Types class can be used to obtain the type in a DBMS-agnostic manner, but it seems to report only the array type itself (i.e., for a real array, type == Types.ARRAY) and not the element type. An extra step is needed to obtain the base type, using either getColumnTypeName (and stripping the underscore or ARRAY) or getBaseTypeName.
  3. java.sql.Array fields are returned as Object arrays, but because STIL is used underneath for output, they need to be converted into their primitive counterparts, where null values aren't allowed. This has been achieved with Apache Commons Lang3 (ArrayUtils and StringUtils).
  4. Both DBType and VotType need an arraysize so that outputs can be handled accordingly.
  5. For UDFs, the regexes have been updated to allow a parameter or return type to end with \[([0-9]+)?\], which is then parsed so that the arraysize is respected in both input and output.
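To make notes 2, 3 and 5 more concrete, here is a minimal sketch in plain Java. It is not the code from my branch: the class and method names are hypothetical, the null-to-NaN conversion is a JDK-only stand-in for what ArrayUtils.toPrimitive does in Commons Lang3, and reporting an unsized array as "*" (the VOTable notation for variable length) is an assumption about how the arraysize would be carried.

```java
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper; none of these names exist in VOLLT. */
final class ArrayTypeSketch {

    // A type name optionally followed by [] or [N], e.g. "double[3]".
    private static final Pattern ARRAY_SUFFIX =
            Pattern.compile("^(.*?)\\s*\\[([0-9]+)?\\]$");

    /** Note 2: strips the PostgreSQL "_" prefix or the H2 " ARRAY" suffix. */
    static String baseTypeName(String columnTypeName) {
        String name = columnTypeName.trim().toLowerCase(Locale.ROOT);
        if (name.startsWith("_")) {
            return name.substring(1);
        }
        if (name.endsWith(" array")) {
            return name.substring(0, name.length() - " array".length()).trim();
        }
        return name;
    }

    /**
     * Note 5: returns null for a scalar type, "*" for an unsized array ("[]"),
     * or the declared size as a string ("3" for "double[3]").
     */
    static String arraySize(String udfType) {
        Matcher m = ARRAY_SUFFIX.matcher(udfType.trim());
        if (!m.matches()) {
            return null;
        }
        return m.group(2) == null ? "*" : m.group(2);
    }

    /**
     * Note 3: converts boxed numbers to a primitive array.
     * Assumption: null elements are mapped to NaN, since primitives cannot hold null.
     */
    static double[] toPrimitiveDoubles(Object[] values) {
        double[] out = new double[values.length];
        for (int i = 0; i < values.length; i++) {
            out[i] = values[i] == null ? Double.NaN : ((Number) values[i]).doubleValue();
        }
        return out;
    }
}
```

With this sketch, baseTypeName("_real") and baseTypeName("REAL ARRAY") both give "real", and arraySize("double[3]") gives "3".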

As for array element access, I am having some difficulty adding support. Looking into the code, I think it is a special (for lack of a better word) case of ADQLColumn, or perhaps an operation, in which you request the n-th element, or the n-th to m-th elements, of an existing array database column. The only significant difference I see is that the arraysize will differ from the one in the metadata (so arraysize could no longer be final).

It might be solved by adding a new rule such as ARRAY_ELEMENT_ACCESS, which would take care of this on-the-fly (so to speak) update of the metadata, for array columns only... but I'm not sure.
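To illustrate the metadata update I have in mind, here is a rough, hypothetical sketch of how the resulting arraysize could be derived from an access expression. It assumes PostgreSQL-style semantics (1-based indices, inclusive slice bounds), does no checking against the column's declared size, and works on the raw text rather than on parser nodes; the class and method names are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch; not part of VOLLT or its ADQL grammar. */
final class ElementAccessSketch {

    // Trailing [n] or [n:m], as in myarray[1] or myarray[1:4].
    private static final Pattern ACCESS =
            Pattern.compile("\\[([0-9]+)(?::([0-9]+))?\\]$");

    /**
     * Returns the arraysize of the accessed value: null when a single element
     * is requested (scalar result), otherwise the number of elements in the slice.
     * Assumption: bounds are 1-based and inclusive, as in PostgreSQL.
     */
    static String resultArraySize(String expression) {
        Matcher m = ACCESS.matcher(expression.trim());
        if (!m.find()) {
            throw new IllegalArgumentException("No element access in: " + expression);
        }
        if (m.group(2) == null) {
            return null; // myarray[n] yields one element, so no arraysize
        }
        int from = Integer.parseInt(m.group(1));
        int to = Integer.parseInt(m.group(2));
        if (to < from) {
            throw new IllegalArgumentException("Empty slice in: " + expression);
        }
        return Integer.toString(to - from + 1);
    }
}
```

Under these assumptions myarray[1:4] would be reported with an arraysize of 4, while myarray[1] would be reported as a scalar of the base type.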
