Revisit regular expression functions #134

Open
kemp-verily opened this issue Jun 9, 2021 · 3 comments
Labels
feature New feature or request

Comments

@kemp-verily
Contributor

As noted by @jfuerth, these functions are very useful for "just in time" harmonization.

We may want to look at:

  • adding some polyfill style UDFs to pave over differences
  • listing them, with suggestions about known differences
@jfuerth
Contributor

jfuerth commented Jun 9, 2021

Here is an example of a UDF in MySQL that polyfills json_extract_scalar (not relevant to regex, but illustrates the concept):

create function json_extract_scalar(json_doc JSON, json_path TEXT)
returns VARCHAR(512)
deterministic
return json_unquote(json_extract(json_doc, json_path));

Then:

select json_extract_scalar(CAST('{"hello":"yes"}' AS JSON), '$');       -- returns '{"hello": "yes"}'
select json_extract_scalar(CAST('{"hello":"yes"}' AS JSON), '$.hello'); -- returns 'yes'
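
For a regex analog of the same idea, here is a minimal sketch. It assumes SQLite driven from Python's `sqlite3` module, which is not one of the platforms discussed in this thread; it is used only because it has no built-in regex function, so the gap is easy to demonstrate. The name `regexp_like` and the helper `connect_with_polyfills` are hypothetical, and the matching semantics are those of Python's `re.search`, which will not agree with every engine's regex dialect.

```python
import re
import sqlite3


def regexp_like(value, pattern):
    """Return 1 if pattern matches anywhere in value, 0 if not, NULL on NULL input."""
    if value is None or pattern is None:
        return None
    return 1 if re.search(pattern, value) else 0


def connect_with_polyfills(path=":memory:"):
    """Open a connection and register the hypothetical regexp_like polyfill."""
    conn = sqlite3.connect(path)
    # Assumption: the function is pure, so it is safe to mark it deterministic.
    conn.create_function("regexp_like", 2, regexp_like, deterministic=True)
    return conn


conn = connect_with_polyfills()
rows = conn.execute(
    "select regexp_like('chr12:1000', '^chr[0-9]+:'),"
    "       regexp_like('scaffold_7', '^chr[0-9]+:')"
).fetchall()
print(rows)  # prints [(1, 0)]
```

Queries written against `regexp_like` can then stay the same across backends, with only the registration step differing per platform. Dialect differences in the patterns themselves are not addressed by this sketch.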

@NehaAr

NehaAr commented Jan 13, 2023

If this is about optimising SQL queries that use regular expressions, we could try improving the tables' indexing scheme, for example with multigram indexes, which optimise the search operation.

@jfuerth
Contributor

jfuerth commented Jan 16, 2023

Thanks Neha. We have also had success using n-gram indexes to speed up pattern matching in a few of our implementations.
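
For anyone unfamiliar with the technique, here is a rough sketch of how an n-gram index can narrow a pattern search. It is an illustration only, not the scheme used in any of the implementations mentioned above. `TrigramIndex` and the sample rows are hypothetical, and the caller is assumed to supply a literal substring that every match must contain; deriving that literal from an arbitrary regex is out of scope here.

```python
import re
from collections import defaultdict


def trigrams(text):
    """Return the set of all 3-character substrings of text."""
    return {text[i:i + 3] for i in range(len(text) - 2)}


class TrigramIndex:
    """Hypothetical inverted index from trigram to the ids of rows containing it."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.postings = defaultdict(set)
        for row_id, text in enumerate(self.rows):
            for gram in trigrams(text):
                self.postings[gram].add(row_id)

    def candidates(self, required_literal):
        """Row ids containing every trigram of a literal the pattern must match."""
        grams = trigrams(required_literal)
        if not grams:  # literal shorter than 3 characters: nothing to narrow by
            return set(range(len(self.rows)))
        return set.intersection(*(self.postings.get(g, set()) for g in grams))

    def search(self, pattern, required_literal):
        """Run the regex only on candidate rows instead of scanning every row."""
        regex = re.compile(pattern)
        return [self.rows[i]
                for i in sorted(self.candidates(required_literal))
                if regex.search(self.rows[i])]


# Made-up sample rows for illustration.
index = TrigramIndex(["BRCA1 variant", "BRCA2 variant", "TP53 deletion", "brca1 lower"])
print(index.candidates("BRCA"))                # prints {0, 1}
print(index.search(r"BRCA[12] var", "BRCA"))   # prints ['BRCA1 variant', 'BRCA2 variant']
```

The index only produces candidates; the regex still runs on each candidate row to confirm the match, so results are the same as a full scan whenever the supplied literal really is required by the pattern.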

I think the original reason for eliminating regular expression functions from the SQL grammar in the spec was to reduce the barrier to new implementations on various database platforms (MySQL, MS SQL Server, Oracle, various cloud databases, and so on).

I'm curious: would a standardized requirement for JSON functions in Data Connect help with your use case?

@mcupak mcupak removed this from the 1.1.0 milestone Dec 6, 2023
@mcupak mcupak added the feature New feature or request label Dec 6, 2023