Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add is valid sql descriptor #1332

Open
wants to merge 10 commits into
base: main
Choose a base branch
from

Conversation

Sifr-un
Copy link
Contributor

@Sifr-un Sifr-un commented Oct 4, 2024

Implementing a SQL validator for LLM responses as a new descriptor.
Resolves #1321

@DimaAmega
Copy link
Collaborator

Hi @Rayryu

You have some CI failed jobs, please check it

@Sifr-un
Copy link
Contributor Author

Sifr-un commented Oct 7, 2024

DimaAmega

I've pushed a typo fix.

@Sifr-un
Copy link
Contributor Author

Sifr-un commented Oct 8, 2024

I see that the minimal requirements CI step has failed. I've now added sqlvalidator to the requirements file 👌

from typing import ClassVar
from typing import Optional

import sqlvalidator
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's move this import to the beginning of the is_valid_sql function

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes made. What was causing the issue?

Copy link
Collaborator

@DimaAmega DimaAmega Oct 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now all the UI checks are being passed in CI 😊

It's a bit tricky: other fix is to add llm extra to install step in these tests but llm extra at the same time is kinda overkill and unnecessary

return self.is_valid_sql(value)

def is_valid_sql(self, query: str) -> bool:
queries = query.strip().split(";") # Split by semicolon
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's consider the following sql

SELECT * FROM your_table WHERE your_column LIKE '%;%';

or

SELECT CONCAT('Hello', '; ', 'World') AS greeting_with_semicolon;

we will get invalid in your implementation but it is actually valid sql

Do we actually need to split by ;?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The split by ; is a bit of a stretch since the column could contain multiple SQL queries. However, we can stick to the basic usage and treat the column values as single queries.
How do we view the scope of this feature?

@mike0sv
Copy link
Collaborator

mike0sv commented Oct 14, 2024

Hi! To fix mypy you need to add ignore block to setup.cfg in repo root. You can check out mypy section for examples

@Sifr-un
Copy link
Contributor Author

Sifr-un commented Oct 14, 2024

Hi! To fix mypy you need to add ignore block to setup.cfg in repo root. You can check out mypy section for examples

Thank you @mike0sv for chiming in. I've added a sqlvalidator block in setup.cfg.

@emeli-dral
Copy link
Contributor

Hi @Rayryu,

It looks like we're almost there!

Could you please add the new descriptor to the list of descriptors under the TextEvals section on the documentation page? The file that needs to be updated is located here: all-metrics.md#text-evals.

Once that’s done, we’ll be all set to proceed with the merge!

@emeli-dral emeli-dral added the hacktoberfest Accepted contributions will count towards your hacktoberfest PRs label Oct 27, 2024
@emeli-dral
Copy link
Contributor

Hi @Sifr-un ,
do you need some help with adding IsValidSQL to the list of descriptors under the TextEvals section all-metrics.md#text-evals?

@emeli-dral emeli-dral added the hacktoberfest-accepted PR accepted for hacktoberfest, which are supposed to be merged later label Oct 28, 2024
@vectorvp
Copy link

@emeli-dral, how can I help to speed this up? Merge the changes from this branch and prepare new PR?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
hacktoberfest Accepted contributions will count towards your hacktoberfest PRs hacktoberfest-accepted PR accepted for hacktoberfest, which are supposed to be merged later
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a new IsValidSQL() descriptor to Evidently
5 participants