Code for evaluating open source embedding models #305

Merged: ihis-11 merged 3 commits into main from herve/best-open-source-embedding-model on Dec 19, 2024

@ihis-11 ihis-11 commented Dec 13, 2024

This is the code for the blog post about finding the best open-source embedding model. Please start with the Jupyter notebook, which explains how to run the code.

Thank you for the review!

@ihis-11 ihis-11 requested a review from a team as a code owner December 13, 2024 21:20

@Askir Askir left a comment

I got stuck sadly :( But it does look promising overall.

"metadata": {},
"outputs": [],
"source": [
"with connect_db() as conn:\n",

Member

Instead of loading the data from a csv file (and leaving the user to figure that out), try this:

SELECT ai.load_dataset('sgoel9/paul_graham_essays', table_name => 'essays');

Member

Addendum:

SELECT ai.load_dataset('sgoel9/paul_graham_essays', table_name => 'essays', if_table_exists => 'append');
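
For the notebook, the two SQL suggestions above could be issued from Python roughly as follows. This is a minimal sketch, not the code that was merged: `build_load_dataset_query` is a hypothetical helper, the `%s` placeholders assume a psycopg-style driver, and the set of accepted `if_table_exists` values is an assumption to check against the pgai documentation.

```python
# Hypothetical helper: builds the statement only, it does not touch a database.
# Assumed set of accepted values; verify against the pgai docs for your version.
ALLOWED_IF_TABLE_EXISTS = {"error", "append", "drop"}


def build_load_dataset_query(name, table_name, if_table_exists=None):
    """Return (sql, params) for a parameterized ai.load_dataset call."""
    sql = "SELECT ai.load_dataset(%s, table_name => %s"
    params = [name, table_name]
    if if_table_exists is not None:
        if if_table_exists not in ALLOWED_IF_TABLE_EXISTS:
            raise ValueError(f"unsupported if_table_exists: {if_table_exists!r}")
        sql += ", if_table_exists => %s"
        params.append(if_table_exists)
    sql += ")"
    return sql, tuple(params)


sql, params = build_load_dataset_query(
    "sgoel9/paul_graham_essays", "essays", if_table_exists="append"
)

# Possible use inside the notebook, assuming connect_db() returns a
# psycopg-style connection:
#
# with connect_db() as conn:
#     with conn.cursor() as cur:
#         cur.execute(sql, params)
```

Passing the dataset and table names as parameters rather than formatting them into the string keeps the cell safe to reuse with other datasets.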

Contributor

👍 for using this, but it then also needs pgai 0.6.0, not 0.5.0.

Contributor Author

I actually couldn't find the function when using pgai 0.6.0 yesterday. @JamesGuthrie
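
One way to narrow down a mismatch like this is to check which extension version is actually installed and whether the function is present. The sketch below is an assumption-laden illustration: the two queries use standard PostgreSQL catalogs and assume the extension is installed under the name `ai`, the 0.6.0 minimum is taken from the comment above and is not confirmed by this thread, and `parse_version` and `meets_minimum` are hypothetical helpers.

```python
import re

# Standard PostgreSQL catalog queries; run them with any client.
# Assumes the pgai extension is registered under the name 'ai'.
VERSION_QUERY = "SELECT extversion FROM pg_extension WHERE extname = 'ai'"
FUNCTION_QUERY = (
    "SELECT EXISTS ("
    "SELECT 1 FROM pg_proc p "
    "JOIN pg_namespace n ON n.oid = p.pronamespace "
    "WHERE n.nspname = 'ai' AND p.proname = 'load_dataset')"
)


def parse_version(text):
    """Turn '0.6.0' or '0.6.0-dev' into (0, 6, 0); any suffix is ignored."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"unrecognized version: {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(installed, minimum="0.6.0"):
    """Compare versions numerically, so '0.10.0' counts as newer than '0.6.0'."""
    return parse_version(installed) >= parse_version(minimum)
```

If `VERSION_QUERY` reports a version that satisfies `meets_minimum` but `FUNCTION_QUERY` returns false, the installed build may differ from the one the reviewers had in mind.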

…ng_model_rag_app.ipynb

Co-authored-by: James Guthrie <[email protected]>
Signed-off-by: Hervé Ishimwe <[email protected]>

@Askir Askir left a comment

sweet, thanks for incorporating all the feedback

The CI pipeline complains because we use Conventional Commits, by the way: https://www.conventionalcommits.org/en/v1.0.0/

You might want to change your commit message and the PR title to follow that format.
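
To check a commit message or PR title locally before pushing, a simplified header check could look like the sketch below. It is not the repository's actual CI rule, which this thread does not show: the regular expression covers only the header line of the Conventional Commits 1.0.0 format, and the list of allowed types is an assumption based on commonly used conventions.

```python
import re

# Assumed type list; the spec itself only defines "feat" and "fix" and
# lets projects allow others.
TYPES = ("build", "chore", "ci", "docs", "feat", "fix",
         "perf", "refactor", "revert", "style", "test")

# Header shape: type, optional (scope), optional "!", then ": " and a description.
HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)(\((?P<scope>[^()\s]+)\))?(?P<breaking>!)?: (?P<description>\S.*)$"
)


def is_conventional(message, allowed_types=TYPES):
    """Check only the first line of a commit message or PR title."""
    header = message.split("\n", 1)[0]
    match = HEADER_RE.match(header)
    return match is not None and match.group("type") in allowed_types


# The current PR title has no type prefix, so it fails the check.
print(is_conventional("Code for evaluating open source embedding models"))

# A hypothetical retitled version passes.
print(is_conventional("feat: add code for evaluating open source embedding models"))
```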

@ihis-11 ihis-11 merged commit a4298c3 into main Dec 19, 2024
4 of 5 checks passed
@ihis-11 ihis-11 deleted the herve/best-open-source-embedding-model branch December 19, 2024 13:37