RFC: Haystack OPEA Integration #222
Signed-off-by: Gad Markovits <[email protected]>
Thank you for the RFC @gadmarkovits. Some questions inline.
4. GenAIEval
The evaluation, benchmark, and scorecard suite for OPEA, targeting for performance on throughput and latency, accuracy on popular evaluation harness, safety, and hallucination.
This feels like a dangling sentence. What, if anything, will be delivered as part of the integration here?
You're correct, this shouldn't be here - removed.
2. OPEA Text Embedder
This component will receive text input and embed it using an OPEA embedding microservice.
What is the difference between the text embedder and the document embedder? If the text is long, might it need chunking too?
They're very similar. The split is mainly there to conform with similar Haystack integrations and to allow embedding of both raw text and Document objects.
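To make the text-versus-document distinction concrete, here is a minimal sketch of the two shapes. It is not the integration's actual code: the class names, the `embed_fn` callable (a stand-in for a call to an OPEA embedding microservice), the toy `fake_embed` backend, and the local `Document` dataclass are all hypothetical, and the return keys only mirror the convention that existing Haystack embedders appear to follow.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical stand-in for a call to an OPEA embedding microservice:
# takes a batch of strings and returns one vector per string.
EmbedFn = Callable[[List[str]], List[List[float]]]


@dataclass
class Document:
    """Minimal local stand-in for a Haystack Document (assumption, not the real class)."""
    content: str
    embedding: Optional[List[float]] = None


class TextEmbedderSketch:
    """Embeds a single raw string, for example a user query."""

    def __init__(self, embed_fn: EmbedFn):
        self.embed_fn = embed_fn

    def run(self, text: str) -> Dict[str, List[float]]:
        # One string in, one vector out.
        return {"embedding": self.embed_fn([text])[0]}


class DocumentEmbedderSketch:
    """Embeds a list of Document objects and stores each vector on its Document."""

    def __init__(self, embed_fn: EmbedFn):
        self.embed_fn = embed_fn

    def run(self, documents: List[Document]) -> Dict[str, List[Document]]:
        # Send all document contents as one batch, then attach the vectors.
        vectors = self.embed_fn([doc.content for doc in documents])
        for doc, vector in zip(documents, vectors):
            doc.embedding = vector
        return {"documents": documents}


def fake_embed(texts: List[str]) -> List[List[float]]:
    """Toy backend for illustration only: [character count, word count] per text."""
    return [[float(len(t)), float(len(t.split()))] for t in texts]
```

Both sketches share the same backend call and differ only in input and output shape: the text embedder returns a bare vector for a query, while the document embedder writes the vector back onto each Document so it can be stored. Chunking of long inputs is left out of the sketch.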
Following a discussion with Haystack's technical team, it was agreed that a ChatQnA example, using this OPEA integration, would be a good way to showcase its capabilities. To support this, several component wrappers need to be implemented in the first version of the integration (other wrappers will be added gradually):
1. OPEA Document Embedder
Any inclusions/exclusions with respect to document types? Word, PDF, PPT, images, ...?
Embedding documents that are not purely textual is beyond the scope of this integration. We can think about adding document parsers/preprocessors as additional wrappers to OPEA's dataprep components at a later stage.
Signed-off-by: Gad Markovits <[email protected]>
This RFC is used to discuss an implementation of an OPEA integration for Haystack.