[Integration] Adding a better OCR vision model #125

Open
yoeven opened this issue Dec 12, 2024 · 0 comments

yoeven commented Dec 12, 2024

Hey there! We love the approach of using a vision model to generate Markdown, but it isn't foolproof. So we trained a vision model (vLLM) with traditional OCR, and now we get more consistent, cleaner output, native support for PDFs and images, and structured responses such as JSON and Markdown. It's called JigsawStack vOCR.

If you think it makes sense, we're happy to create a PR that adds this integration as an option, so users can choose between the default LLM and the vOCR model. Let me know what you think :)
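
To make the "option" idea concrete, here is a rough sketch of how a backend switch could look. Everything in it is hypothetical: the names `page_to_markdown`, `BACKENDS`, `default_llm_backend`, and `vocr_backend` are placeholders for illustration, not this project's API or the JigsawStack API, and both backends are stubs that only return a marker string instead of calling a model.

```python
from typing import Callable, Dict

# Hypothetical contract: a backend takes raw page bytes and returns Markdown.
OcrBackend = Callable[[bytes], str]


def default_llm_backend(page: bytes) -> str:
    # Stub standing in for the existing vision-LLM path (assumption, not real code).
    return f"<!-- markdown from default LLM, {len(page)} bytes in -->"


def vocr_backend(page: bytes) -> str:
    # Stub standing in for a vOCR call; the real request shape is not shown here.
    return f"<!-- markdown from vOCR, {len(page)} bytes in -->"


# Registry of selectable backends; "llm" stays the default so behavior is unchanged.
BACKENDS: Dict[str, OcrBackend] = {
    "llm": default_llm_backend,
    "vocr": vocr_backend,
}


def page_to_markdown(page: bytes, backend: str = "llm") -> str:
    """Convert one page to Markdown using the backend selected by name."""
    try:
        handler = BACKENDS[backend]
    except KeyError:
        raise ValueError(f"unknown OCR backend: {backend!r}") from None
    return handler(page)
```

With something like this, existing callers keep the default path, and opting in would be a single argument, e.g. `page_to_markdown(data, backend="vocr")`.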
