[Integration] Adding a better OCR vision model #125

yoeven · 2024-12-12T17:48:46Z

Hey there! We love the approach of using a vision model to generate markdown, but it isn't foolproof. So we trained a vision model (a vision LLM) with traditional OCR, and now we get more consistent, cleaner output, plus native support for PDFs and images and structured responses such as JSON and Markdown. It's called JigsawStack vOCR.

If you think it makes sense, I'm happy to create a PR that adds this integration as an option, so users can choose between the default LLM and the vOCR model. Let me know what you think :)
