Stream OCR result by page & code restructure #112

Open
wants to merge 22 commits into
base: main

kailingding
Contributor

@kailingding kailingding commented Dec 3, 2024

Major Changes:

  1. Code reorganization

    • Split large utils.ts file into smaller, focused modules under /utils directory
    • Moved OpenAI-related code to /models directory
  2. OCR latency reduced by 4 to 10 seconds

    • Previous process: preprocess all images, then run the LLM calls in parallel
    • Improved process: run [preprocess one image -> LLM call] as a unit per page, with the pages in parallel
  3. Error handling improvements

    • Added new ErrorMode enum (THROW or IGNORE)
    • Implemented retry mechanism with maxRetries parameter
    • Added error status tracking in page processing
  4. Progress tracking

    • Added Page status tracking (SUCCESS/ERROR)
    • New summary object with success/failure counts
    • Added pre/post processing callbacks for better progress monitoring
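
For reviewers, here is a minimal sketch of how the per-page pipeline, retries, and status tracking described above could fit together. It is not the code in this PR: only `ErrorMode` (THROW/IGNORE), `maxRetries`, and the SUCCESS/ERROR statuses come from the description, while `processPages`, `withRetries`, `Step`, and the field names on `Page` and `Summary` are hypothetical. The pre/post processing callbacks are left out to keep it short.

```typescript
// Hypothetical sketch: names other than ErrorMode, maxRetries, and the
// SUCCESS/ERROR statuses are illustrative, not taken from the PR's code.
enum ErrorMode { THROW = "THROW", IGNORE = "IGNORE" }
enum PageStatus { SUCCESS = "SUCCESS", ERROR = "ERROR" }

interface Page { page: number; status: PageStatus; content?: string; error?: string }
interface Summary { totalPages: number; successful: number; failed: number }

// A step takes a string (image path or prepared image) and resolves to a string.
type Step = (input: string) => Promise<string>;

// Run fn once, then retry up to maxRetries more times before giving up.
async function withRetries<T>(fn: () => Promise<T>, maxRetries: number): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
    }
  }
  throw lastError;
}

// Each page runs preprocess -> LLM call as one unit, and the units run in
// parallel, so no page waits for the other pages to finish preprocessing.
async function processPages(
  images: string[],
  preprocess: Step,
  callModel: Step,
  opts: { errorMode: ErrorMode; maxRetries: number },
): Promise<{ pages: Page[]; summary: Summary }> {
  const pages = await Promise.all(
    images.map(async (image, index): Promise<Page> => {
      try {
        const prepared = await preprocess(image);
        const content = await withRetries(() => callModel(prepared), opts.maxRetries);
        return { page: index + 1, status: PageStatus.SUCCESS, content };
      } catch (err) {
        // THROW aborts the whole run; IGNORE records the failure on the page.
        if (opts.errorMode === ErrorMode.THROW) throw err;
        return { page: index + 1, status: PageStatus.ERROR, error: String(err) };
      }
    }),
  );
  const successful = pages.filter((p) => p.status === PageStatus.SUCCESS).length;
  return {
    pages,
    summary: { totalPages: pages.length, successful, failed: pages.length - successful },
  };
}
```

With `ErrorMode.IGNORE`, a failing page comes back with status ERROR and is counted in `summary.failed`. With `ErrorMode.THROW`, the first page that still fails after its retries rejects the whole call.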

Documentation Updates

  • Documented the new parameters in the README
  • Fixed markdown formatting issues
  • Updated example outputs to reflect new response format

The changes primarily focus on code organization and on reducing latency during document processing.

@kailingding kailingding changed the title Stream OCR result by page [WIP] Stream OCR result by page Dec 4, 2024
@kailingding kailingding changed the title [WIP] Stream OCR result by page Stream OCR result by page Dec 5, 2024
@kailingding kailingding changed the title Stream OCR result by page Stream OCR result by page & code restructure Dec 9, 2024