Vectorized infer_beam_batch for improved performance #697
This pull request introduces key optimizations and improvements in `model_48px` aimed at increasing the efficiency of the OCR pipeline:

- **Vectorized `infer_beam_batch`:** The `infer_beam_batch` function in `model_48px` has been vectorized, significantly improving the speed of OCR inference. The original `infer_beam_batch` is also kept, so you can easily switch between the two for testing purposes.
- **Refactored encoders and decoders with `forward` methods:** Extracted the forward methods for both encoders and decoders. This refactoring enables more straightforward model exporting (e.g., to ONNX), allowing further optimization of inference performance and integration with deployment platforms like Triton Inference Server.
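The core idea behind vectorizing beam search is to fold the per-beam Python loops into array operations: cumulative beam scores are broadcast against next-token log-probabilities, and a single top-k per batch item replaces nested iteration. The sketch below is a minimal NumPy illustration of one such expansion step with hypothetical shapes; it is not the actual `model_48px` code, which operates on PyTorch tensors.

```python
import numpy as np

def beam_step(beam_scores, log_probs, beam_width):
    """One vectorized beam-search expansion step (illustrative).

    beam_scores: (batch, beam) cumulative log-probs of live beams
    log_probs:   (batch, beam, vocab) next-token log-probs
    Returns new scores, parent-beam indices, and token indices,
    each of shape (batch, beam_width).
    """
    batch, beam, vocab = log_probs.shape
    # Broadcast cumulative scores onto every candidate token, then
    # flatten (beam, vocab) so top-k runs once per batch item.
    cand = (beam_scores[:, :, None] + log_probs).reshape(batch, beam * vocab)
    # argpartition selects the k best per row without a full sort
    topk = np.argpartition(-cand, beam_width - 1, axis=1)[:, :beam_width]
    scores = np.take_along_axis(cand, topk, axis=1)
    # order the k survivors by score, descending
    order = np.argsort(-scores, axis=1)
    topk = np.take_along_axis(topk, order, axis=1)
    scores = np.take_along_axis(scores, order, axis=1)
    # integer div/mod recovers which beam and which token each came from
    return scores, topk // vocab, topk % vocab
```

The same pattern maps directly onto `torch.topk` over a `(batch, beam * vocab)` tensor, which is what makes the batched GPU version fast.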
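The reason extracting `forward` methods helps with exporting is that tracers like `torch.onnx.export` capture a pure tensors-in/tensors-out function, while beam search involves data-dependent Python control flow that cannot be baked into a static graph. The sketch below shows the resulting shape of the refactor with hypothetical stand-in classes (no real model code): each stage exposes a self-contained `forward`, and the search loop stays outside as a thin orchestrator.

```python
class Encoder:
    def forward(self, image):
        # stand-in for the real conv/transformer feature extractor;
        # a pure function of its inputs, so it is exportable on its own
        return [px * 0.5 for px in image]

class Decoder:
    def forward(self, memory, token):
        # stand-in: next-token scores from encoder memory + last token
        return [m + token for m in memory]

def infer(image, steps):
    """Orchestrator: the Python-side loop calls the exportable forwards.

    Only Encoder.forward / Decoder.forward would be exported (e.g. to
    ONNX for Triton); this loop runs as ordinary host code.
    """
    enc, dec = Encoder(), Decoder()
    memory = enc.forward(image)
    token, out = 0, []
    for _ in range(steps):
        scores = dec.forward(memory, token)
        token = scores.index(max(scores))  # greedy pick, for brevity
        out.append(token)
    return out
```

In the PyTorch version the two `forward`s become `nn.Module`s that can each be passed to `torch.onnx.export`, while the beam loop drives them from the serving side.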