Feature description
Implement parallelization for document ingestion to speed up processing.
Support configurable parallelization strategies (e.g. local multiprocessing, Ray, or remote endpoints on distributed machines).
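One possible shape for a configurable strategy is sketched below. This is only an illustration, not an existing API: `ExecutionStrategy`, `SequentialStrategy`, `ThreadPoolStrategy`, `ingest`, and `fake_parse` are all hypothetical names, and a thread pool from the standard library stands in for the multiprocessing, Ray, or remote backends mentioned above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Protocol


class ExecutionStrategy(Protocol):
    """Hypothetical interface: apply `fn` to every item, keep input order."""

    def map(self, fn: Callable[[str], str], items: Iterable[str]) -> list[str]: ...


class SequentialStrategy:
    """Baseline strategy: process documents one at a time."""

    def map(self, fn, items):
        return [fn(item) for item in items]


class ThreadPoolStrategy:
    """Local strategy backed by a standard-library thread pool."""

    def __init__(self, max_workers: int = 4) -> None:
        self.max_workers = max_workers

    def map(self, fn, items):
        # Executor.map yields results in the order the inputs were submitted.
        with ThreadPoolExecutor(max_workers=self.max_workers) as pool:
            return list(pool.map(fn, items))


def ingest(paths, parse, strategy):
    """Run `parse` over `paths` using whichever strategy was configured."""
    return strategy.map(parse, paths)


def fake_parse(path: str) -> str:
    """Stand-in for a real document parser (assumption for this sketch)."""
    return path.upper()


if __name__ == "__main__":
    print(ingest(["a.pdf", "b.pdf"], fake_parse, ThreadPoolStrategy(max_workers=2)))
```

With an interface like this, a Ray-based or remote-endpoint strategy could be added later as another class with the same `map` method, without changing the calling code.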
Motivation
We need a fast, scalable, production-ready document processing pipeline.
Additional context
Out-of-the-box Python parallelization can be problematic. Each processing worker would either need to send requests to external services (e.g. unstructured), which may cause throttling, or run its own instance of the processing models, which increases GPU requirements.
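One way to limit the throttling risk is to cap the number of in-flight requests regardless of how many documents are queued. The sketch below uses `asyncio.Semaphore` for this. `parse_remote`, `ingest_bounded`, and `run` are hypothetical names, and the sleep is a placeholder for a real call to an external parsing service.

```python
import asyncio


async def parse_remote(path: str) -> str:
    """Placeholder for a request to an external parsing service (assumption)."""
    await asyncio.sleep(0.01)  # simulates network latency
    return f"parsed:{path}"


async def ingest_bounded(paths, max_concurrent: int = 2):
    """Parse all paths, allowing at most `max_concurrent` requests at once."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def worker(path: str) -> str:
        # Only `max_concurrent` workers can hold the semaphore at a time.
        async with semaphore:
            return await parse_remote(path)

    # gather returns results in the same order as the awaitables passed in.
    return await asyncio.gather(*(worker(p) for p in paths))


def run(paths, max_concurrent: int = 2):
    """Synchronous entry point for the sketch."""
    return asyncio.run(ingest_bounded(paths, max_concurrent))
```

The same idea applies to the GPU case: a fixed limit on concurrent callers lets many ingestion workers share a small number of model instances instead of each loading its own.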