Refactor doc_loader.py to load documents concurrently using Ray actors or Spark tasks, instead of loading them all at once and then putting them into a dataset #510

chaojun-zhang · 2023-12-26T06:57:50Z

doc_loader.py currently loads all documents at once and only then puts them into a dataset. Refactor it to load documents concurrently using Ray actors or Spark tasks instead.
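To make the request concrete, here is a minimal sketch of the concurrent-loading pattern. It uses a standard-library thread pool as a stand-in for Ray actors or Spark tasks, so it only illustrates the shape of the change, not the actual refactor. The names `load_document` and `load_documents_concurrently` are hypothetical; the real functions in doc_loader.py are not shown in this issue.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Iterable, Iterator


def load_document(path: Path) -> dict:
    """Hypothetical per-file loader (assumption: plain UTF-8 text files)."""
    return {"path": str(path), "text": path.read_text(encoding="utf-8")}


def load_documents_concurrently(
    paths: Iterable[Path], max_workers: int = 4
) -> Iterator[dict]:
    """Load files on a worker pool and yield each document in input order.

    The thread pool is a stand-in: with Ray, each worker would be an actor
    (or remote task); with Spark, each path would be handled by a task.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() schedules every load on the pool and returns results in
        # the same order as `paths`.
        yield from pool.map(load_document, paths)
```

A caller could then feed the iterator into the dataset as documents arrive, for example `for doc in load_documents_concurrently(paths): ...`, rather than doing all the loading in one sequential pass first.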

github-actions bot mentioned this issue Dec 26, 2023

[ISSUE-510] Refactor doc_loader.py to load documents concurrently using Ray actor… #511

Merged
