
Bulk Running the Inference Pipeline #387

Open
2 tasks
CarsonDavis opened this issue Sep 20, 2023 · 0 comments
Labels: PI 24.1 Oct, Nov, Dec 2023

Comments


Description

Right now, the inference pipeline has only been tested on small batches of URLs, like 150. Since we will need to run it on the millions of URLs that exist in the SDE, it will need to be able to run without overloading the server.

For this issue we need to do three things

  • test the pipeline on the bigger collections that will surface the current failure types
  • make any modifications to the batch-size processing needed to run it successfully
  • write code that can run the pipeline in batch on all our collections
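The batch run described above might be sketched as follows. This is a hypothetical outline, not the actual SDE codebase: the `run_inference` call is a placeholder for the real pipeline entry point, and the default batch size of 150 simply mirrors the size already tested. The throttling pause between batches is one simple way to avoid overloading the server.

```python
import time
from typing import Iterator, List


def chunked(urls: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield successive batches of at most batch_size URLs."""
    for start in range(0, len(urls), batch_size):
        yield urls[start:start + batch_size]


def run_in_batches(urls: List[str],
                   batch_size: int = 150,
                   pause_seconds: float = 0.0) -> int:
    """Run the (hypothetical) inference pipeline over all URLs in batches.

    Returns the number of batches processed.
    """
    batches = 0
    for batch in chunked(urls, batch_size):
        # run_inference(batch)  # placeholder for the real pipeline call
        batches += 1
        if pause_seconds:
            # throttle between batches so the server is not overloaded
            time.sleep(pause_seconds)
    return batches
```

Running all collections would then be a loop over this helper, one collection's URL list at a time, with `batch_size` and `pause_seconds` tuned based on what the larger-collection tests reveal.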

Implementation Considerations

  • type your first consideration here

Deliverable

  • code to run on all our data
  • any updates to the batch process that are necessary

Dependencies

depends on

@code-geek code-geek added the PI 24.1 Oct, Nov, Dec 2023 label Jan 9, 2024