Pipeline runs as background jobs #154

Open
dinmukhamedm opened this issue Nov 5, 2024 · 1 comment
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@dinmukhamedm
Member

Currently, the network configuration of our managed versions cuts TCP (more specifically, TLS) connections after 350 seconds. For most of our APIs this is more than enough, but pipeline runs sometimes take longer, especially with the rise of larger, slower models such as o1.

We need to add the ability to run pipelines as background jobs.

For now, this is a discussion rather than a call for PRs. I see two possible ways forward, but, as always, we are open to more suggestions.

  1. Polling job. Client submits a job, gets a run_id and polls on it.
    • Pros:
      • No need to care about network cutting anything short, as all responses are very quick
    • Cons:
      • If polling is the user's responsibility, the overall UX gets much worse. If polling is hidden in our SDK, we need to choose the intervals carefully so that we don't cause too much load.
      • We'll need some infrastructure (separate DB table?) to keep the status of running jobs
  2. Websocket. Client opens a websocket connection, and it's the server's responsibility to periodically ping the connection to keep it alive.
    • Pros:
      • Job state is kept in memory, similar to now, so not much additional infra
    • Cons:
      • We need to design an extensible messaging protocol with reliable ping/pong requests to make sure the connection does not close
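
To make option 1 more concrete, here is a minimal sketch of SDK-side polling with capped exponential backoff, which is one way to keep the polling load bounded. Everything in it is an assumption for illustration: `submit_run`, `fetch_status`, `wait_for_run`, the in-memory `_JOBS` store, and the interval values are hypothetical and do not reflect the project's actual API.

```python
import time
import uuid
from typing import Callable, Dict, Iterator, Optional

# Hypothetical in-memory stand-in for the server-side job table.
_JOBS: Dict[str, dict] = {}


def submit_run(polls_until_done: int) -> str:
    """Pretend to start a pipeline run and return a run_id immediately."""
    run_id = str(uuid.uuid4())
    _JOBS[run_id] = {"remaining": polls_until_done}
    return run_id


def fetch_status(run_id: str) -> str:
    """Pretend status endpoint: the run finishes after a fixed number of polls."""
    job = _JOBS[run_id]
    if job["remaining"] <= 0:
        return "done"
    job["remaining"] -= 1
    return "running"


def backoff_intervals(initial: float = 1.0, factor: float = 2.0,
                      cap: float = 30.0) -> Iterator[float]:
    """Yield delays that grow geometrically and never exceed `cap` seconds."""
    delay = min(initial, cap)
    while True:
        yield delay
        delay = min(delay * factor, cap)


def wait_for_run(run_id: str, timeout: float = 900.0,
                 sleep: Callable[[float], None] = time.sleep,
                 intervals: Optional[Iterator[float]] = None) -> str:
    """Poll until the run leaves the 'running' state or the time budget is spent."""
    waited = 0.0
    for delay in intervals or backoff_intervals():
        status = fetch_status(run_id)
        if status != "running":
            return status
        if waited + delay > timeout:
            raise TimeoutError(f"run {run_id} still running after {waited:.0f}s")
        sleep(delay)
        waited += delay
    raise RuntimeError("interval iterator was exhausted")  # only for finite iterators
```

Each status request is short, so none of them comes close to the 350-second cut-off. The cap on the delay trades a little extra latency on long runs for a predictable upper bound on requests per client.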

We are open to discussion about the best way forward and to any other suggestions.
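
For option 2, the server-side keep-alive could look roughly like the sketch below. It is transport-agnostic on purpose: `send` stands in for whatever "send a text frame" call the chosen websocket library provides, and the message shapes (`ping`, `result`, `error`) are hypothetical, not an agreed protocol. Whether application-level messages like these are enough to stop the network layer from cutting the connection is also an assumption that would need to be verified.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable

# Hypothetical signature for "send one text message to the client".
Send = Callable[[str], Awaitable[None]]


async def run_with_heartbeat(job: Awaitable[Any], send: Send,
                             ping_interval: float = 30.0) -> None:
    """Run `job` while emitting a ping message every `ping_interval` seconds.

    The interval should sit well below the 350-second cut-off so the
    connection never stays silent for long.
    """
    task = asyncio.ensure_future(job)
    seq = 0
    while True:
        # Wait for the job, but wake up after ping_interval at the latest.
        done, _pending = await asyncio.wait({task}, timeout=ping_interval)
        if done:
            break
        seq += 1
        await send(json.dumps({"type": "ping", "seq": seq}))
    try:
        payload = task.result()
    except Exception as exc:  # report job failures to the client
        await send(json.dumps({"type": "error", "message": str(exc)}))
    else:
        await send(json.dumps({"type": "result", "payload": payload}))
```

Job state stays in memory here, as in the current design. A real protocol would likely also need the client to answer pings so that the server can detect a dead connection and stop the run.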

dinmukhamedm added the enhancement and help wanted labels Nov 5, 2024
@nagxsan

nagxsan commented Dec 10, 2024

Can we integrate approach 1 with some sort of notification functionality?

  • Instead of the user continuously polling on the run_id, why not keep it completely asynchronous? Once the run has finished executing, the server sends a notification object to the front-end indicating the outcome (success/failure).
  • We may need to maintain a separate table that holds the run_id and its status; the server updates the status when the run completes.
  • The front-end would read this status from the database and show it to the user. Until the notification is received, we can show the status as Pending.
  • Also, if needed, the run could have a scheduled timeout (for example, 5 minutes); if that time passes with no response, we would report it with a failure status.
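
A rough sketch of the status table and timeout described in the points above, using an in-memory SQLite database purely for illustration. The table name `pipeline_runs`, its columns, and the helper functions are hypothetical. The 300-second value is just the 5-minute example from this comment; runs that legitimately take longer would need a larger value.

```python
import sqlite3
from typing import Optional

RUN_TIMEOUT_SECONDS = 300  # the "5 minutes" example; an assumption, not a recommendation


def open_db() -> sqlite3.Connection:
    """Create an in-memory database holding a hypothetical run-status table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE pipeline_runs ("
        " run_id TEXT PRIMARY KEY,"
        " status TEXT NOT NULL,"      # 'pending', 'success' or 'failure'
        " started_at REAL NOT NULL)"  # seconds since some epoch
    )
    return conn


def create_run(conn: sqlite3.Connection, run_id: str, now: float) -> None:
    """Register a run as pending when the client submits it."""
    conn.execute("INSERT INTO pipeline_runs VALUES (?, 'pending', ?)", (run_id, now))


def complete_run(conn: sqlite3.Connection, run_id: str, success: bool) -> None:
    """Record the outcome; runs that are no longer pending are left untouched."""
    conn.execute(
        "UPDATE pipeline_runs SET status = ? WHERE run_id = ? AND status = 'pending'",
        ("success" if success else "failure", run_id),
    )


def expire_stale_runs(conn: sqlite3.Connection, now: float) -> int:
    """Mark pending runs older than the timeout as failed; return how many changed."""
    cur = conn.execute(
        "UPDATE pipeline_runs SET status = 'failure'"
        " WHERE status = 'pending' AND started_at < ?",
        (now - RUN_TIMEOUT_SECONDS,),
    )
    return cur.rowcount


def read_status(conn: sqlite3.Connection, run_id: str) -> Optional[str]:
    """Return the stored status, or None for an unknown run_id."""
    row = conn.execute(
        "SELECT status FROM pipeline_runs WHERE run_id = ?", (run_id,)
    ).fetchone()
    return row[0] if row else None
```

The front-end would show the result of `read_status` (Pending until it changes), and the notification would be sent wherever `complete_run` or `expire_stale_runs` flips a row.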

Please let me know if I am missing something crucial or going wrong somewhere.
