
worker+runner: Dynamically start live inference runners #275

Merged: 17 commits from vg/feat/live-v2v-api into main on Nov 19, 2024

Conversation

victorges (Member)

This implements support in the worker to dynamically start the runner containers
and manage their lifecycle.

This is slightly different from how the existing APIs work, since the runner container
runs long past the original request that started it. The solution was:

  • Just start the container on the request, but don't return it to the pool automatically
  • Implement logic on the runner to stop the process when the input stops
  • Implement a monitor on the worker to track when the container stops
  • Update the worker's docker manager state when the container stops, to allow other containers to run

I haven't been able to test this yet because my dev machine exploded. I'll create
a new one to try this out.

@victorges victorges requested review from j0sh and leszko November 14, 2024 00:27
@victorges victorges requested a review from rickstaa as a code owner November 14, 2024 00:27
@victorges victorges requested a review from mjh1 November 14, 2024 00:27
@j0sh (Contributor) left a comment


🚢 🚢 🚢

@@ -5,8 +5,9 @@
 class Pipeline(ABC):
     @abstractmethod
     def __init__(self, model_id: str, model_dir: str):
         self.model_id: str  # type hint so we can use the field in routes
Contributor

what does this comment mean?

# type hint so we can use the field in routes

Member Author

This avoids a typing error in all the route/ implementations when they use the model_id field of the abstract Pipeline class (there are currently tons of typing errors in this project; I'll try to clear them up as I make changes).

I'm changing this comment to the following, which might be clearer:

# declare the field here so the type hint is available when using this abstract class

@leszko (Contributor) left a comment


LGTM

worker/docker.go (outdated)
@@ -275,6 +285,8 @@ func (m *DockerManager) createContainer(ctx context.Context, pipeline string, mo
 	m.containers[containerName] = rc
 	m.gpuContainers[gpu] = containerName

 	go m.watchContainer(ctx, rc)
@rickstaa (Member) commented on Nov 14, 2024

@victorges, @leszko I like your changes! However, I think there's an advantage to keeping containers warm for a certain amount of time for batch jobs, in case requests continue to come in. This could help improve response times and resource efficiency. I haven't dived too deep into the code yet, but my group can add this as a parameter in a subsequent pull request.

Member Author

I see! So we should not remove the container when the context is cancelled, right? I can change the logic to just return the container to the pool in that case instead, WDYT? With that I can even remove the "return container" function that is currently called explicitly, and instead just let callers pass a ctx that is cancelled when the request is done, so the container is auto-returned.

Member Author

Implemented in 4d5adbe!

Noticed there was a potential race: concurrent calls to the Stop function were all updating the state at the same time.
@victorges victorges merged commit f3c00fe into main Nov 19, 2024
4 checks passed
@victorges victorges deleted the vg/feat/live-v2v-api branch November 19, 2024 23:00