runner/live: Make sure to resize images outside of infer process #272

Merged
victorges merged 2 commits into main from vg/fix/image-resize on Nov 12, 2024

victorges
Member

On the LiveKit PoC we resized the images in the agent code, but that step was missing from this infer.py piece. This is probably one reason inference performance is lower on the ai-worker: the resize still happened, but in the same process that runs the model, so it competed for the same Python GIL (and resizing on CPU can take a non-trivial amount of time).
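To illustrate the idea of keeping the resize out of the process that runs the model, here is a minimal sketch. It is not the code merged in this PR: `resize_nearest`, `prepare_frame`, and `resize_outside_infer_process` are hypothetical names, frames are plain lists of rows, and a pure-Python nearest-neighbour resize stands in for whatever image library the runner actually uses.

```python
from concurrent.futures import ProcessPoolExecutor

TARGET = 512  # model input size mentioned in this thread (512x512)


def resize_nearest(pixels, width, height):
    """Nearest-neighbour resize of a list of rows (stand-in for a real resize)."""
    src_h, src_w = len(pixels), len(pixels[0])
    return [
        [pixels[y * src_h // height][x * src_w // width] for x in range(width)]
        for y in range(height)
    ]


def prepare_frame(pixels):
    """CPU-bound preprocessing, meant to run in a worker process."""
    return resize_nearest(pixels, TARGET, TARGET)


def resize_outside_infer_process(frames, workers=2):
    """Resize frames in separate processes so the work does not hold the
    GIL of the process that runs the model (assumed process layout)."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(prepare_frame, frames))
```

When trying this as a script, call `resize_outside_infer_process` under an `if __name__ == "__main__":` guard so that worker processes can import the module cleanly.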

@victorges victorges requested a review from rickstaa as a code owner November 11, 2024 20:43
@victorges victorges requested a review from j0sh November 11, 2024 21:22
@j0sh
Contributor

j0sh commented Nov 12, 2024

We already resize as part of the video-to-image flow. Do we need to re-process it here?

@victorges
Member Author

victorges commented Nov 12, 2024

@j0sh hmm I see. bad news that this doesn't explain the worse performance then :3

I think it might still be worth having this logic "self-contained" rather than relying on external logic to work (it doesn't just get slow, it actually explodes when using TensorRT).

Since I also have the checks to not reprocess in case it has the right size, I think it's beneficial to have it. WDYT?
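The "don't reprocess if it already has the right size" check can be sketched as below. This is an assumption about the shape of the logic, not the PR's actual code: `Frame`, `ensure_size`, and the `resize` callback are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

TARGET_SIZE = (512, 512)  # required model input size, per this thread


@dataclass(frozen=True)
class Frame:
    """Hypothetical stand-in for an image object that knows its size."""
    width: int
    height: int


def ensure_size(frame, resize, target=TARGET_SIZE):
    """Return the frame unchanged when it already matches the target size,
    otherwise delegate to the supplied resize callback."""
    if (frame.width, frame.height) == target:
        return frame  # upstream already resized it, so no extra work
    return resize(frame, target)
```

With this shape, frames that the video-to-image flow has already scaled to 512x512 pass through untouched, and only mismatched frames pay for a resize.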

Contributor

@j0sh j0sh left a comment

For now it is fine to try and get past this performance bottleneck. Does this (re-)encode the frame or are we only passing around raw frame data after this point?

For what it's worth, the ffmpeg resize tries to preserve the aspect ratio when rescaling, which will probably trigger this extra work here unless the input is square. If we want to avoid that, feel free to adjust the ffmpeg CLI. Cropping might also need to be added before the rescale if we want to match fidelity. It's also still using CPU rather than GPU, although I am not sure if there is an equivalent to crop_cuda yet.

If this needs additional configuration for a specific model later, we can always pass it in to the ffmpeg command line from the initial params.
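As a rough sketch of the "crop before the rescale" option, the helper below builds an ffmpeg filter string from known input dimensions using the standard `crop=w:h:x:y` and `scale=w:h` filters. The function names are hypothetical, and the actual ffmpeg command line used by the video-to-image flow is not shown in this thread, so how the string would be wired in is an assumption.

```python
def center_crop_box(width, height):
    """Largest centered square inside a width x height frame, as (w, h, x, y)."""
    side = min(width, height)
    return side, side, (width - side) // 2, (height - side) // 2


def square_filter(width, height, target=512, crop=True):
    """Build an ffmpeg -vf value that yields target x target pixels.

    crop=True  : center-crop to a square first, then scale (no distortion).
    crop=False : scale directly, stretching non-square input.
    """
    scale = f"scale={target}:{target}"
    if not crop or width == height:
        return scale
    w, h, x, y = center_crop_box(width, height)
    return f"crop={w}:{h}:{x}:{y},{scale}"


print(square_filter(1280, 720))              # crop=720:720:280:0,scale=512:512
print(square_filter(1280, 720, crop=False))  # scale=512:512
```

Either variant produces frames that are already 512x512, so a size check on the inference side would not have to resize again.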

@victorges
Member Author

@j0sh yeah I don't think we should preserve the aspect ratio. The input has to be 512x512

@victorges victorges merged commit 7df0b05 into main Nov 12, 2024
4 checks passed
@victorges victorges deleted the vg/fix/image-resize branch November 12, 2024 20:47
2 participants