[RFC] Support non-GPU hardware-based video decoding and encoding #3841

Open
cdzhan opened this issue Oct 14, 2024 · 2 comments


cdzhan commented Oct 14, 2024

🚀 The feature

Let users access the encoding and decoding capabilities of non-GPU devices (which may be out-of-tree devices for torch) through the familiar torchaudio/torio.io APIs.

Proposed Solution

First, introduce an abstract base class for the device backend, with one subclass per device backend. This class provides device-related parameters and functionality, such as AV_PIX_FMT_CUDA, AV_HWDEVICE_TYPE_CUDA, and D2D (device-to-device) copying. With that in place, we can separate the device-related logic from the device-independent logic.
Out-of-tree devices can then implement their own device backend subclasses in a torchaudio Python extension package. Importing this extension package after importing torchaudio enables the backend.

import torchaudio
import torchaudio_npu   # torchaudio Python extension
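To make the proposal more concrete, here is a minimal Python sketch of the backend abstraction and registration. It is only an illustration: `DeviceBackend`, `register_backend`, `get_backend`, and the `"mlu"` backend with its constant names are hypothetical and not part of torchaudio, and the real device-specific code lives in C++ under libtorio, so an actual design would likely look different.

```python
from abc import ABC, abstractmethod
from typing import Dict, Type


class DeviceBackend(ABC):
    """Hypothetical base class that collects everything device-specific."""

    # Names of the FFmpeg constants this backend would map to.
    pix_fmt: str
    hw_device_type: str

    @abstractmethod
    def copy_d2d(self, src: bytearray, dst: bytearray) -> None:
        """Copy a buffer from one device location to another."""


# Registry mapping a device type string (e.g. "cuda") to its backend class.
_BACKENDS: Dict[str, Type[DeviceBackend]] = {}


def register_backend(device_type: str):
    """Class decorator that registers a backend under a device type."""
    def decorator(cls: Type[DeviceBackend]) -> Type[DeviceBackend]:
        _BACKENDS[device_type] = cls
        return cls
    return decorator


def get_backend(device: str) -> DeviceBackend:
    """Look up the backend for a device string such as "cuda:0"."""
    device_type = device.split(":")[0]
    if device_type not in _BACKENDS:
        raise RuntimeError(f"No video backend registered for '{device_type}'")
    return _BACKENDS[device_type]()


@register_backend("cuda")
class CudaBackend(DeviceBackend):
    pix_fmt = "AV_PIX_FMT_CUDA"
    hw_device_type = "AV_HWDEVICE_TYPE_CUDA"

    def copy_d2d(self, src: bytearray, dst: bytearray) -> None:
        dst[:] = src  # stand-in for a real device-to-device copy


# An out-of-tree package would register itself the same way on import.
@register_backend("mlu")
class MluBackend(DeviceBackend):
    pix_fmt = "AV_PIX_FMT_MLU"  # assumed name, for illustration only
    hw_device_type = "AV_HWDEVICE_TYPE_MLU"  # assumed name, for illustration only

    def copy_d2d(self, src: bytearray, dst: bytearray) -> None:
        dst[:] = src  # stand-in for a real device-to-device copy
```

With something like this, the device-independent code would only call `get_backend(device)` and never refer to CUDA directly.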

Moreover, we could support autoloading of device extensions (see pytorch/pytorch#122468).
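One possible way to implement autoloading in Python is entry-point discovery via `importlib.metadata`. The sketch below assumes that mechanism. The group name `torchaudio.backends` and the function `load_backend_extensions` are hypothetical, and `entry_points(group=...)` requires Python 3.10 or later.

```python
from importlib.metadata import entry_points

# Hypothetical entry-point group that extension packages would declare.
GROUP = "torchaudio.backends"


def load_backend_extensions(eps=None):
    """Load every discovered extension and call its init function.

    `eps` can be passed explicitly (e.g. in tests); by default the
    installed packages are scanned for the hypothetical group above.
    Returns the names of the extensions that were initialized.
    """
    if eps is None:
        eps = entry_points(group=GROUP)
    loaded = []
    for ep in eps:
        init = ep.load()  # import the callable the entry point refers to
        init()            # let the extension register its backend
        loaded.append(ep.name)
    return loaded
```

An extension package would then declare an entry point in that group pointing at its initialization function, so an explicit `import torchaudio_npu` would no longer be needed.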

Motivation, pitch

I'm working on using the video decoding and encoding capabilities of MLU, an out-of-tree device that uses the PrivateUse1 dispatch key and is supported by ffmpeg-mlu. I found that the current ffmpeg-related code is tightly coupled to the GPU, so I have to make extensive modifications to the code in https://github.com/pytorch/audio/tree/main/src/libtorio/ffmpeg to get torchaudio/torio.io to run on ffmpeg-mlu.

Alternatives

No response

Additional context

We're happy to collaborate and support this goal. If the community is open to considering this feature, we can further refine the specific implementation plan.

cc @mthrok

Collaborator

mthrok commented Oct 29, 2024

Hi @cdzhan

I am currently working on a new project, https://github.com/facebookresearch/spdl
The project has just started and is not widely used yet, but it is fast, and its code structure is better suited to abstracting away devices (or to being extended to do so).
If you are interested, we can collaborate there.

Author

cdzhan commented Oct 31, 2024

@mthrok Thank you for your response. For now, we will continue to focus on torchaudio.
