Bug: Audio file treated as a video file #90

Open
Rushi1109 opened this issue Dec 22, 2024 · 5 comments · May be fixed by #94
Labels: bug (Something isn't working), core (Media processing component, the core C++ binary)

Comments

@Rushi1109

Describe the bug
When an .mp3 file is uploaded, the backend serves an .mp4 file that is very short in duration, so the correct output is not served.

To Reproduce
Steps to reproduce the behavior:

  1. Upload or add an .mp3 file that contains a cover image to the server.
  2. Play the audio once the playback widget is available
  3. See error

Expected behavior
As a user, I expect the correct output for my audio file: the backend should serve the correct audio file instead of an .mp4 file.

Logs and Debug Information
Please include any relevant log messages and error outputs here.

Screenshots
If applicable, add screenshots to help explain your problem.

System and software (please complete the following information):

  • OS: Windows 10 Intel i5 9300HF
  • Output of ffmpeg -version : ffmpeg version 5.1.6-0+deb12u1 Copyright (c) 2000-2024 the FFmpeg developers
    built with gcc 12 (Debian 12.2.0-14)

Additional context
Please include system settings, configuration changes, or anything else you believe might contribute to the problem.

@Rushi1109 added the bug and triage labels on Dec 22, 2024
@Rushi1109
Author

My proposed solution is to maintain an array of audio file extensions. When a file is uploaded, we can check if its extension matches any entry in our array. If a match is found, the file can be treated as an audio file.
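
A minimal sketch of the extension check described above. The extension list and the function name are hypothetical; the issue does not say which extensions would be included.

```python
from pathlib import Path

# Hypothetical list for illustration; the issue does not specify its contents.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".flac", ".ogg", ".m4a"}


def is_audio_by_extension(filename: str) -> bool:
    """Return True if the file's extension is in the audio list (case-insensitive)."""
    return Path(filename).suffix.lower() in AUDIO_EXTENSIONS
```

For example, `is_audio_by_extension("song.MP3")` returns `True`, while a file with no extension or an unlisted one returns `False` regardless of what it actually contains.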

@omeryusufyagci added the core label and removed the triage label on Dec 23, 2024
@omeryusufyagci
Owner

omeryusufyagci commented Dec 23, 2024

Hi @Rushi1109, thanks for reporting this bug and also the proposed solution!

For completeness, this was initially reported on Discord, and with your help we understood that the issue stemmed from the audio file having a cover picture. ffprobe, the tool we use to determine the media type and invoke the appropriate pipeline, was reporting this cover image as a video stream (with no duration).

Currently, we do not expect any video streams in an audio file, so the existence of that stream causes the video pipeline to be erroneously invoked.

From a brief check, it seems to be intended behavior for ffprobe to list this image as a video stream, because of its mjpeg codec. So it's not really a bug on ffmpeg's side; we simply hadn't realized this was a possibility.

Regarding how we should fix this, your suggestion of keeping a list of audio file extensions is a possibility, but I feel it would be error-prone to rely solely on the extension. Given that ffprobe provides the entire list of streams, we could deduce from it whether the video streams of a given file should be ignored.

Revisiting the image you provided on Discord (below):
[image: screenshot provided on Discord]
We see that the cover picture is reported without a duration and with the mjpeg codec.

I'd propose using ffprobe's JSON output to get this information and checking the duration and possibly the codec type, although checking the duration alone will likely be enough.
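
As a rough illustration of this proposal, the sketch below asks ffprobe for JSON and lists each stream's type, codec, and duration. It is written in Python for brevity and is not the project's code; the function names are hypothetical, and it assumes `ffprobe` is on the PATH.

```python
import json
import subprocess


def probe_streams(path: str) -> list[dict]:
    """Run ffprobe on a file and return its list of stream dicts."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json", path],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout).get("streams", [])


def summarize_streams(streams: list[dict]) -> list[tuple]:
    """Return (codec_type, codec_name, duration) per stream; missing fields become None."""
    return [
        (s.get("codec_type"), s.get("codec_name"), s.get("duration"))
        for s in streams
    ]
```

Calling `summarize_streams(probe_streams("song.mp3"))` would show whether the video stream of a given file actually carries a `duration` field, which is the property the proposal relies on.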

Thanks again for finding this and for taking the initiative to work on it!

@Rushi1109
Author

Hello @omeryusufyagci,

For the file I provided, we receive two streams: an audio stream that contains a duration, and a video stream that doesn't.

When I try to get the duration of the video stream, I get the same duration as the audio stream, so it can't be used to determine the media type. Instead, I used codec_name from the stream to determine the media type, and that fixes the issue.

However, it will only work for cover images that use the mjpeg codec.
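
A small sketch of what a codec_name check could look like, and of its limitation. This is an illustration only, not the code from the PR; the set of cover codecs and the function name are assumptions.

```python
# Assumption for illustration: only mjpeg is treated as cover art.
COVER_CODECS = {"mjpeg"}


def has_real_video(streams: list[dict]) -> bool:
    """Return True if any video stream uses a codec not in COVER_CODECS."""
    return any(
        s.get("codec_type") == "video" and s.get("codec_name") not in COVER_CODECS
        for s in streams
    )


mp3_with_jpeg_cover = [
    {"codec_type": "audio", "codec_name": "mp3"},
    {"codec_type": "video", "codec_name": "mjpeg"},
]
mp3_with_png_cover = [
    {"codec_type": "audio", "codec_name": "mp3"},
    {"codec_type": "video", "codec_name": "png"},
]
```

With these inputs, `has_real_video(mp3_with_jpeg_cover)` is `False`, but `has_real_video(mp3_with_png_cover)` is `True`, so a PNG cover would still route the file to the video pipeline. A genuine video encoded as MJPEG would also be misread as audio.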

Check PR #91.

Also, if you have an idea for a better solution than the current one, let me know.

@mdlaat

mdlaat commented Dec 25, 2024

I have cover images that are png, so selecting on mjpeg is not going to work for me.
Would it be possible to select on the comment tag instead, keeping only video streams whose comment does not start with "Cover"?

ffprobe -show_entries stream=codec_type:stream_tags -of json -i audiofilename 2>/dev/null | jq '.streams[]'
shows
{ "codec_type": "audio", "tags": { "encoder": "Lavc58.35" } }
{ "codec_type": "video", "tags": { "title": "Album cover", "comment": "Cover (front)" } }
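
The tag-based idea could be sketched as follows, using the two stream objects from the output above as input. The function names are hypothetical, and the check assumes the cover stream always carries a `comment` tag starting with "Cover", which depends on how the file was tagged.

```python
def is_cover_by_comment(stream: dict) -> bool:
    """Return True if the stream's 'comment' tag starts with 'Cover'."""
    comment = stream.get("tags", {}).get("comment", "")
    return comment.startswith("Cover")


def non_cover_video_streams(streams: list[dict]) -> list[dict]:
    """Return the video streams that are not tagged as cover art."""
    return [
        s for s in streams
        if s.get("codec_type") == "video" and not is_cover_by_comment(s)
    ]


# The two stream objects shown in the ffprobe output above.
streams = [
    {"codec_type": "audio", "tags": {"encoder": "Lavc58.35"}},
    {"codec_type": "video", "tags": {"title": "Album cover", "comment": "Cover (front)"}},
]
```

Here `non_cover_video_streams(streams)` returns an empty list, so the file would be handled as audio. A cover image without a `comment` tag would not be caught by this check.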

@omeryusufyagci
Owner

Hi @mdlaat, thanks for your feedback. Indeed, matching on mjpeg may not be a robust solution, and I've proposed directly using the disposition.attached_pic field instead. @Rushi1109 is working on it over at #91.

We could use the cover tag, or the alternative Rushi has found, but I'd rather stick to the field mentioned above, as it seems to be the official way of checking for this.
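
A minimal sketch of a disposition-based check, assuming the stream list comes from ffprobe's JSON output, where each stream has a `disposition` object and attached pictures are flagged with the key `attached_pic` set to 1. The function names and the "audio"/"video"/"unknown" return values are hypothetical and are not taken from the project or from #91.

```python
def is_attached_picture(stream: dict) -> bool:
    """Return True if ffprobe flags the stream as an attached picture (cover art)."""
    return stream.get("disposition", {}).get("attached_pic", 0) == 1


def detect_media_type(streams: list[dict]) -> str:
    """Classify a file as 'video', 'audio', or 'unknown' from its ffprobe streams."""
    for s in streams:
        # A video stream only counts if it is not an attached picture.
        if s.get("codec_type") == "video" and not is_attached_picture(s):
            return "video"
    if any(s.get("codec_type") == "audio" for s in streams):
        return "audio"
    return "unknown"


mp3_with_png_cover = [
    {"codec_type": "audio", "codec_name": "mp3", "disposition": {"attached_pic": 0}},
    {"codec_type": "video", "codec_name": "png", "disposition": {"attached_pic": 1}},
]
```

With this input, `detect_media_type(mp3_with_png_cover)` returns `"audio"`. The check does not depend on the cover's codec or on its tags, so it handles both the mjpeg and the png case.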
