Feature Request: Video and Audio synchronized #326
I agree Alan, that's why TS is already preferred when doing something like this; among other things, this is one of the reasons TS exists. That being said, I am sure there is a way to sync the audio and video (yes, at the mercy of latency) without having to make a TS for it. What I'd like to know is why some codecs are quite stable and others are not. I can understand encoding time and latency, but if you examine Opus, it covers a wide range of frequencies with acceptable latency, yet it's not consistent. Ref: https://capture.dropbox.com/Jb2vzPUIXeYaHc62

Take PCM for example, which is the raw audio out of the SDI: logically it's the closest, as it's not compressed further, but as you can see, drift fix has to be on to keep it accurate (it's also 8 Mbps worth of audio in my test setup). On the other side, A-Law (not wideband and unusable for my application) is spot on, with drift fix off. I am fine with sticking with one audio codec; I couldn't care less which it is, as long as it's wideband, synced, and stable.

My concern is that it's not stable even on localhost. I can live with variances once it goes out to the network, but I can't even get a stable encode/decode on a local box. I am totally fine with manually adjusting at the decoder side if I have to, but as I mentioned in my other post #219, I would be happy if I could at least control it with some sort of reliability, and I can't even do that. I can make a test setup available if that is helpful. The lip sync analyzers are not cheap ($20k each), so I could help by giving remote access to one.
We've briefly discussed this request and maybe we can partially implement it. If I understood correctly, the interesting case for both of you is DeckLink->DeckLink transmission (having BMD on both sides), right? Since there is still the scheduled playback (no-low-latency) mode, it can be used for that. Anyway, it will almost certainly be incompatible with drift_fix. It depends on whether the receiving clock is slower, in which case it won't be such a problem, except that the latency will continually increase. If it's the opposite, it will be worse. Depending on the use case, setting bmdDeckLinkConfigClockTimingAdjustment might be a better option anyway - of course only if possible (the device supports it and the output doesn't need to be ref-locked).
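To give a rough sense of the magnitudes involved (this is just back-of-the-envelope arithmetic, not UltraGrid code): even a small frequency offset between the two DeckLink clocks accumulates steadily, which is exactly what a ppm-style adjustment such as bmdDeckLinkConfigClockTimingAdjustment would compensate for.

```python
# Accumulated drift between two nominally identical clocks that differ
# by a given ppm (parts-per-million) frequency offset.
def drift_ms(ppm_offset: float, seconds: float) -> float:
    """Drift in milliseconds accumulated after `seconds` of playback."""
    return seconds * ppm_offset / 1e6 * 1000.0

# A 10 ppm mismatch drifts 36 ms per hour - enough to slowly fill or
# drain a playout buffer unless something compensates:
print(drift_ms(10, 3600))    # 36.0
```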
Hi Martin, I use both Decklink->Decklink and Decklink->Vulkan_sdl
Sure, but I am afraid there is not much that can be done with SW displays - the audio and video frames can be presented at (approximately) the same time, which is how it is done now.
Hi, I've recently merged the synchronous output for DeckLink; it is documented here. I've also removed the older no-low-latency mode, which has evolved into this mode. There are some limitations mentioned in the linked wiki, namely that it will work only decklink->decklink (or with the testcard vidcap as a source). It would be feasible to implement it for other devices as well if requested. Also, I am not sure how it will work in case of clock drift between the DeckLinks (I also don't know how it will play with

Feel free to test - I've only tested in a very basic setup... It is possible that problems appear, but it is a bit tricky for me to build a more real-world setup.
Hi Martin... can you please clarify what
Would it be possible to implement for vulkan display/alsa audio? That suffers from sync issues too.
It is basically the number of buffered frames - preroll and buffer size (the number of frames that get stored before dropping). It is documented in
Would it be possible to implement for vulkan display/alsa audio? That suffers from sync issues too.

Well, ALSA and Vulkan cannot be explicitly synchronized. I can play them at approximately the same time, which is how it is done so far. Can you be more specific about how much desync you are experiencing now, in milliseconds? If it is within one video frame time, I'd say that it is OK. If more, it is rather a question of why - whether frame-level video compression/decompression isn't used, etc.
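For reference, "within one video frame time" translates to the following tolerances (a trivial calculation, not UG code):

```python
# A/V desync tolerance of "one video frame", in milliseconds per rate.
def frame_time_ms(fps: float) -> float:
    return 1000.0 / fps

for fps in (24, 25, 30, 50, 60):
    print(f"{fps} fps -> {frame_time_ms(fps):.2f} ms")
# e.g. 25 fps gives a 40 ms budget, 60 fps only about 16.67 ms
```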
I meant rather displays (or capturers) like AJA and similar, where audio is bound to video and I can tell the device when to play the audio/video.
Thanks Martin, I will test it. Wow... 100 ms, that's a lot! Is there no other way to keep that lower?
If this is not compatible with drift_fix, does it mean it will suffer from audio hits/drops, or does the way this works inherently prevent that?
I don't know the exact amount; I don't have equipment to easily test the latency. So it can be lower or higher... You can also fiddle with the buffer parameters, but if you restrict them too much, it can be at the expense of stability.
I cannot say - you'll need to test. It needs to be said that overflows/underflows aren't inherently caused by UltraGrid but by drifting clocks (UG just doesn't care). As for this, I also cannot fully test it because I cannot reasonably reproduce the drift (well, the 8K Pro AFAIK supports clock adjustment, which could help with testing; I haven't tried it yet). It is also possible that drift_fix would work normally, but then I cannot guarantee it, and I wouldn't recommend it without knowing that it really helps (which is true for drift_fix in general; otherwise it would be enabled by default).
What's the suggestion to prevent the clocks from drifting - a reference signal?
In my opinion you have 3 options if it is a problem:
Do not use source TS as (a base for) RTP TS by default anymore. Since this was essential for synchronized DeckLink playback, require `--incompatible` to enable this.

The reason to disable it for now is that it breaks compressed audio. E.g. for Opus, one receives 2 packets for 40 ms of input. Currently only the first gets the source TS; the second is undefined, thus getting the default, loosely related TS, which may create a TS discontinuity (especially over time, as the PC and DeckLink clocks diverge).

There is no good solution for the above yet. Sending both packets with the same TS and the m-bit on the second isn't sufficient now, because they get joined in the receiver buffer, and e.g. Opus is not self-delimiting, so it would need changes on the receiver side to pass RTP packets to the Opus decoder as Opus packets.

refer to GH-326
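To illustrate the commit message with concrete numbers (a sketch of the described problem with hypothetical values, not UltraGrid code): a 40 ms capture block at 48 kHz becomes two 20 ms Opus packets, and gapless playback requires the second packet's RTP TS to advance by exactly one frame's worth of samples.

```python
SAMPLE_RATE = 48_000
FRAME_MS = 20                                        # Opus frame duration assumed here
SAMPLES_PER_FRAME = SAMPLE_RATE * FRAME_MS // 1000   # 960 samples per packet

def expected_ts(source_ts: int, packet_index: int) -> int:
    """RTP TS the n-th packet of one capture block needs for continuity."""
    return source_ts + packet_index * SAMPLES_PER_FRAME

# One 40 ms block captured at source TS 100_000 yields two packets:
print([expected_ts(100_000, i) for i in range(2)])   # [100000, 100960]
# If the second packet instead gets an unrelated, wall-clock-derived TS,
# the receiver sees a discontinuity that grows as the clocks diverge.
```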
FYI, I've noticed a bad thing about the synchronized mode - it is currently incompatible with any audio compression (I've already updated the wiki page accordingly). It also needs
Oh... that does sound bad. Sending PCM is not ideal, but then neither is de-synchronization.
I'll try to fix it; although it is not completely trivial, I have an idea how to do it. A disadvantage is that it will break compatibility with older decoders, so it will need to be enabled explicitly for some transition period.
I can provide you a remote test environment to validate the accuracy. I have all the tools needed for up to 1080p at the moment, and I can upgrade later for 4K.
Should be fixed now. As already noted, the sender needs
Great, I'll try to test this within the next few weeks. Thanks!
I finally had time to test this, sorry for the long delay.

UltraGrid 1.8+ (master rev 87e0c61 built Oct 2 2023 07:24:04)

encoder:

receiver:

I tried to figure out the syntax for
encoder:

receiver:
Hi Alan,
Do you have any steps to reproduce? I've just tried with a 4K Extreme, BMD 12.7 on U22.04, with the following command:
and it just works as before. If it persists, please create a separate issue for that.
I wouldn't expect that this could work. What would you expect from

Please start with some minimal working example, something like:
(adding other options only if really necessary). I've just tried:
and it doesn't seem to produce errors; using -t decklink with a Hi59 signal also seems to work.
Hi Martin, In my previous post, the audio delays were an oversight carried over from prior behavior, when I had to try to sync things manually.
Anyway, below I've done a very minimal capture and display. And while there are fewer errors than before, there are still distracting video frame holds and audio drops at the DeckLink Missing/Dismissed frames.

both encoder/receiver:

encoder:
receiver:
Using --audio-codec=AAC:bitrate=256K with synchronized definitely makes things even worse, with a constant clicking noise. I thought at one point I saw a commit that fixed compressed audio with synchro. Using PCM (not explicitly defining a codec) increases the data load by about 12 Mb/s for 8-channel SDI; the constant clicking is gone, but I still get the frame drops as above. Thank you again for everything.
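For what it's worth, the ~12 Mb/s figure is consistent with 8 channels of uncompressed 48 kHz audio at 32 bits per sample (a back-of-envelope check; the actual sample depth depends on the capture configuration):

```python
# Uncompressed PCM bitrate in Mb/s for embedded SDI audio.
def pcm_mbps(channels: int, sample_rate: int, bits_per_sample: int) -> float:
    return channels * sample_rate * bits_per_sample / 1e6

print(pcm_mbps(8, 48_000, 16))   # 6.144
print(pcm_mbps(8, 48_000, 24))   # 9.216
print(pcm_mbps(8, 48_000, 32))   # 12.288  (close to the ~12 Mb/s observed)
```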
Hi,
I see, yes, it does. But it isn't defined for the synchronized playback. As such, it may even break the synchronized mode altogether, because it could drop timestamps.
Yes, I actually recall it. It is set by default with
I think it gives the decoder slightly more work, but nothing significant, I believe (it may be a concern only with many channels).
If there is only the remapping, it shouldn't scale. It might only if you are mixing 2 or more channels together to prevent overflows (we also do not do this often, so I am not sure how well it works now).
I'll see if I am able to reproduce it. The thing is that it is quite timing-sensitive, which could be influenced by e.g. the content of the video and/or the network.
Interesting, I'll try to look into it as well. IIRC it may not be related, but I could be wrong. As noted there, I've noticed that only Opus was affected because it is not self-delimiting, which means the new format, where multiple Opus frames share the same timestamp, doesn't work with an older UG receiver. When I was testing, AAC was not entirely content (producing warnings) but seemed to decode properly, at least judging from the output
Thanks for the info.
Hi Martin, Can anything be done about the dropped/missed frames? I don't understand the syntax to adjust the synchronized parameters, but maybe tuning those would help? Also, please listen to this attached recording of the clicking when using synchro with AAC. It is most noticeable at the beginning of the clip.
Hi,
I was able to reproduce. #345
Unfortunately, I was still not able to reproduce it. Just to exclude the network: is it reproducible when running just on the receiver?
If it is, which parameter added (toward your setting) causes it to start malfunctioning? If the problem is caused by network jitter, you can try something like
Hi Martin, Thanks for taking a look, and I saw the new issue. Strange that it is seemingly only 24 fps. Regarding the DeckLink sync dismissed frames: in the log output above, it shows 100% of packets received, so would that not eliminate the network as an issue? Could you explain the options for synchro some more, please? I don't understand what the 2 values mean. Thanks,
It is not only 24 - it actually happens for every frame time not evenly divisible by 20 ms. Unfortunately, I've tested with 25p only, and it works nicely there because the frame time is evenly divisible by 20 ms. Where it isn't, it is more difficult.

Params: the first value is the initial filling; you can try less than 3, but generally it doesn't work well. More gives you better stability at the expense of higher latency. The second value is the maximal buffer size; there is nothing more to it. The delay then fluctuates somewhere between those 2 values - if it would drop below the minimum, the last frame is re-scheduled; in the opposite case, the frame is dropped.
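My reading of that description, as a toy model (the names and exact policy are my guesses from the explanation above, not UltraGrid source code): the buffer prerolls until it holds the first value's worth of frames, re-schedules the last frame when it runs empty, and drops incoming frames when it is at the second value.

```python
def simulate(arrivals, initial_fill=3, max_size=6):
    """arrivals: one bool per output frame period (True = frame arrived).
    Returns (repeats, drops) - re-scheduled and dropped frame counts."""
    buffered = 0
    prerolled = False
    repeats = drops = 0
    for arrived in arrivals:
        if arrived:
            if buffered >= max_size:
                drops += 1        # buffer at maximum -> incoming frame dropped
            else:
                buffered += 1
        if not prerolled:         # still filling the initial buffer
            prerolled = buffered >= initial_fill
            continue
        if buffered > 0:
            buffered -= 1         # play the next frame
        else:
            repeats += 1          # underrun -> last frame re-scheduled

    return repeats, drops

# Steady input never underruns or overflows:
print(simulate([True] * 10))               # (0, 0)
# A 4-period gap after preroll drains the 3-frame buffer, then one repeat:
print(simulate([True] * 3 + [False] * 4))  # (1, 0)
```

A larger initial fill delays the first underrun at the cost of a proportionally higher latency, which matches the stability/latency trade-off described above.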
OK, this is working well except for the following condition. For reasons unknown, sometimes the encoder goes just under realtime. When this happens, we get the DeckLink Missing Frame warning. The end result is that for a frame or two the video stutters, which is expected, but the audio cuts out. My recollection of the behavior prior to "sync" is that the audio would keep going. Having the audio drop is actually more disturbing than a frame or two of video stutter. There is probably no way around that with "sync", but a "keep audio playing" mode would be good.

Encoder:

Receiver:

I'm also testing Ubuntu 23.10 on a NUC12, since oneVPL is supported using the native repos, though the newer 6.5 kernel may have a performance regression. With Ubuntu 22.04, I don't recall having encoder FPS drops like I am seeing now.
I believe this to be solved. Closing.
Hey everyone, I'm finally back in the office and can do some testing on this with a professional lip sync analyzer. I tried all of the above and was not able to get a stable test setup with either the continuous or the release build. The option --param incompatible does not exist, or at least I am putting it in the wrong spot, which I don't think I am. Just as an FYI, using UG on a local network or on the same machine shows somewhere between 80-110 ms of drift in the audio. I was told the broadcast industry standard is within one frame, sometimes accepting two frames of mismatch.

Here is what I am testing:

./UltraGrid-continuous-x86_64.AppImage -t decklink:4 -c libavcodec:encoder=libx264:bitrate=8000k -s embedded --audio-codec=MP3:sample_rate=48000:bitrate=128k --audio-capture-format channels=2 192.168.99.123 -P 10000

and

./UltraGrid-continuous-x86_64.AppImage -d decklink:sync:device=0 -r embedded 192.168.99.123 -P 10000

When running with the sync option, I get flooded with overflow messages and occasional missing-frame messages. Surprisingly, when running without the sync option it works "as UG should", but as of the last release and continuous build I am now seeing random decoding errors:

[lavc h264 @ 0x7f339413c0c0] corrupted macroblock 69 24 (total_coeff=-1)

Suggestions?
I've opened a new issue #362, please refer there.
Hello,
Getting reliable V/A synchronization with UG can be a challenge; it can in fact drift from minute to minute. It requires a lot of trial and error playing with delay settings, and then on one run when you think you have found it, the next run it is out of sync again. Sometimes these are small values that might be hard for a normal person to detect, and other times it is quite off. I believe this stems from the fact that in UG, V and A are completely separate pipelines, even being transmitted separately. So the request is to add some mechanism/timestamp for V & A so that even if they are captured and transmitted separately, the receiver will inherently synchronize them - or just replace the underlying transport mechanism with something like MPEG-TS, which has V/A packetized together.
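A minimal sketch of the requested receiver-side behavior (entirely hypothetical; the function name and the timestamp model are mine, not UltraGrid's): both pipelines stamp media with a shared capture clock, and the receiver presents each video frame when the audio playout position reaches that frame's timestamp.

```python
def presentation_schedule(video_ts, audio_ts):
    """Pair each video frame with the first audio position >= its capture TS.
    Both lists hold capture timestamps in ms from a shared clock, ascending."""
    pairs, i = [], 0
    for vts in video_ts:
        while i < len(audio_ts) and audio_ts[i] < vts:
            i += 1
        pairs.append((vts, audio_ts[i] if i < len(audio_ts) else None))
    return pairs

# 25p video (40 ms frames) against audio played out in 20 ms chunks:
print(presentation_schedule([0, 40, 80], [0, 20, 40, 60, 80, 100]))
# [(0, 0), (40, 40), (80, 80)] - each frame lands exactly on an audio position
```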
Let's discuss.
Thanks,
Alan