
Expectations/requirements for VideoFrame and AudioData timestamps #80

Open

chcunningham opened this issue Feb 9, 2022 · 4 comments

@chcunningham
Is it valid to append multiple VideoFrames or AudioData objects with the same timestamp (e.g. timestamp = 0) to a MediaStreamTrack? If so, what is the behavior? Does the spec describe this?
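For concreteness, a minimal sketch of the case being asked about, assuming the MediaStreamTrackGenerator sink from mediacapture-transform (Chromium's shipped shape of the API) and placeholder RGBA pixel data; run inside an async context:

```ts
// Two VideoFrames sharing timestamp 0, written to the same track.
// What the sink should do with the second write is the open question.
const width = 640, height = 480;
const rgba = new Uint8Array(width * height * 4); // placeholder pixel data

const frameA = new VideoFrame(rgba, {
  format: 'RGBA', codedWidth: width, codedHeight: height, timestamp: 0,
});
const frameB = new VideoFrame(rgba, {
  format: 'RGBA', codedWidth: width, codedHeight: height, timestamp: 0,
});

const generator = new MediaStreamTrackGenerator({ kind: 'video' });
const writer = generator.writable.getWriter();
await writer.write(frameA); // presumably rendered
await writer.write(frameB); // same timestamp: replaced? dropped? unspecified today
```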

@aboba (Contributor) commented Feb 9, 2022

The mediacapture-transform specification does not currently describe how timestamp is processed.

Related: #96

Potential future issue when spatial scalability is supported:

With spatial scalability, you can have multiple encodedChunks with the same timestamp (e.g. base layer as well as spatial enhancement layers). Does this result in the decoder producing multiple VideoFrames with the same timestamp? Or does the decoder wait until encodedChunk.timestamp advances before providing a single VideoFrame combining all the layers provided?

Currently, we do not configure the operating point in the WebCodecs decoder, so the decoder knows neither the desired operating point nor the layers that the operating point depends on. At any given timestamp, the decoder could therefore be provided with just a base-layer encodedChunk, or with the base layer plus some spatial enhancement layers. It can only know what it has to work with once the timestamp of the encodedChunks advances (which adds delay), or if it is configured with the operating point (in which case it can start decoding once it has been provided with all the layers that the operating point depends on).
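A hedged sketch of that decoder-side ambiguity. The codec string and the baseLayer/enhancementLayer payloads are hypothetical, and, as noted above, WebCodecs has no way today to state an operating point in configure():

```ts
declare const baseLayer: Uint8Array;        // hypothetical encoded base-layer payload
declare const enhancementLayer: Uint8Array; // hypothetical spatial-enhancement payload

const decoder = new VideoDecoder({
  output: (frame) => {
    // With spatial scalability this callback might fire more than once for
    // timestamp 0, or once after the timestamp advances; the point of this
    // issue is that neither behavior is pinned down.
    console.log(`frame at ${frame.timestamp}: ${frame.codedWidth}x${frame.codedHeight}`);
    frame.close();
  },
  error: (e) => console.error(e),
});

decoder.configure({ codec: 'vp09.00.10.08' }); // nowhere to state an operating point

// Both chunks carry the same timestamp: base layer plus one enhancement layer.
decoder.decode(new EncodedVideoChunk({ type: 'key', timestamp: 0, data: baseLayer }));
decoder.decode(new EncodedVideoChunk({ type: 'delta', timestamp: 0, data: enhancementLayer }));
```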

@chcunningham (Author)

> With spatial scalability, you can have multiple encodedChunks with the same timestamp (e.g. base layer as well as spatial enhancement layers). Does this result in the decoder producing multiple VideoFrames with the same timestamp? Or does the decoder wait until encodedChunk.timestamp advances before providing a single VideoFrame combining all the layers provided?

In this case the decoder would produce multiple VideoFrames with the same timestamp, but authors would be expected to discard many of these, passing only their desired resolution to the MediaStreamTrackGenerator (MSTG).
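A sketch of that author-side filtering under assumed application state (DESIRED_WIDTH is a hypothetical constant naming the operating point the app wants); only the chosen layer is forwarded to the MSTG, and the other same-timestamp frames are closed:

```ts
const DESIRED_WIDTH = 1280; // assumed: the resolution the app wants to render

const generator = new MediaStreamTrackGenerator({ kind: 'video' });
const writer = generator.writable.getWriter();

const decoder = new VideoDecoder({
  output: (frame) => {
    if (frame.codedWidth === DESIRED_WIDTH) {
      writer.write(frame); // forward only the desired resolution to the MSTG
    } else {
      frame.close(); // discard the other layers decoded at the same timestamp
    }
  },
  error: (e) => console.error(e),
});
```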

@dontcallmedom-bot

This issue had an associated resolution in the WebRTC meeting of 19 November 2024 (Issue #80: Expectations/Requirements for VideoFrame and AudioData timestamps):

RESOLUTION: Add to mediacapture-main an extensibility consideration to make sure sinks define their behavior on frame timestamps, and file issues on sink specs accordingly
