We shouldn't require track transferability #113
Comments
What problem is this solving? Transferring the track has other benefits like being able to apply constraints and read track stats and settings. |
It removes the requirement for track transferability in order to implement this API, a requirement we consider a blocker, at least for the medium term. This proposal uses a pattern that we are already using in encoded transform and should easily allow us to have interoperable implementations. |
One use case for transferring media stream track is to create a track (via VideoTrackGenerator) and send it to sinks like RTCRtpSender or MediaRecorder. |
The proposal is that createVideoTrackGenerator() is called from Window and returns a promise with a MediaStreamTrack on Window, where the RTCRtpSender or MediaRecorder are. The generator (which no longer has a track field) is created in the worker (the application gets it via an event, just like an RTCRtpScriptTransformer). This removes the need to transfer the track. You only needed to transfer the track from the worker to window because the current spec creates the track on the worker, where it is largely useless, as all the track APIs are on Window. |
Another benefit of this API surface is that it allows feature detection on main without creating a worker. |
This can be feature detected on main like this:
function isMstTransferable() {
  try {
    // Any track works as a probe; a canvas capture needs no permission prompt.
    const [track] = document.createElement('canvas').captureStream().getVideoTracks();
    // Transferring throws DataCloneError where tracks are not transferable.
    new MessageChannel().port1.postMessage(track, [track]);
    return true;
  } catch (e) {
    if (e.name !== "DataCloneError") throw e;
    return false;
  }
} |
@guidou why is transfer a blocker? What do you mean by medium term? Safari has already shipped this and it works. If you explain the problem, perhaps their engineers can help? |
It's a blocker for Chromium to ship it in the short term, since Chromium doesn't implement track transferability and will not have it for quite some time. As a result, I guess we won't have an interoperable API for a long time. |
This feature-detects track transferability, not mediacapture-transform. |
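For contrast, detecting mediacapture-transform itself means checking the global scope where its interfaces are exposed. A minimal sketch, assuming the current spec's exposure of `MediaStreamTrackProcessor` and `VideoTrackGenerator` in dedicated workers only. The helper name `detectTransformSupport` is hypothetical, and the scope is passed in as a parameter so the check can run on whichever global the caller has:

```javascript
// Hypothetical helper: reports which mediacapture-transform interfaces
// are constructors on the given global scope (e.g. `self` in a worker).
function detectTransformSupport(scope) {
  return {
    processor: typeof scope.MediaStreamTrackProcessor === "function",
    generator: typeof scope.VideoTrackGenerator === "function",
  };
}
```

Inside a dedicated worker this would be called as `detectTransformSupport(self)`, with the result posted back to the page. That round trip through a worker is the cost being discussed in this thread.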
I doubt that's needed. As you said, "tracks are useless on workers except for this API". If MST transfer is detected, it seems reasonable to assume some purpose awaits these tracks in the worker. This works in the only current implementation: "WebKit for Safari 18 beta adds support for MediaStreamTrack processing in a dedicated worker." This seems like a property worth emulating. I've added a note to Firefox's implementation bug to do the same. Thanks for bringing attention to this! |
If there's some difficulty or problem with the spec's transfer steps as specified, please bring it to our attention so we can address it. |
Yes, but waiting on postMessage for these measurements hardly seems ideal. In the current spec, the worker transform can inspect real-time track stats counters like |
That goes the other way too if you want access to the track on Window (which is the more common case today).
I'm not opposed to supporting transferability. I'm opposed to making it a requirement to use mediacapture-transform, as that will have the practical consequence of delaying interoperable implementations. We already have a pattern for adding worker support without requiring transferability of tracks or streams. This doesn't mean applications are forbidden from transferring tracks on browsers that support it if they want to. |
No, because tracks can be cloned. With transfer, stats are readily available in both places. So the problem of a transformer needing a roundtrip to main to read settings and applyConstraints, for lack of transfer, would be new with this proposal. |
Great! Since you said tracks are useless on workers except for the worker API, does this mean you support the worker API?
It already is a requirement.
I doubt attempting to standardize a third new API and waiting for three implementations will get us to interop quicker. Safari has shipped, and Firefox is working on it. 1½ < 3 + one WG. I've filed w3c/mediacapture-extensions#158 to help. Creating a permanent web API to solve one implementer's short-term scheduling seems against § 1.7. Add new capabilities with care and § 1.9. Leave the web better than you found it.
Having web developers navigate between 3 instead of 2 different APIs to do the same thing sounds worse, not better. |
What is the Worker API?
An artificial requirement. It would be very easy to have a spec that does not require track transferability for worker support. That also applies to new implementations (or updating existing ones), since the proposed approach is based on pre-existing patterns already implemented by all major browser engines.
I also doubt an API that ignores developer requirements and concerns by at least one implementor will get us to interop.
w3c/mediacapture-extensions#158 does not address this issue.
The specific change proposed in this issue is not about "short-term" scheduling. It is to make the API better. Ignoring the needs of web page authors and at least one user agent implementor, which the current API does overall, is directly against 1.1. Put user needs first (Priority of Constituencies). User needs come before the needs of web page authors, which come before the needs of user agent implementors, which come before the needs of specification writers, which come before theoretical purity. The track transferability requirement is IMO the opposite of § 1.7. Add new capabilities with care. That principle refers to adding "new capabilities to the web with consideration of existing functionality and content". Adding a feature that requires a dependency on another feature is not better than adding the feature following existing patterns that don't require such dependency.
What 3 different APIs? Are you referring to the requirement of using AudioWorklet for audio processing, which is a different API that, in addition, is not suitable for all types of processing? |
This is not artificial; transferring a track to a worker has real benefits compared to the approach you mention. First, lifetime management is easier. Second, configuration management. This has real consequences for users: a few frames will likely be missed by VideoTrackGenerator when a getUserMedia track gets unmuted if the web app has to postMessage. With the worker approach, missing frames would be a bug in the UA implementation. The same principle applies to Finally, we introduced MediaStreamTrack transferability as a way to cover some longer term use cases (grabbing the camera in one iframe but doing rendering/processing in another iframe). The current spec is more future-proof from that point of view as well.
Right, I think user needs will likely be better served with the current API, as described above.
What are the developer requirements that have been ignored? |
Yes, it is an artificial requirement. If you have a use case where having the track in the worker is useful, then that can be very valid, but it doesn't justify making transferability a requirement for mediacapture-transform. I didn't say track transferability is an artificial feature. I'm just saying it is an artificial requirement for mediacapture-transform that, in addition, is often detrimental.
You don't need transferability as a requirement to support this use case.
I don't think this is an actual problem because if the getUserMedia track is muted, it will produce no frames and the VideoTrackGenerator will see no frames.
The same applies. I haven't heard developers request this, but even if it's a useful use case, you don't need transferability as a requirement to support this. You just need transferability.
That is an actual use case for which I have seen developer demand and IMO is the main value track transferability can provide. This is completely independent of having track transferability as a requirement for mediacapture-transform.
User needs are not better served by having transferability as a requirement. On the other hand, requiring transferability does make things difficult for some common use cases. The most obvious is playing both the gUM and VTG tracks on a video element in a before/after effects view. In this case you no longer have the gUM track on Window and therefore can't play it on an element. The same applies if you want to use any other track sink available only on Window.
All the arguments I've seen so far are benefits of transferability as a standalone feature. None of these benefits are derived from that transferability being a requirement for mediacapture-transform. So, transferability as a standalone feature supports both the use cases you presented and the ones I presented, but transferability as a mediacapture-transform requirement only supports the use cases you presented and fails to properly support the ones I presented.
Here are some developer requirements that are well known to us and which are ignored by the current version of the spec (not all of these are related to the issue we're discussing which is track transferability as a mediacapture-transform requirement):
|
Let's only talk about the requirements that are relevant to this particular issue (audio support and processing on window are out of scope).
The before/after view can be implemented by transferring a clone of the track instead of transferring the track itself.
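A minimal sketch of that clone-and-transfer approach, assuming a browser where `MediaStreamTrack` is transferable. The helper name `sendCloneToWorker` and the `{track}` message shape are hypothetical:

```javascript
// Hypothetical helper: keeps `track` on Window for the "before" view and
// transfers a clone to the worker for processing.
function sendCloneToWorker(track, worker) {
  const clone = track.clone();
  // Throws DataCloneError where MediaStreamTrack is not transferable.
  worker.postMessage({ track: clone }, [clone]);
  return track;
}
```

The returned original can then be attached to an element on Window, for example `beforeVideo.srcObject = new MediaStream([sendCloneToWorker(track, worker)])`, where `beforeVideo` is an assumed `<video>` element.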
@jan-ivar provided a feature detection approach that works in Safari (and will likely work in Firefox). I am sympathetic to the needs of browser implementors. So far though, I haven't seen new information that warrants revisiting the design of this API. |
I think this API is shortsighted. It's tightly coupled, artificially tied to main thread, and it reinvents postMessage. Our goal is to enable MediaStreamTrack processing in dedicated workers. This might include MediaStreamTracks originating in the worker someday, e.g. from an OffscreenCanvas.captureStream() or other sources already exposed in the worker. Or an RTCDataChannel in a worker feeding a VideoTrackGenerator created there. Since we all agree MediaStreamTracks will exist in workers eventually, the simplest API is the one that accepts them there. The idiomatic way to get data to workers is with postMessage, using transferable objects if needed. So I disagree we shouldn't depend on other web platform features. It's doing it all ourselves that's the mistake. At least that's how I read § 1.7 Add new capabilities with care. |
I thought the plan was to summarize our positions in a separate issue and ask TAG for their opinion, but here's my reply.
This API is not tightly coupled with anything. If a developer wants to transfer a track to a Worker and manage all its state and lifetime there, there is nothing in the proposed API preventing it. Just like nothing prevents developers from managing the track on Window if that is what they prefer. The one that forces developers to use track transferability even if they'd rather not use it is the one that tightly couples two features that should be independent of each other.
This API does not reinvent postMessage any more than encoded transform does. If this is such a bad thing, should I file an issue in encoded transform to eliminate the same pattern there and require that RTCRtpSender and RTCRtpReceiver (or some other object) be transferable too?
All this can be supported without tightly coupling mediacapture-transform with track transferability.
No, it's not the simplest API. It is a lot more complex to tightly couple two features that should be independent. Even more importantly, the proposed API does not even need to be a replacement for the existing one.
Again I ask, why is this a problem? Is encoded transform non-idiomatic? Should we eliminate the RTCRtpScriptTransform constructor and introduce a new transferable object there to be used with postMessage, or make senders and receivers transferable?
The mistake is to force a dependency on another feature that should be independent.
We read it very differently. Adding dependencies between features that should be orthogonal and forcing developers to use complex workarounds to deal with those unnecessary dependencies is, in my view, the opposite of adding capabilities with care. |
@guidou I appreciate your efforts to simplify the API, but I believe your proposal introduces more complexity rather than reducing it. It seems unclear whether your proposed API is intended to replace the existing MediaCapture Transform API or to coexist alongside it. If it's meant to coexist, then we're asking developers to navigate between multiple APIs that achieve similar goals, which can lead to confusion and fragmentation. This also increases the burden on browser implementers to support multiple APIs, delaying interoperability. If it's meant to replace the existing API, it disregards the implementations already shipped in Safari and in progress in Firefox, which would fragment the ecosystem further and negate the developer feedback we've already received. Moreover, your proposal doesn't seem to stand on its own because it doesn't cover all the use cases the current API does — particularly future scenarios where tracks originate in workers or need to be fully managed within a worker context. Requiring track transferability isn't an unnecessary dependency; it's a design choice that provides significant benefits to developers, such as simplified lifetime and configuration management, as well as access to track stats and settings directly within the worker. Adding another API also goes against the web platform design principles of keeping the platform consistent and avoiding unnecessary complexity. I believe it's better for us to focus on implementing the existing API consistently across browsers and addressing any implementation challenges together, rather than introducing an alternative that could fragment the ecosystem. |
Can you elaborate on how it is more complex? Especially for Web developers.
It can be both. I would prefer replace.
It would be better to replace, but since there is one implementation, coexist seems acceptable.
For this reason, coexist would be acceptable.
I'm talking about real use cases deployed in production right now, not hypothetical ones that might never exist. I believe the former should have more weight than the latter in the design of the WG's APIs.
Requiring track transferability for use cases that don't need it is indeed an unnecessary dependency.
There is no simplified lifetime and configuration management. Any use case that prefers to manage track lifetime on Window (basically all use cases deployed today) requires much more complex lifetime management with the existing API.
Requiring track transferability for use cases that don't need it is precisely adding unnecessary complexity.
The ecosystem is already fragmented. This proposal might have the side effect of making it easier to reduce that fragmentation as it makes it possible for UA implementors to develop two features independently using patterns that are already implemented and tested. Forcing two features that can (and should) be orthogonal to have a dependency such that one has to be implemented before the other does nothing to help reduce the already existing fragmentation. Finally, I think we have reached the point in which we are just repeating the same arguments without achieving consensus. Shouldn't we go ahead with the plan to file our positions in separate issues and get TAG's input? |
Encoded transform is bespoke.
No, because unique tradeoffs were involved, and that FPWD has already shipped in two browsers.
I believe it's on the person filing the issue to produce a convincing problem that needs fixing. Otherwise I see no new information since FPWD that warrants revisiting the design of this API. Usage of the spec API seems fine, as seen in this blog. |
In what way does that not apply here?
There is a similar tradeoff here. There is nothing magical about FPWDs such that they cannot be improved.
The use case where the application needs the track on Window is a convincing one.
Doesn't seem particularly fine to me. |
Well, we are very far from FPWD, the spec reached consensus within the WebRTC WG and has been very stable for a few years now.
So far, I have not seen a use case where the API you propose provides more than the existing API. Could you be more specific? For instance, the API you are proposing is most probably shimmable using the current API. Taking a real example to compare the two APIs as an exercise, we could take the use case of doing real time encoding and sending of a video track to the network (using data channel, web transport, or the future encoded source proposal). This requires the web page to potentially adapt the frame rate and/or resolution to the network conditions. With the current API, the adaptation logic is all happening solely in the worker via
Could you specify which use case you have in mind and what advantages you see? |
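As an illustration of the worker-only adaptation loop described above, here is a minimal sketch. It assumes the track has been transferred to the worker and that `applyConstraints()` and `getSettings()` are available on it there, as stated earlier in the thread. The bitrate thresholds, target resolutions, and the names `constraintsForBitrate` and `adaptTrack` are hypothetical:

```javascript
// Hypothetical policy: maps an estimated bitrate (kbps) to capture
// constraints. Thresholds and targets are illustrative assumptions.
function constraintsForBitrate(kbps) {
  if (kbps < 300) return { width: 320, height: 180, frameRate: 15 };
  if (kbps < 1000) return { width: 640, height: 360, frameRate: 24 };
  return { width: 1280, height: 720, frameRate: 30 };
}

// Applies the policy directly to a track held in the worker and
// resolves with the settings the track ended up with.
async function adaptTrack(track, kbps) {
  await track.applyConstraints(constraintsForBitrate(kbps));
  return track.getSettings();
}
```

With the track on another thread, the same call would instead need a message to that thread and a reply, which is the round trip being weighed in this exchange.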
That there are two implementations doesn't prevent changing the spec to improve it. We do that all the time with specs that are much more mature than this one.
All existing applications today (including the ones that do video processing) currently do so with the tracks on Window, where they manage all the logic. Forcing them to migrate to manage the track in the worker is an unnecessary barrier.
That sounds more like a hypothetical example than a real one. Either way, my proposal supports it fine. There is nothing that prevents transferring the track to the worker in my proposal, so saying that some use case works better when you transfer the track is not an argument against the proposal. You would have to show evidence that forcing the transfer of the track is better than making it optional in all, or at least most use cases.
The spec version forces a pattern of transferring a track and managing its lifetime there, while historically all applications manage tracks on Window. My proposal makes it easy to support both. Forcing the transfer of the track to the worker is not necessarily a good pattern, especially for existing applications. The use case for managing the track on Window is all existing applications. |
What is proposed is an entire rewrite of the API/WebIDL, which means rewriting a large part of the spec.
It would help immensely if we had solid proof that the new API is hard to use. AIUI, the current API and the proposed API have the same feature set, so there should be no impact on end users. All we are debating is ease of use of the two APIs. We should compare this potential ease-of-use benefit with the known amount of work building the new proposal would require:
This is a big ask. |
Nothing of the sort. The proposal is basically adding a new factory method for MSTP/VTG. No need to remove anything.
Safari implements it. Are you aware of any major applications moving from canvas capture to the new API? Can you share any adoption numbers?
The proposed API is just an additional factory method so that you can keep the tracks on Window, which will make it easier to migrate existing applications and develop new ones following commonly used patterns with MediaStreamTracks.
The needs of users and Web developers are above the needs of browser implementers and spec writers in the priority of constituencies so I don't see this as a blocker. |
If we are not removing the existing API, why add an API that can be shimmed on top of the existing API?
This is still early as Safari shipped it recently. |
It removes the transferability requirement and makes it easy to do feature detection without creating a worker.
It would be good to ask those potential users if they would prefer an API that forces them to transfer tracks to a worker or one that lets them choose between Worker and Window. |
It's a bit more than a factory method. For comparison, here's a trivial factory function (works in Safari):
async function createVideoTrackGeneratorAndProcessor(worker, track) {
  // Transfer a clone so the original track stays usable on this thread.
  const before = track.clone();
  worker.postMessage({before}, [before]);
  // Wait for the worker's reply carrying the processed track.
  const {data} = await new Promise(r => worker.addEventListener("message", r, {once: true}));
  return data.after;
}
But this differs from what @guidou seems to want: the worker reacting to Having processing react to tracks on other threads seems more complex than requiring the track to be on the same thread (Safari stops processing if the It feels like we haven't thought past main thread here. Asking websites to postMessage constraints seems simpler. |
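A worker-side counterpart to that factory function could look roughly like the sketch below. It assumes the current spec's `MediaStreamTrackProcessor` and `VideoTrackGenerator` are available on the worker global and reuses the `{before}` / `{after}` message shape from the function above. The helper name `installProcessing` and its `transform` parameter are hypothetical:

```javascript
// Hypothetical worker-side helper. `scope` is the worker global (`self`);
// `transform` is a (frame, controller) callback as used by TransformStream.
function installProcessing(scope, transform) {
  scope.onmessage = async ({ data: { before } }) => {
    const processor = new scope.MediaStreamTrackProcessor({ track: before });
    const generator = new scope.VideoTrackGenerator();
    // Hand the generated track back to the page right away.
    scope.postMessage({ after: generator.track }, [generator.track]);
    // Pipe frames from the source track through the transform into the generator.
    await processor.readable
      .pipeThrough(new TransformStream({ transform }))
      .pipeTo(generator.writable);
  };
}
```

In the worker script this would be installed with something like `installProcessing(self, (frame, controller) => controller.enqueue(frame))`, which passes frames through unchanged.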
The current version of the API requires track transferability, but this shouldn't be necessary.
Currently, tracks are useless on workers except for this API, so we shouldn't add that as a requirement.
One way to keep the API worker-first, which has several benefits, is to follow the postMessage-like approach of webrtc-encoded-transform.
Something (subject to discussion) like:
For MediaStreamTrackProcessor:
For VideoTrackGenerator: