Store videos in different qualities, audio-only version #3806

Closed
traumschule opened this issue May 28, 2022 · 5 comments

Comments

@traumschule
Contributor

traumschule commented May 28, 2022

Runtime update to support multiple versions per video asset, to increase compatibility (encoding, bitrate, audio-only/video-only) and to enable applications like podcasts, kiosk mode, skimming, etc.

Rationale

The Joystream CDN is currently bitrate-agnostic and does not reprocess assets, so for example very high definition videos are not re-encoded for consumption on mobile devices or by users with slow connections.

Status Quo

Query to fetch duration, size and codec for videos:

query {
  videos {
    id
    duration
    mediaMetadata {
      pixelWidth
      pixelHeight
      size
      encoding { codecName }
    }
  }
}
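
A query like the one above can be run directly against a query node's GraphQL endpoint, for example with curl (the endpoint URL below is only an example; substitute whichever query node you use):

curl -s -X POST https://query.joystream.org/graphql \
  -H 'Content-Type: application/json' \
  -d '{"query":"{ videos { id duration mediaMetadata { pixelWidth pixelHeight size encoding { codecName } } } }"}'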

This shows the video bitrate of the actual video files stored on a storage node:

# Print the bitrate of the first video stream for every uploaded asset
for file in /home/joystream-storage/uploads/*
do
  echo -n "$file: "
  ffprobe -i "$file" -v quiet -select_streams v:0 \
    -show_entries stream=bit_rate -of default=noprint_wrappers=1:nokey=1
done

Scope

Store uploaded videos in known YouTube quality levels, starting with what are possibly the most common: 144p and 720p.

[image: video-quality]
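
As a rough illustration of the processing this implies, the sketch below produces a 144p and a 720p rendition with ffmpeg (assumptions: ffmpeg with libx264/aac is available on the processing host; the bitrates and output names are placeholders, not a proposed standard):

# Produce a 144p and a 720p H.264/AAC rendition of an uploaded source file
src=/home/joystream-storage/uploads/SOURCE_ID

ffmpeg -i "$src" -c:v libx264 -vf scale=-2:144 -b:v 150k  -c:a aac -b:a 64k  "${src}_144p.mp4"
ffmpeg -i "$src" -c:v libx264 -vf scale=-2:720 -b:v 2500k -c:a aac -b:a 128k "${src}_720p.mp4"

# An audio-only version for the podcast use case simply drops the video stream
ffmpeg -i "$src" -vn -c:a aac -b:a 128k "${src}_audio.m4a"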

Sounds really cool, but do we really want to do all things like this client-side in the long term? I mean extraction of images, possibly transcoding and other stuff. I think this can get complicated quite quickly. Do we have any long-term plans to support different resolutions per video so the experience can be optimized? I would expect us to start introducing some features like that on the storage provider side, some kind of processing of assets, otherwise we can end up with a lot of unnecessary uploads from the client.
I think the near term way to handle multiple resolutions would just be to support uploading multiple assets, and then requiring the user to have done that in advance. With that, we could at least take advantage of this feature on the consumer side, and we could automatically deal with it in the YT-synch feature we are planning.

Allow a publisher to provide more than one media file, either during original publishing or at a later time.
Basically, allow the viewer to see the resolution of the media currently being viewed, discover whether other versions exist, and have the opportunity to switch during playback.
I don't believe this feature is a precondition for launching youtube-synch, or even mainnet, but it seems like a sensible feature enhancement, and some of the exploratory engineering R&D can be done as a filler task at any time.
Maybe instead of doing this client-side we could figure out how to enable server-side support with minimal work? I think we could even do this separately from Orion, as a completely separate service that would later be integrated into Orion.

Processing

This is the easy, though time-consuming, part.
Which actors can be trusted to reprocess all videos?

  • Gateways: do not exist yet and their concept needs to be fully fleshed out first.
  • Outsourcing this task was discussed and discarded as too complex (see below)

Workflow

Draft to figure out which role gateways would play in this:

  1. creator A registers with some special Joystream Processing Service (JPS) via their membership account
  2. creator uploads one or more video file(s) to the JPS via their UI and selects target formats
  3. creator issues an on-chain transaction to pay the JPS for transcoding
  4. JPS generates all desired formats and informs A, for example via email, when it is done
  5. in the meantime A prepares all metadata and now creates the video on-chain
  6. A signs the transaction in the JPS UI (website/cli/app), which uploads all versions to an SP
    There can be multiple JPS, and they could be regional Gateway Providers that also take care of other obligations along the way based on local laws (KYC, content screening, IP/copyright checks).
    In any case, the chain needs to be able to link multiple media assets to one video. Ideally this happens rather short-term (or at least before mainnet) to give possible investors time to develop their infrastructure around it. It also matters for auditing, which is already on the way for other features and takes time.

RICE estimates

Would require:

  • updating the runtime to allow Orion to do the upload after transcoding is done
  • updating the query node to reflect this change
  • some way to do authentication between user and Orion, which is unclear, as seen in the discussion above
  • adding the capability for Orion to accept, store and dispatch work to some cloud transcoding service (like AWS Transcoding), monitor progress, expose this progress in an API for the end-user, and complete the final upload to the storage system when done
  • updating the Atlas design to represent the state of interaction with Orion on these uploads
  • updating Atlas to know when to use this service and when not to, hopefully without involving the user
  • updating Atlas to use its already designed player which allows resolution controls
  • updating Atlas to have some reasonable policy for selecting which encoding to use for playback by default, given the device and connection

Gateway ideas

There is an increasing list of enhancements to Atlas that may warrant the introduction of a proto-Gateway node, well in advance of the core access-financing function that Gateways are supposed to serve.

Gateways are a future infrastructure service, operated by what will be called gateway operators. They are responsible for the last mile between the blockchain economics and infrastructure and a mainstream consumer audience. This is not a part of the system which is currently fully designed or built. Briefly put, the role of a gateway is to achieve the following goals:
Glue: Deliver the product features which are too hard to prioritise facilitating through decentralised infrastructure accountable to the DAO directly. So this is stuff like post-processing of videos, like transcoding or preview image extraction, translation, etc.
Currently, the Orion node, which holds viewership statistics and content featuring policy information, is expected to become the precursor for the gateway node. A good introduction to some of these topics can be found in this video: https://play.joystream.org/video/1266
Transcoding: Currently there is no ability for uploaded videos to be converted into a range of different resolutions and encoding formats, which would make it easier for users to consume the content in a way that suits their device and product.

A downside of leaving this to off-chain, Gateway-internal processes is that it may defeat the CDN's advantage of storing and distributing different versions per bag and asset.

Storage

Open questions:

  • How to store multiple versions per video on chain?
  • Who uploads versions to the SP, i.e. who signs the transaction and owns the video file?
  • Either the creator has to accept and sign off after processing, or the process is opaque to users and has to happen automatically via trusted entities (gateways).
  • If there is more than one GP, will each bag be assigned to exactly one GP? If not, who is the authority, or how do they negotiate the correct version?

=> update the storage system to be able to save multiple versions per "video"
=> update the QN to support the queries needed by consumer apps (available versions with bitrate by asset id)
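
To make the second point concrete, a consumer app would need a lookup along these lines. This is only a sketch: mediaVersions and its fields are hypothetical names used to illustrate the shape of the data, not part of the current QN schema.

# Hypothetical QN query: list the available versions (with bitrate) for one video
curl -s -X POST "$QN_ENDPOINT" -H 'Content-Type: application/json' -d '{
  "query": "{ videoByUniqueInput(where: { id: \"123\" }) { id mediaVersions { dataObjectId pixelHeight bitrate encoding } } }"
}'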

Discussion

July 2021

First discussed in July 2021, when Operations was asked to look into client-side transcoding.

@l1dev hey, are you going to be able to tackle those two problems I asked about? benchmarking client side transcoding and preview extraction?
for the transcoding, we should be picking some output format similar to what YT has, as it is likely already the optimal tradeoff between computational load and browser support

I think it is a terrible idea. Can set up a server for transcoding instead if you are interested.

a) Can you please unpack what specifically is terrible about one or both of the ideas?
b) The reason these ideas are interesting from my point of view is that, at least in principle, they could possibly work in the main production system (depending on what we find about speed and cost). You running a server is not really an alternative for mainnet, it would just be a bandaid. If someone, or multiple people, are going to be running transcoding servers on mainnet, then that is a much bigger and more complex idea that I would like to avoid; we already have enough infrastructure operated by the DAO, it is getting very complex. The only alternative I see here is if Gateways have to run their own transcoding server on behalf of their users, but I am not sure what impact that has on the fixed costs of starting a Gateway, which we want to be low.

  • heavy processing in a browser is a no-go from my perspective. Not only is it bad for mobile devices, but it also easily blocks the browser on desktops, and users might think it is some kind of hidden miner. I might be wrong.
  • doing it server-side could get too complex - this point does not make sense to me. Complex tasks are usually run server-side because we can control the environment and, in contrast to frontends, install any software we need.
  • for the near term - let us set up a system that is a good solution in the long term (mainnet)

I see two options worth exploring:

  • going to be running transcoding servers on mainnet - yes, like storage providers we might need processing providers
  • outsourcing the task to services like file.video (livepeer) or flixier.com, even if they cost something
  1. I agree we need to be concerned about doing heavy processing in the browser, so we need to try to nail down just how heavy it actually would be.
  2. The complexity comes not from technically running a server, but from the incentive and protocol side of having it all work well together. We are now building storage v2 based on a lot of experience, and it won't be flawless; transcoding is a totally different problem, and Livepeer is entirely focused on just that. That's what I mean by complexity.
  3. Outsourcing to Livepeer is very complex, because remember, the user is not supposed to deal with this, so the DAO would have to pay Livepeer at the protocol level, which means you would have to have a cross-chain bridge from Joystream to Ethereum, and then connect the Ethereum side to Livepeer. Additionally, the DAO would have to hold LPT on Ethereum over this bridge... this is super complex, and it's not even clear if Livepeer will actually allow us to do everything we need, or in the way we need; it's still very much in development.

Nov 2021

without third party post-processing, will a user be able to upload multiple versions of the same video, with the same metadata, at the same time?
this is just about making a product which allows this. So Atlas would have to be adapted to support this sort of multi asset kind of uploading, and the playback side would have to understand this, and perhaps offer user control over which version to play back and so on.

Dec 2021

The platform should have the capability to expose the format and other characteristics of the media via some metadata.
The platform may in future be used to build applications on top of it, for which it is important to consider:
a. Designing some kind of mechanism wherein the platform has the capability to allow and comprehend configurable features like:
i. Adaptive bitrates while uploading, as well as streaming via Argus
ii. Allowing the application to dictate the streaming formats like HLS, HDS, MPEG, etc.
It is important to define where, or by whom, the media format conversions are expected to be implemented.
a. The applications that may be built on top of the Joystream platform may expect some kind of metadata information, either from Colossus or Argus, to enable them to convert the ingested stream of data into the desired formatted output.

May 2022

Lately discussed here, suggesting the DASH concept to switch bitrates.
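
For context, packaging already-transcoded renditions into an MPEG-DASH manifest is largely an off-the-shelf step. Below is a minimal sketch with ffmpeg; the rendition file names are placeholders from the transcoding sketch above, and the segment length and other options would need tuning.

# Package two renditions into a DASH manifest with 4-second segments
ffmpeg \
  -i video_144p.mp4 -i video_720p.mp4 \
  -map 0 -map 1 -c copy \
  -f dash -seg_duration 4 \
  -adaptation_sets "id=0,streams=v id=1,streams=a" \
  manifest.mpd

A DASH-capable player (e.g. in Atlas) then switches between the representations based on measured bandwidth.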

@bedeho
Member

bedeho commented May 31, 2022

I think this is a great idea, but I think this is a big project, which has two distinct parts we should tackle separately, both in order to learn as we go and to unlock some incremental benefits earlier.

  1. Make it possible for an Atlas user to upload a single video, and then somehow have the system encode and transcode this into a portfolio of assets.
  2. Make it possible for Atlas to stream a single video in adaptive streaming mode, e.g. using DASH.

Unlocking nr. 2 first makes a great deal of sense because

  • easier: no new infrastructure, node or role is needed
  • incremental benefits: it will pay dividends right away when combined with YT-synch, where we may already be fetching multiple transcoded versions of the same video.
  • no overhead: there is no reason to think we need to change much about how we unlock 2 even when we get to unlocking 1

So perhaps you can make an issue for just unlocking 2 first, where you describe the full set of changes you believe will need to be made to

  • Atlas
  • Video metadata standard.
  • Distributor
  • Argus
  • Query Node

Be as detailed as possible, so as to make it possible to review before we start any sort of execution on this. Things I am particularly interested in

  • would it be possible to edit, remove or add an asset for a particular bitrate after the initial upload?
  • please describe how key product features will work, like
    • user being able to see and pick a desired quality level
    • player being able to adaptively adjust
  • what would a proof-of-concept project for this look like in terms of scope, man-hours, dependencies, etc.?

@traumschule
Contributor Author

traumschule commented Oct 16, 2022

Well, the on-chain DB, in particular the content pallet model, wasn't designed with this feature in mind (assets map 1:1 to files, not 1:n), and it is unlikely a change like the placeholder #3861 will come anytime soon.
Since we have discussed server-side transcoding for a year, the cheapest and maybe best solution now is to keep the current storage and distribution system as a legacy layer for original, 100% user-generated content (often in higher resolution than needed).

On top of that, there can be services (GW) that 1) transcode them into additional formats, 2) host an index of those, 3) cut them into equal, minute-long pieces according to DASH, and 4) make them accessible via an API for a profit.

  • DPs have an interest in reducing bandwidth costs.
  • Gateways want a better experience for their users, which means video qualities that fit the available end-user bandwidth / screen size.

Combining those incentives, it might pay off to spend extra service fees to be able to offer additional device-specific formats. That way an end-user GW pays another GW to offer (all / specific) assets in the optimal format / time-slice.
I see no other way than the service-based approach to cover all the needs asked for in https://github.com/Joystream/atlas/labels/upstream%3Abackend - maybe there is one, but in my opinion it is not worth possibly breaking any of the products you mentioned five minutes before launch or even early mainnet (and you made that decision already in #4376).

We could try to determine costs of such an approach via bounties.

@kdembler
Member

Well, the on-chain DB, in particular the content pallet model, wasn't designed with this feature in mind (assets map 1:1 to files, not 1:n), and it is unlikely a change like the placeholder #3861 will come anytime soon.

The content pallet is quite transparent when it comes to that, and we can easily support multiple resolutions for a single video without any runtime changes. The chain already allows you to specify any number of files (data objects) to be created alongside a video. It's only a matter of updating the metadata protobuf definitions for VideoMetadata to support multiple media files instead of just one. It won't be that hard - most of the work would be to actually enable Atlas to use those multiple resolutions.

@traumschule
Contributor Author

That's great news. metadata protobuf is a scary term though, you'll need to teach the community how to do that :)

@kdembler
Member

When you create a video or a channel, one of the extrinsic args is meta: Bytes. This blob of bytes is never accessed by the runtime and is only used by the QN. The reason for it being a blob of bytes is that this way you can make upgrades and changes to the format of that data and the runtime doesn't care - it's still a blob of bytes. Metadata protobuf definitions are what describe how to serialize an object of metadata to that blob of bytes (and the other way around). It uses protocol buffers for that.
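
To illustrate the mechanics, here is a minimal sketch using the protoc CLI. The message and field names are made up for illustration and are not the actual definitions from @joystream/metadata-protobuf:

# Made-up schema, for illustration only
cat > video_metadata.proto <<'EOF'
syntax = "proto3";
message VideoMetadata {
  string title = 1;
  repeated uint64 media_data_object_ids = 2; // hypothetical: several media versions per video
}
EOF

# Serialize a text-format object into the binary blob that would be passed as `meta: Bytes`
protoc --encode=VideoMetadata video_metadata.proto <<'EOF' > meta.bin
title: "My video"
media_data_object_ids: 101
media_data_object_ids: 102
EOF

xxd meta.bin

Decoding the blob back works the same way with --decode against the same definitions.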

Here is our @joystream/metadata-protobuf package that describes different types of metadata (e.g. VideoMetadata) and provides utilities for (de)serialization: https://github.com/Joystream/joystream/tree/carthage/metadata-protobuf

And here is an example usage in Atlas: https://github.com/Joystream/atlas/blob/carthage/packages/atlas/src/joystream-lib/metadata.ts#L75
